
On commutativity of addition

Does the generated assembly change if we write (b + a) instead of (a + b)?
Let's check.

Let's write:
__int128 add1(__int128 a, __int128 b) {
    return b + a;
}

and compile it with RISC-V GCC 8.2.0:

add1(__int128, __int128):
.LFB0:
.cfi_startproc
add a0,a2,a0
sltu a2,a0,a2
add a1,a3,a1
add a1,a2,a1
ret


Now write the following:

__int128 add1(__int128 a, __int128 b) {
    return a + b;
}

And get:

add1(__int128, __int128):
.LFB0:
.cfi_startproc
mv a5,a0
add a0,a0,a2
sltu a5,a0,a5
add a1,a1,a3
add a1,a5,a1
ret

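The difference is obvious: this version needs an extra mv instruction to save the original value of a0 before the sum overwrites it.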

Now let's do the same with Clang (rv64gc trunk). Both variants produce the same code:
add1(__int128, __int128): # @add1(__int128, __int128)
add a1, a1, a3
add a0, a0, a2
sltu a2, a0, a2
add a1, a1, a2
ret

This is essentially the same code that GCC produced in the first case. Compilers are smart these days, but not that smart yet.

Let's try to figure out what happens here and why. The arguments of the function __int128 add1(__int128 a, __int128 b) are passed in registers a0-a3 in the following order: a0 holds the low word of a, a1 the high word of a, a2 the low word of b, and a3 the high word of b. The result is returned in the same order, with the low word in a0 and the high word in a1.

First the high words of the two arguments are added, with the result placed in a1; then the low words are added, with the result placed in a0. The low-word sum is then compared against a2, i.e. the low word of operand b, to find out whether the addition overflowed. If it did, the result is less than either operand. Because the original value of a0 has been overwritten by the sum, the a2 register is used for the comparison. If a0 < a2, an overflow has occurred and a2 is set to 1; otherwise it is set to 0. This bit is then added to the high word of the result. The final result is located in (a1, a0).
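The sequence just described can be sketched in C. This is only an illustration, not compiler output: the u128_words struct and the function names are hypothetical, with lo and hi standing in for the (a0, a1) register pair.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-word value: lo and hi mirror the (a0, a1) register pair. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_words;

/* Adds two 128-bit values word by word, following the Clang sequence. */
u128_words add_words(u128_words a, u128_words b)
{
    u128_words r;
    r.hi = a.hi + b.hi;            /* add  a1, a1, a3 */
    r.lo = a.lo + b.lo;            /* add  a0, a0, a2 (wraps modulo 2^64) */
    uint64_t carry = r.lo < b.lo;  /* sltu a2, a0, a2 */
    r.hi += carry;                 /* add  a1, a1, a2 */
    return r;
}

/* Self-check: a carry out of the low word must reach the high word. */
void check_add_words(void)
{
    u128_words a = { UINT64_MAX, 0 };
    u128_words b = { 1, 0 };
    u128_words r = add_words(a, b);
    assert(r.lo == 0 && r.hi == 1);
}
```

The comparison works because unsigned addition wraps: the low-word sum can only be smaller than an operand if a carry was lost.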

Clang (rv32gc trunk) generates very similar code for a 32-bit core when the function takes 64-bit arguments and returns a 64-bit result:

long long add1(long long a, long long b) {
    return a + b;
}

The assembly:
add1(long long, long long): # @add1(long long, long long)
add a1, a1, a3
add a0, a0, a2
sltu a2, a0, a2
add a1, a1, a2
ret

The code is exactly the same. Unfortunately, the __int128 type is not supported by compilers for 32-bit architectures.
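As a rough check of the 32-bit case, the same four steps can be written with 32-bit words and compared against the native 64-bit addition. The function names here are hypothetical and chosen only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Adds two 64-bit values using only 32-bit operations, as an rv32 core must. */
uint64_t add64_via_words(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    uint32_t hi = a_hi + b_hi;   /* add  a1, a1, a3 */
    uint32_t lo = a_lo + b_lo;   /* add  a0, a0, a2 */
    uint32_t carry = lo < b_lo;  /* sltu a2, a0, a2 */
    hi += carry;                 /* add  a1, a1, a2 */

    return ((uint64_t)hi << 32) | lo;
}

/* Compares the word-by-word result with the native 64-bit addition. */
void check_add64(uint64_t a, uint64_t b)
{
    assert(add64_via_words(a, b) == a + b);
}
```

Calling check_add64 with values whose low words overflow, such as 0xFFFFFFFF and 1, exercises the carry path.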

This leaves a small opportunity for optimization in the core microarchitecture. According to the RISC-V architecture standard, a microarchitecture can (but does not have to) detect the instruction pairs (MULH[[S]U] rdh, rs1, rs2; MUL rdl, rs1, rs2) and (DIV[U] rdq, rs1, rs2; REM[U] rdr, rs1, rs2) and process each as a single instruction. Similarly, it could detect the pair (add rdl, rs1, rs2; sltu rdh, rdl, rs1/rs2) and set the overflow bit in the rdh register immediately.
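What such a fused pair would compute can be modeled at the source level. The sketch below assumes GCC or Clang, since it relies on their __builtin_add_overflow builtin; the add_carry struct and the function names are hypothetical. It says nothing about how real hardware would implement the fusion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the fused (add; sltu) pair: sum and carry together. */
typedef struct {
    uint64_t sum;    /* value the add would leave in rdl */
    uint64_t carry;  /* value the sltu would leave in rdh: 0 or 1 */
} add_carry;

add_carry add_with_carry(uint64_t rs1, uint64_t rs2)
{
    add_carry r;
    /* GCC/Clang builtin: stores the wrapped sum, returns true on overflow. */
    bool overflow = __builtin_add_overflow(rs1, rs2, &r.sum);
    r.carry = overflow ? 1 : 0;
    return r;
}

/* Checks the one-step model against the explicit two-instruction sequence. */
void check_fused_pair(uint64_t rs1, uint64_t rs2)
{
    uint64_t rdl = rs1 + rs2;  /* add  rdl, rs1, rs2 */
    uint64_t rdh = rdl < rs1;  /* sltu rdh, rdl, rs1 */
    add_carry fused = add_with_carry(rs1, rs2);
    assert(fused.sum == rdl && fused.carry == rdh);
}
```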
Source: habr.com
Published: 27.04.2021 22:20:31