| en:multiasm:paarm:chapter_5_6 [2026/02/27 16:23] – jtokarz | en:multiasm:paarm:chapter_5_6 [2026/06/21 21:19] (current) – pczekalski |
|---|
| ===== Arithmetical instructions ===== | ===== Arithmetical instructions ===== |
| |
| All arithmetical operations are performed directly on the processor's registers. The most common instructions are the same ones we use every day to add two or more values together, subtract one value from another, multiply two values, or divide one value by another. In ARM assembly, the <fc #800000>ADD</fc>, <fc #800000>SUB</fc>, <fc #800000>MUL</fc>, and <fc #800000>DIV</fc> instructions perform the same function. All these instructions and other arithmetic instructions require that both values be placed in the registers. At this moment, we assume that all values in the registers are preloaded and ready to use, as demonstrated in the following instruction examples. | All arithmetical operations are performed directly on the processor's registers. The most common instructions are the same ones we use every day to add two or more values together, subtract one value from another, multiply two values, or divide one value by another. In ARM assembly, the <fc #800000>ADD</fc>, <fc #800000>SUB</fc>, <fc #800000>MUL</fc>, and <fc #800000>DIV</fc> instructions perform the same function. All these instructions, including other arithmetic instructions, require that both values be placed in the registers. At this moment, we assume that all values in the registers are preloaded and ready to use, as demonstrated in the following instruction examples. |
| |
| ''<fc #800000>ADD</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #008000>X2</fc> <fc #6495ed>@ adds the X1 and X2 values X0= X1 + X2</fc>'' | ''<fc #800000>ADD</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #008000>X2</fc> <fc #6495ed>@ adds the X1 and X2 values X0= X1 + X2</fc>'' |
| |
| All these arithmetical instructions have additional options, such as an optional shift of the second source operand. The <fc #800000>DIV</fc> instruction must have a prefix of S for Signed (<fc #800000>SDIV</fc>) or U for Unsigned (<fc #800000>UDIV</fc>) divide operations. Prefix S preserves the sign of the result, depending on the signs used for the operands. The prefix U always returns a positive value. | The <fc #800000>ADD</fc> and <fc #800000>SUB</fc> instructions have additional options, such as an optional shift of the second source operand. The <fc #800000>DIV</fc> instruction must have a prefix of S for Signed (<fc #800000>SDIV</fc>) or U for Unsigned (<fc #800000>UDIV</fc>) divide operations. <fc #800000>SDIV</fc> treats both operands as signed two's-complement values, so the sign of the result depends on the signs of the operands. <fc #800000>UDIV</fc> treats both operands as unsigned values, so the result is never negative. Both instructions round the quotient toward zero, and dividing by zero returns 0 instead of raising an exception. |
| Some instructions can be combined to achieve better computational performance. In such cases, the first arithmetic operation is performed on the second source register, and then the instruction's operation is performed. Such instructions are: ''<fc #800000>MADD</fc>'', ''<fc #800000>MSUB</fc>'', ''<fc #800000>SMADDL</fc>'', ''<fc #800000>SMSUBL</fc>'', ''<fc #800000>UMADDL</fc>'' and ''<fc #800000>UMSUBL</fc>''. Basically, all the listed instructions are ''<fc #800000>MADD</fc>'' and ''<fc #800000>MSUB</fc>'', but with different options. Let's look at ''<fc #800000>MADD</fc>'' and ''<fc #800000>MSUB</fc>'' instructions. | Some operations are combined into a single instruction to achieve better computational performance. In such cases, a multiplication is performed first on the first two source registers, and then the product is added to or subtracted from the third source register. Such instructions are: ''<fc #800000>MADD</fc>'', ''<fc #800000>MSUB</fc>'', ''<fc #800000>SMADDL</fc>'', ''<fc #800000>SMSUBL</fc>'', ''<fc #800000>UMADDL</fc>'' and ''<fc #800000>UMSUBL</fc>''. Basically, all the listed instructions are variants of ''<fc #800000>MADD</fc>'' and ''<fc #800000>MSUB</fc>'' with different options. Let's look at the ''<fc #800000>MADD</fc>'' and ''<fc #800000>MSUB</fc>'' instructions. |
| |
| ''<fc #800000>MADD</fc> <fc #008000>X1</fc>, <fc #008000>X2</fc>, <fc #008000>X3</fc>, <fc #008000>X4</fc> <fc #6495ed>@ X1 = X4 + X2*X3</fc>''\\ | ''<fc #800000>MADD</fc> <fc #008000>X1</fc>, <fc #008000>X2</fc>, <fc #008000>X3</fc>, <fc #008000>X4</fc> <fc #6495ed>@ X1 = X4 + X2*X3</fc>''\\ |
| Before performing addition or subtraction, first multiply the registers X2 and X3 (the second and third operands given to the instruction), and then perform the addition or subtraction. The prefixes S and U define whether the result can be a signed value or only a positive value (unsigned value). The postfix L, like <fc #800000>SMSUBL</fc> or <fc #800000>UMADDL</fc>, specifies that only 32-bit register values are used when multiplying the second and third operands. The remaining operands are 64-bit register values. | The instruction first multiplies the registers X2 and X3 (the second and third operands given to the instruction), and then performs the addition (<fc #800000>MADD</fc>) or the subtraction (<fc #800000>MSUB</fc>, which computes X4 - X2*X3). The prefixes S and U define whether the 32-bit source values are treated as signed or unsigned when they are multiplied. The postfix L (long), as in <fc #800000>SMSUBL</fc> or <fc #800000>UMADDL</fc>, specifies that the second and third operands are 32-bit registers whose product is a 64-bit value. The remaining operands are 64-bit register values. |
| |
| The next ARM version, ARMv8.3, processors are built by default with a PAC (Pointer Authentication) system. Earlier architectures must have been checked to see whether the PAC system is available. This enables the system to protect against pointer errors or corruption and adds additional arithmetic instructions. The system's security level can be significantly increased by marking and checking pointers. PAC adds a signature to the pointer, allowing verification that it has not been tampered with before use. As a result, additional postfixes for the ''<fc #800000>ADD</fc>'' instruction, such as ''<fc #800000>ADDG</fc>'' and ''<fc #800000>ADDPT</fc>'', are added. While these operations are less common in simple programs, they are powerful tools when writing optimised and secure code. | Starting with ARMv8.3, processors include a PAC (Pointer Authentication) system. Because the feature depends on the architecture version, software must check whether PAC is available before using it. PAC protects against pointer errors or corruption: it adds a signature to the pointer, allowing verification that it has not been tampered with before use. The system's security level can be significantly increased by marking and checking pointers. Related pointer-protection extensions also add arithmetic instructions, such as the ''<fc #800000>ADDG</fc>'' and ''<fc #800000>ADDPT</fc>'' variants of the ''<fc #800000>ADD</fc>'' instruction. ''<fc #800000>ADDG</fc>'' belongs to the Memory Tagging Extension (MTE, introduced in ARMv8.5), and ''<fc #800000>ADDPT</fc>'' belongs to the Checked Pointer Arithmetic extension, so neither is part of PAC itself. While these operations are less common in simple programs, they are powerful tools when writing optimised and secure code. |
| The ''<fc #800000>ADDG</fc>'' instruction means ''<fc #800000>ADD</fc>'' with Tag and is focused on pointers. The Tag is used to mark the pointer with a small identifier, allowing detection of pointer corruption or incorrect usage, among other options. Primarily, these instructions are used to authenticate pointers and ensure memory safety, for example, by tracking the boundaries of memory regions. | The ''<fc #800000>ADDG</fc>'' instruction means ''<fc #800000>ADD</fc>'' with Tag and is focused on pointers. It adds an immediate offset to the address and, at the same time, adjusts the tag stored in the upper bits of the pointer. The tag marks the pointer with a small identifier, allowing detection of pointer corruption or misuse, among other uses. Primarily, such instructions are used to ensure memory safety, for example, by tracking the boundaries of memory regions. |
| |
| For example: ''<fc #800000>ADDG</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #ffa500>#16</fc>, <fc #ffa500>#5</fc>''\\ | For example: ''<fc #800000>ADDG</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #ffa500>#16</fc>, <fc #ffa500>#5</fc> <fc #6495ed>@ X0 = X1 + 16, with the pointer tag offset by 5</fc>''\\ |
| ===== Instruction options ===== | ===== Instruction options ===== |
| |
| All assembly language types use similar mnemonics for arithmetic operations (some may require additional suffixes to identify some options for the instruction). A32 assembly instructions have specific suffixes to make commands executed conditionally, and those four most significant bits for many instructions give this ability. Unfortunately, there is no such option for A64, but there are special conditional instructions that we will describe later. | All assembly language types use similar mnemonics for arithmetic operations (some may require additional suffixes to identify specific instruction options). A32 assembly instructions can take a condition suffix that makes them execute conditionally; the four most significant bits of most A32 instruction encodings hold this condition. Unfortunately, there is no such general option for A64, but there are special conditional instructions that we will describe later. |
| We looked at a straightforward instruction and its exact machine code in the previous section. Examining machine code for each instruction is a perfect way to learn all the available options and restrictions. To help understand and read the instruction set documentation, another example of the <fc #800000>''ADD''</fc> instruction in the A64 instruction set will be provided. | We looked at a straightforward instruction and its exact machine code in the previous section. Examining machine code for each instruction is a perfect way to learn all the available options and restrictions. To help understand and read the instruction set documentation, another example of the <fc #800000>''ADD''</fc> instruction in the A64 instruction set will be provided. |
| The <fc #800000>''ADD''</fc> instruction: let's first look at the assembler instruction that adds two registers and stores the result in a third register. | The <fc #800000>''ADD''</fc> instruction: let's first look at the assembler instruction that adds two registers and stores the result in a third register. |
| <fc #800000>''ADD''</fc> <fc #008000>''X0''</fc>, <fc #008000>''X1''</fc>, <fc #008000>''X2 '' </fc> <fc #6495ed>''@X0 = X1 + X2''</fc> | <fc #800000>''ADD''</fc> <fc #008000>''X0''</fc>, <fc #008000>''X1''</fc>, <fc #008000>''X2 '' </fc> <fc #6495ed>''@X0 = X1 + X2''</fc> |
| |
| We need to look at the instruction set documentation to determine the possible options for this instruction. The documentation lists three main differences between the <fc #800000>''ADD''</fc> instructions. Despite that, for the data manipulation instruction, the ‘S’ suffix can be added to update the status flags in the processor Status Register. | We need to review the instruction set documentation to determine the available options for this instruction. The documentation lists three main variants of the <fc #800000>''ADD''</fc> instruction: extended register, immediate, and shifted register. In addition, data manipulation instructions can take the ‘S’ suffix to update the status flags in the processor Status Register. |
| |
| |
|   * **Rm** = <fc #008000>W5</fc> <fc #6495ed>@ pointer to the Second operand of the provided operands, which will be extended to 64 bits</fc> |   * **Rm** = <fc #008000>W5</fc> <fc #6495ed>@ register holding the second source operand, which will be extended to 64 bits</fc> |
| |
| We already know that the ‘sf’ bit identifies the length of the data (32 or 64 bits). The main difference between these two instructions is in the ‘S’ bit. The same is in the name of the instruction. The ‘S’ bit is meant to signal to the processor that the status bits should be updated after instruction execution. These status bits are crucial for conditions. The 30th ‘op’ bit and ‘opt’ bits are fixed and not used for this instruction. The three option bits (13th to 15th) extend the operation. These bits are used to extend the second source (Rm) operand. This is handy when the source operands differ in length, such as when the first operand is 16-bit wide and the second is 8-bit wide. The second register must be extended to maintain the data alignment. | We already know that the ‘sf’ bit identifies the length of the data (32 or 64 bits). The main difference between these two instructions is in the ‘S’ bit, which is also reflected in the name of the instruction. The ‘S’ bit signals to the processor that the status bits should be updated after instruction execution. These status bits are crucial for conditions. The 30th ‘op’ bit and ‘opt’ bits are fixed and not used for this instruction. The three option bits (13th to 15th) select how the second source (Rm) operand is extended. This is handy when the source operands differ in length, such as when the first operand is a 64-bit value and the second is an 8-bit value held in the low byte of a register. The second operand must be zero- or sign-extended to the full width before the operation. |
| Overall, there are three bits: 8 different options to extend the second source operand. The table below explains all these options. Let's look only at those options; the bit values are irrelevant for learning the assembler. | Overall, the three bits give 8 different options to extend the second source operand. The table below explains all these options. Let's look only at the options themselves; the bit values are irrelevant for learning the assembler. |
| |
| Load and store instructions have the most additional options, more than for the arithmetical and logical operations. For example, the ''<fc #800000>LDADD</fc>'' instruction combines a load and an arithmetic operation. This is a part of the so-called atomic operations. The ''<fc #800000>LDADD</fc>'' instruction atomically loads a value from memory, adds the value held in a register, and finally stores the result back in memory at a different location. NOTE that the registers used in this instruction must not be the same. This is something like what would be for the x86 architecture. Unfortunately, no other arithmetic operations are available besides addition.\\ | Load and store instructions have the most additional options, more than the arithmetical and logical operations. For example, the ''<fc #800000>LDADD</fc>'' instruction combines a load and an arithmetic operation. This is a part of the so-called atomic operations. The ''<fc #800000>LDADD</fc>'' instruction atomically loads a value from memory, adds the value held in a register, and finally stores the result back in memory at the same location. NOTE that the registers used in this instruction must not be the same. This resembles the ''LOCK XADD'' instruction of the x86 architecture. Besides addition, the only other atomic arithmetic operations are the signed and unsigned minimum and maximum (''<fc #800000>LDSMAX</fc>'', ''<fc #800000>LDSMIN</fc>'', ''<fc #800000>LDUMAX</fc>'' and ''<fc #800000>LDUMIN</fc>''); there is no atomic subtraction or multiplication.\\ |
| ''<fc #800000>LDADD</fc> <fc #008000>W1</fc>, <fc #008000>W2</fc>, [<fc #008000>X0</fc>]'' \\ | ''<fc #800000>LDADD</fc> <fc #008000>W1</fc>, <fc #008000>W2</fc>, [<fc #008000>X0</fc>]'' \\ |
| The register ''<fc #008000>X0</fc>'' holds a memory address. The data/value is loaded into the ''<fc #008000>W2</fc>'' register, and then the value is added to the ''<fc #008000>W1</fc>'' register value, after which the new value ''[<fc #008000>X0</fc>]+<fc #008000>W1</fc>'' is stored back into memory at the exact location pointed by ''[<fc #008000>X0</fc>]''. Basically, the ''<fc #008000>W2</fc>'' register now holds the ''[<fc #008000>X0</fc>]''- pointed data that was present before the ''<fc #008000>W1</fc>'' value was added. Similar instructions are available to perform atomic logic operations on the memory data. | The register ''<fc #008000>X0</fc>'' holds a memory address. The value at that address is loaded into the ''<fc #008000>W2</fc>'' register. Then the ''<fc #008000>W1</fc>'' register value is added to it, after which the new value ''[<fc #008000>X0</fc>]+<fc #008000>W1</fc>'' is stored back into memory at the same location pointed to by ''[<fc #008000>X0</fc>]''. As a result, the ''<fc #008000>W2</fc>'' register holds the data that was present at ''[<fc #008000>X0</fc>]'' before the ''<fc #008000>W1</fc>'' value was added. Similar instructions are available to perform atomic logic operations on the memory data. |
| |
| To copy content from one register to another, the ''<fc #800000>MOV</fc>'' instruction is used. The ''<fc #800000>FMOV</fc>'' instruction can also copy floating-point values. These instructions allow typecasting a floating-point value to an integer and vice versa. Here are some independent instruction examples\\ | To copy content from one register to another, the ''<fc #800000>MOV</fc>'' instruction is used. The ''<fc #800000>FMOV</fc>'' instruction copies floating-point values in the same way. It can also transfer a value between a floating-point register and a general-purpose register, but it copies the bits unchanged and does not convert them; converting a floating-point value to an integer and vice versa requires dedicated conversion instructions such as ''<fc #800000>FCVTZS</fc>'' and ''<fc #800000>SCVTF</fc>''. Here are some independent instruction examples:\\ |
| ''<fc #800000>MOV</fc> <fc #008000>X1</fc>, <fc #008000>X0</fc> <fc #6495ed>@ X1 = X0 (64 bit register copy)</fc>''\\ | ''<fc #800000>MOV</fc> <fc #008000>X1</fc>, <fc #008000>X0</fc> <fc #6495ed>@ X1 = X0 (64-bit register copy)</fc>''\\ |
| ''<fc #800000>MOV</fc> <fc #008000>W1</fc>, <fc #008000>W0</fc> <fc #6495ed>@ W1 = W0 (32 bit register copy)</fc>''\\ | ''<fc #800000>MOV</fc> <fc #008000>W1</fc>, <fc #008000>W0</fc> <fc #6495ed>@ W1 = W0 (32-bit register copy)</fc>''\\ |
| ''<fc #800000>FMOV</fc> <fc #008000>S1</fc>, <fc #008000>S0</fc> <fc #6495ed>@ float → float (32-bit floating-point copy between vector registers)</fc>''\\ | ''<fc #800000>FMOV</fc> <fc #008000>S1</fc>, <fc #008000>S0</fc> <fc #6495ed>@ float → float (32-bit floating-point copy between vector registers)</fc>''\\ |
| ''<fc #800000>FMOV</fc> <fc #008000>X0</fc>, <fc #008000>D1</fc> <fc #6495ed> @ FP64 → int64 (copy from vector register to general-purpose register)</fc>''\\ | ''<fc #800000>FMOV</fc> <fc #008000>X0</fc>, <fc #008000>D1</fc> <fc #6495ed>@ FP64 bits → 64-bit integer register (raw copy from vector register to general-purpose register, no conversion)</fc>''\\ |