Differences

This shows you the differences between two versions of the page.

en:multiasm:papc:chapter_6_6 [2025/08/01 09:18] – [Scale Index Base byte] ktokarzen:multiasm:papc:chapter_6_6 [2026/02/19 20:48] (current) – [Scale Index Base byte] ktokarz
Line 78: Line 78:
  
 The **lock prefix** is valid for instructions that work in a read-modify-write manner. An example of such an instruction can be adding a constant or register content to the variable in the memory. The **lock prefix** is valid for instructions that work in a read-modify-write manner. An example of such an instruction can be adding a constant or register content to the variable in the memory.
-<code> lock add QWORD PTR [rax], 5 </code>+<code asm> lock add QWORD PTR [rax], 5 </code>
 The lock prefix appears as a single byte with the value 0xF0 before the opcode. It disables DMA requests (or any other requests that gain control of the buses) during the execution of the instruction to prevent accidental modification of the memory contents at the same address by both the processor and DMA controller. The lock prefix appears as a single byte with the value 0xF0 before the opcode. It disables DMA requests (or any other requests that gain control of the buses) during the execution of the instruction to prevent accidental modification of the memory contents at the same address by both the processor and DMA controller.
  
Line 86: Line 86:
 <code asm> <code asm>
 mov BYTE PTR [ebx], 5     ;DS as the default segment mov BYTE PTR [ebx], 5     ;DS as the default segment
-mov BYTE PTR ES:[ebx], 5  ;ES segment override (results in appearance of the byte 0x26 as the prefix)+mov BYTE PTR ES:[ebx], 5  ;ES segment override  
 +                          ;(results in appearance of the byte 0x26 as the prefix)
 </code>  </code> 
   * 0x2E – CS segment override   * 0x2E – CS segment override
Line 100: Line 101:
 The **operand size** and **address size override** prefixes can change the default size of operands and addresses. For example, if the processor operates in 32-bit mode, using the 0x66 prefix changes the size of an operand to 16 bits, and using the 0x67 prefix changes the address encoding from 32 bits to 16 bits. To better understand the behaviour of prefixes, let us consider a simple instruction with different variants. Let's start with a 32-bit processor. The **operand size** and **address size override** prefixes can change the default size of operands and addresses. For example, if the processor operates in 32-bit mode, using the 0x66 prefix changes the size of an operand to 16 bits, and using the 0x67 prefix changes the address encoding from 32 bits to 16 bits. To better understand the behaviour of prefixes, let us consider a simple instruction with different variants. Let's start with a 32-bit processor.
 <code asm> <code asm>
-mov BYTE PTR [ebx], 0x5   ;encoded as 0xC6, 0x03, 0x05 +                          ;encoding 
-mov WORD PTR [ebx], 0x5   ;encoded as 0x66, 0xC7, 0x03, 0x05, 0x00  +mov BYTE PTR [ebx], 0x5   ;0xC6, 0x03, 0x05 
-mov DWORD PTR [ebx], 0x5  ;encoded as 0xC7, 0x03, 0x05, 0x00, 0x00, 0x00+mov WORD PTR [ebx], 0x5   ;0x66, 0xC7, 0x03, 0x05, 0x00  
 +mov DWORD PTR [ebx], 0x5  ;0xC7, 0x03, 0x05, 0x00, 0x00, 0x00
 </code> </code>
  
-We can notice that because the default operand size is a 32-bit doubleword, so prefix 0x66 appears in the 16-bit version (WORD PTR). It is also visible that the 8-bit version (BYTE PTR) has a different opcode (0xC6, 0x03 instead of 0xC7, 0x03). Also, the size of the argument is different.+We can notice that because the default operand size is a 32-bit doubleword, the prefix 0x66 appears in the 16-bit version (WORD PTR). It is also visible that the 8-bit version (BYTE PTR) has a different opcode (0xC6, 0x03 instead of 0xC7, 0x03). Also, the size of the argument is different.
  
 The address override prefix (0x67) appears if we change the register to a 16-bit bx. The address override prefix (0x67) appears if we change the register to a 16-bit bx.
 <code asm> <code asm>
-mov BYTE PTR [bx], 0x5   ;encoded as 0x67, 0xC6, 0x07, 0x05 +                         ;encoding 
-mov WORD PTR [bx], 0x5   ;encoded as 0x67, 0x66, 0xC7, 0x07, 0x05, 0x00  +mov BYTE PTR [bx], 0x5   ;0x67, 0xC6, 0x07, 0x05 
-mov DWORD PTR [bx], 0x5  ;encoded as 0x67, 0xC7, 0x07, 0x05, 0x00, 0x00, 0x00+mov WORD PTR [bx], 0x5   ;0x67, 0x66, 0xC7, 0x07, 0x05, 0x00  
 +mov DWORD PTR [bx], 0x5  ;0x67, 0xC7, 0x07, 0x05, 0x00, 0x00, 0x00
 </code> </code>
  
 The same situation can be observed if we use a 32-bit address register (ebx) and assemble the same instructions for a 64-bit processor. The same situation can be observed if we use a 32-bit address register (ebx) and assemble the same instructions for a 64-bit processor.
 <code asm> <code asm>
-mov BYTE PTR [ebx], 0x5   ;encoded as 0x67, 0xC6, 0x03, 0x05 +                          ;encoding 
-mov WORD PTR [ebx], 0x5   ;encoded as 0x67, 0x66, 0xC7, 0x03, 0x05, 0x00  +mov BYTE PTR [ebx], 0x5   ;0x67, 0xC6, 0x03, 0x05 
-mov DWORD PTR [ebx], 0x5  ;encoded as 0x67, 0xC7, 0x03, 0x05, 0x00, 0x00, 0x00+mov WORD PTR [ebx], 0x5   ;0x67, 0x66, 0xC7, 0x03, 0x05, 0x00  
 +mov DWORD PTR [ebx], 0x5  ;0x67, 0xC7, 0x03, 0x05, 0x00, 0x00, 0x00
 </code> </code>
  
 When we use a native 64-bit address register in a 64-bit processor, the address size override prefix disappears. When we use a native 64-bit address register in a 64-bit processor, the address size override prefix disappears.
 <code asm> <code asm>
-mov BYTE PTR [rbx], 0x5   ;encoded as 0xC6, 0x03, 0x05 +                          ;encoding 
-mov WORD PTR [rbx], 0x5   ;encoded as 0x66, 0xC7, 0x03, 0x05, 0x00  +mov BYTE PTR [rbx], 0x5   ;0xC6, 0x03, 0x05 
-mov DWORD PTR [rbx], 0x5  ;encoded as 0xC7, 0x03, 0x05, 0x00, 0x00, 0x00+mov WORD PTR [rbx], 0x5   ;0x66, 0xC7, 0x03, 0x05, 0x00  
 +mov DWORD PTR [rbx], 0x5  ;0xC7, 0x03, 0x05, 0x00, 0x00, 0x00
 </code> </code>
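
The prefix rules shown in the listings above can be summarised in a few lines of Python. This is only a sketch: the function name ''size_prefixes'' and its parameters are hypothetical, it assumes the default operand size is 32 bits in both 32-bit and 64-bit mode, and it emits 0x67 before 0x66 only because that is the order seen in the listings.

```python
def size_prefixes(mode_bits, operand_bits, address_bits):
    """Return the size-override prefix bytes for a simple memory operand.

    mode_bits:    32 or 64 (operating mode of the processor)
    operand_bits: 8, 16 or 32
    address_bits: 16 or 32 in 32-bit mode, 32 or 64 in 64-bit mode
    """
    prefixes = []
    # The default address size equals the mode width (32 or 64 bits).
    if address_bits != mode_bits:
        prefixes.append(0x67)   # address size override
    # 8-bit operands use a different opcode, so only 16 bits needs 0x66.
    if operand_bits == 16:
        prefixes.append(0x66)   # operand size override
    return prefixes


if __name__ == "__main__":
    for args in [(32, 16, 32), (32, 16, 16), (64, 8, 32), (64, 32, 64)]:
        print(args, [hex(p) for p in size_prefixes(*args)])
```

The 64-bit operand size is not covered here, because it is selected by the REX prefix rather than by 0x66.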
  
Line 142: Line 147:
  
 <code asm> <code asm>
-mov BYTE PTR [r8], 0x5    ;encoded as 0x41, 0xC6, 0x00, 0x05 +                          ;encoding 
-mov BYTE PTR [r9], 0x5    ;encoded as 0x41, 0xC6, 0x01, 0x05 +mov BYTE PTR [r8], 0x5    ;0x41, 0xC6, 0x00, 0x05 
-mov BYTE PTR [r10], 0x5   ;encoded as 0x41, 0xC6, 0x02, 0x05 +mov BYTE PTR [r9], 0x5    ;0x41, 0xC6, 0x01, 0x05 
-mov DWORD PTR [r8], 0x5   ;encoded as 0x41, 0xC7, 0x00, 0x05, 0x00, 0x00, 0x00  +mov BYTE PTR [r10], 0x5   ;0x41, 0xC6, 0x02, 0x05 
-mov QWORD PTR [rbx], 0x5  ;encoded as 0x48, 0xC7, 0x03, 0x05, 0x00, 0x00, 0x00 +mov DWORD PTR [r8], 0x5   ;0x41, 0xC7, 0x00, 0x05, 0x00, 0x00, 0x00  
-mov QWORD PTR [r8], 0x5   ;encoded as 0x49, 0xC7, 0x00, 0x05, 0x00, 0x00, 0x00+mov QWORD PTR [rbx], 0x5  ;0x48, 0xC7, 0x03, 0x05, 0x00, 0x00, 0x00 
 +mov QWORD PTR [r8], 0x5   ;0x49, 0xC7, 0x00, 0x05, 0x00, 0x00, 0x00
 </code> </code>
 The REX prefix  The REX prefix 
Line 154: Line 160:
 =====Instruction opcode===== =====Instruction opcode=====
 The instruction opcode is the mandatory field in every instruction. It encodes the main function of the operation. Expanding the processor's capabilities by adding new instructions required defining longer opcodes. The opcode can be 1, 2 or 3 bytes in length. New instructions usually contain an additional byte or two bytes at the beginning called an escape sequence. Possible opcode sequences are: The instruction opcode is the mandatory field in every instruction. It encodes the main function of the operation. Expanding the processor's capabilities by adding new instructions required defining longer opcodes. The opcode can be 1, 2 or 3 bytes in length. New instructions usually contain an additional byte or two bytes at the beginning called an escape sequence. Possible opcode sequences are:
-<code>+<code asm>
 opcode opcode
 0x0F opcode 0x0F opcode
Line 166: Line 172:
   * 0x8F Three-byte XOP   * 0x8F Three-byte XOP
 VEX-encoded instructions are written with V at the beginning. Let's look at the example of the blending instruction. VEX-encoded instructions are written with V at the beginning. Let's look at the example of the blending instruction.
-<code> +<code asm> 
-blendvpd xmm0, xmm1               ; encoded as 0x66, 0x0F, 0x38, 0x15, 0xC1  +                                  ;encoding 
-vblendvpd xmm0, xmm1, xmm2, xmm3  ; encoded as 0xC4, 0xE3, 0x71, 0x4B, 0xC2, 0x30 +blendvpd xmm0, xmm1               ;0x66, 0x0F, 0x38, 0x15, 0xC1  
 +vblendvpd xmm0, xmm1, xmm2, xmm3  ;0xC4, 0xE3, 0x71, 0x4B, 0xC2, 0x30 
 </code> </code>
 The first blendvpd instruction has only two arguments; in this encoding scheme, it is not possible to encode more. It uses the mandatory prefix 0x66 and the 0x0F, 0x38 escape sequence. The second version, vblendvpd, has four arguments. It is encoded with a three-byte VEX escape sequence 0xC4, 0xE3, 0x71. The first blendvpd instruction has only two arguments; in this encoding scheme, it is not possible to encode more. It uses the mandatory prefix 0x66 and the 0x0F, 0x38 escape sequence. The second version, vblendvpd, has four arguments. It is encoded with a three-byte VEX escape sequence 0xC4, 0xE3, 0x71.
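
To see what the three VEX bytes 0xC4, 0xE3, 0x71 carry, they can be split into bit fields. The sketch below is not part of the original text: the function name ''decode_vex3'' is hypothetical, and the field layout (inverted R, X, B and vvvv bits, a 5-bit opcode map, W, L and pp) is assumed from the Intel and AMD manuals.

```python
def decode_vex3(b0, b1, b2):
    """Split a three-byte VEX prefix (0xC4 form) into its fields."""
    if b0 != 0xC4:
        raise ValueError("not a three-byte VEX prefix")
    return {
        # R, X, B are stored inverted; flip them to get the REX-like meaning.
        "R": ((b1 >> 7) & 1) ^ 1,
        "X": ((b1 >> 6) & 1) ^ 1,
        "B": ((b1 >> 5) & 1) ^ 1,
        "map": b1 & 0x1F,                 # 1 = 0x0F, 2 = 0x0F 0x38, 3 = 0x0F 0x3A
        "W": (b2 >> 7) & 1,
        "vvvv": ((b2 >> 3) & 0xF) ^ 0xF,  # extra source register, stored inverted
        "L": (b2 >> 2) & 1,               # 0 = 128-bit vector, 1 = 256-bit vector
        "pp": b2 & 3,                     # 0 = none, 1 = 0x66, 2 = 0xF3, 3 = 0xF2
    }


if __name__ == "__main__":
    print(decode_vex3(0xC4, 0xE3, 0x71))
```

For the vblendvpd example, the decoded vvvv field is 1, which corresponds to xmm1, and pp is 1, which stands for the 0x66 prefix that blendvpd needs as a separate byte.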
Line 200: Line 207:
 Let's look at some examples of instruction encoding. First, look at the data transfer between two registers. Let's look at some examples of instruction encoding. First, look at the data transfer between two registers.
 <code asm> <code asm>
-              ;                         MOD REG R/M   MOD               REG   R/M +              ;encoding      MOD REG R/M   MOD               REG   R/M 
-mov al, dl    ;encoded as 0x88, 0xD0    11  010 000   Register operand  DL    AL +mov al, dl    ;0x88, 0xD0    11  010 000   Register operand  DL    AL 
-mov ax, dx    ;encoded as 0x89, 0xD0    11  010 000   Register operand  DX    AX +mov ax, dx    ;0x89, 0xD0    11  010 000   Register operand  DX    AX 
-mov dx, si    ;encoded as 0x89, 0xF2    11  110 010   Register operand  SI    DX +mov dx, si    ;0x89, 0xF2    11  110 010   Register operand  SI    DX 
-mov si, dx    ;encoded as 0x89, 0xD6    11  010 110   Register operand  DX    SI+mov si, dx    ;0x89, 0xD6    11  010 110   Register operand  DX    SI
 </code> </code>
 Notice that in the first and second lines, different opcodes are used, but the MOD R/M bytes are identical. The type of instruction determines the order of data transfer. Notice that in the first and second lines, different opcodes are used, but the MOD R/M bytes are identical. The type of instruction determines the order of data transfer.
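
The MOD, REG and R/M columns in the listing above can be reproduced by splitting the second byte with shifts and masks. The following is a minimal sketch; the names ''split_modrm'' and ''describe_reg_to_reg'' are hypothetical, and only the two opcodes used in the examples (0x88 and 0x89 with 16-bit operands) are handled.

```python
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]


def split_modrm(byte):
    """Return the (MOD, REG, R/M) fields of a MOD R/M byte."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def describe_reg_to_reg(opcode, modrm):
    """Rebuild the mnemonic for opcodes 0x88/0x89 with MOD = 11."""
    mod, reg, rm = split_modrm(modrm)
    if mod != 0b11:
        raise ValueError("only register operands (MOD = 11) are handled")
    if opcode == 0x88:
        names = REG8      # 8-bit move
    elif opcode == 0x89:
        names = REG16     # 16-bit move (as in the examples above)
    else:
        raise ValueError("unsupported opcode in this sketch")
    # For these opcodes the R/M field is the destination, REG is the source.
    return f"mov {names[rm]}, {names[reg]}"


if __name__ == "__main__":
    for opcode, modrm in [(0x88, 0xD0), (0x89, 0xD0), (0x89, 0xF2), (0x89, 0xD6)]:
        print(hex(opcode), hex(modrm), describe_reg_to_reg(opcode, modrm))
```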
Line 210: Line 217:
 Now, a few examples of indirect addressing without displacement. Now, a few examples of indirect addressing without displacement.
 <code asm> <code asm>
-                                      MOD REG R/M   MOD               REG   R/M +               ;encoding      MOD REG R/M   MOD               REG   R/M
-mov dx,[si]   ;encoded as 0x8B, 0x14    00  010 100   Reg. only addr.   DX    [SI] +mov dx,[si]    ;0x8B, 0x14    00  010 100   Reg. only addr.   DX    [SI] 
-mov dx,[di]   ;encoded as 0x8B, 0x15    00  010 101   Reg. only addr.   DX    [DI] +mov dx,[di]    ;0x8B, 0x15    00  010 101   Reg. only addr.   DX    [DI] 
-mov dx,[bx+di];encoded as 0x8B, 0x11    00  010 001   Reg. only addr.   DX    [BX+DI] +mov dx,[bx+di] ;0x8B, 0x11    00  010 001   Reg. only addr.   DX    [BX+DI] 
-mov cx,[bx+di];encoded as 0x8B, 0x09    00  001 001   Reg. only addr.   CX    [BX+DI]+mov cx,[bx+di] ;0x8B, 0x09    00  001 001   Reg. only addr.   CX    [BX+DI]
 </code> </code>
  
 Now, a few examples of indirect addressing with displacement. Now, a few examples of indirect addressing with displacement.
 <code asm> <code asm>
-                                          MOD REG R/M   MOD              REG  R/M       Disp   +               ;encoding          MOD REG R/M   MOD             REG R/M       Disp   
-mov dx,[bp+62];encoded as 0x8B, 0x56, 0x3E  01  010 110   Reg.+disp addr.  DX   [BP+disp] 0x3E +mov dx,[bp+62] ;0x8B, 0x56, 0x3E  01  010 110   Reg.+disp addr. DX  [BP+disp] 0x3E 
-mov [bp+62],dx;encoded as 0x89, 0x56, 0x3E  01  010 110   Reg.+disp addr.  DX   [BP+disp] 0x3E +mov [bp+62],dx ;0x89, 0x56, 0x3E  01  010 110   Reg.+disp addr. DX  [BP+disp] 0x3E 
-mov dx,[si+13];encoded as 0x8B, 0x54, 0x0D  01  010 100   Reg.+disp addr.  DX   [SI+disp] 0x0D +mov dx,[si+13] ;0x8B, 0x54, 0x0D  01  010 100   Reg.+disp addr. DX  [SI+disp] 0x0D 
-mov si,[bp]   ;encoded as 0x8B, 0x76, 0x00  01  110 110   Reg.+disp addr.  SI   [BP+disp] 0x00+mov si,[bp]    ;0x8B, 0x76, 0x00  01  110 110   Reg.+disp addr. SI  [BP+disp] 0x00
 </code> </code>
-If we look in fort two lines, we can observe that the MOD R/M bytes are identical. The only difference is the opcode, which determines the direction of the data transfer.+If we look at the first two lines, we can observe that the MOD R/M bytes are identical. The only difference is the opcode, which determines the direction of the data transfer.
  
 Notice also that the last instruction is encoded as BP + displacement, even if there is no displacement in the mnemonic. If you look into the table {{ref>modrm_16}}, you can observe that there is no addressing mode with [BP] only. It must appear with the displacement. Notice also that the last instruction is encoded as BP + displacement, even if there is no displacement in the mnemonic. If you look into the table {{ref>modrm_16}}, you can observe that there is no addressing mode with [BP] only. It must appear with the displacement.
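
The missing [BP]-only mode is easy to see when the 16-bit addressing rules are written as a lookup. This is a sketch under the assumption of the standard 16-bit MOD R/M table, where MOD = 00 with R/M = 110 means a direct 16-bit address; the function name ''operand16'' is hypothetical.

```python
# Memory operand selected by the R/M field in 16-bit addressing.
RM16 = ["[BX+SI]", "[BX+DI]", "[BP+SI]", "[BP+DI]", "[SI]", "[DI]", "[BP]", "[BX]"]


def operand16(modrm):
    """Describe the R/M operand of a MOD R/M byte in 16-bit addressing."""
    mod = (modrm >> 6) & 0b11
    rm = modrm & 0b111
    if mod == 0b11:
        return f"register {rm}"
    if mod == 0b00:
        # The slot that would be plain [BP] is used for a direct address.
        return "[disp16]" if rm == 0b110 else RM16[rm]
    disp = "disp8" if mod == 0b01 else "disp16"
    return RM16[rm][:-1] + "+" + disp + "]"


if __name__ == "__main__":
    for byte in (0x14, 0x11, 0x56, 0x54, 0x76, 0x06):
        print(hex(byte), operand16(byte))
```

Because 0x06 decodes to a direct address, the assembler has to encode ''mov si,[bp]'' with MOD = 01 and a zero displacement byte, as the last example shows.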
Line 292: Line 299:
 | 64-bit index register  | RAX    | RCX    | RDX    | RBX    | *      | RBP    | RSI    | RDI    | | 64-bit index register  | RAX    | RCX    | RDX    | RBX    | *      | RBP    | RSI    | RDI    |
 || ||
-^ Bits B.Index           ^ 1.000  ^ 1.001  ^ 1.010  ^ 1.011  ^ 1.100  ^ 1.101  ^ 1.110  ^ 1.111  ^+^ Bits X.Index           ^ 1.000  ^ 1.001  ^ 1.010  ^ 1.011  ^ 1.100  ^ 1.101  ^ 1.110  ^ 1.111  ^
 | 32-bit index register  | R8     | R9     | R10    | R11    | R12    | R13    | R14    | R15    | | 32-bit index register  | R8     | R9     | R10    | R11    | R12    | R13    | R14    | R15    |
  
Line 311: Line 318:
 <code asm> <code asm>
 ;MOD R/M (second byte) is 0x04 for all instructions: ;MOD R/M (second byte) is 0x04 for all instructions:
-                       ;                    MOD REG R/M   REG  MOD & R/M +                     ;                    MOD REG R/M   REG  MOD & R/M 
-                       ;                     00 000 100   eax  SIB is present+                     ;                     00 000 100   eax  SIB is present
  
 ;SIB (third byte) is 0x0B, 0x4B, 0x8B or 0xCB: ;SIB (third byte) is 0x0B, 0x4B, 0x8B or 0xCB:
-                                        Scale Index Base  Scale Index Base +                                     Scale Index Base Scale Index Base 
-mov eax, [ebx+ecx]     ;0x8B, 0x04, 0x0B     00   001  011     x1   ecx  ebx +mov eax, [ebx+ecx]   ;0x8B, 0x04, 0x0B    00   001  011    x1   ecx  ebx 
-mov eax, [ebx+ecx*2]   ;0x8B, 0x04, 0x4B     01   001  011     x2   ecx  ebx +mov eax, [ebx+ecx*2] ;0x8B, 0x04, 0x4B    01   001  011    x2   ecx  ebx 
-mov eax, [ebx+ecx*4]   ;0x8B, 0x04, 0x8B     10   001  011     x4   ecx  ebx +mov eax, [ebx+ecx*4] ;0x8B, 0x04, 0x8B    10   001  011    x4   ecx  ebx 
-mov eax, [ebx+ecx*8]   ;0x8B, 0x04, 0xCB     11   001  011     x8   ecx  ebx+mov eax, [ebx+ecx*8] ;0x8B, 0x04, 0xCB    11   001  011    x8   ecx  ebx
 </code> </code>
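
The Scale, Index and Base columns above follow directly from the bit layout of the SIB byte. The sketch below decodes them in Python; the function name ''decode_sib'' is hypothetical, and the special cases (index 100 meaning no index, base 101 with MOD = 00 meaning a 32-bit displacement) are deliberately left out.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_sib(sib):
    """Return (scale factor, index register, base register) of a SIB byte."""
    scale = 1 << ((sib >> 6) & 0b11)   # 00 -> x1, 01 -> x2, 10 -> x4, 11 -> x8
    index = (sib >> 3) & 0b111
    base = sib & 0b111
    return scale, REG32[index], REG32[base]


if __name__ == "__main__":
    for sib in (0x0B, 0x4B, 0x8B, 0xCB):
        scale, index, base = decode_sib(sib)
        print(f"{sib:#04x}: [{base}+{index}*{scale}]")
```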
  
Line 326: Line 333:
 <code asm> <code asm>
 ;REX prefix (first byte) is 0x48 for all instructions: ;REX prefix (first byte) is 0x48 for all instructions:
-                       ;                   7   6   5   4   3   2   1   0 +                     ;                   7   6   5   4   3   2   1   0
-                       ;                 +---+---+---+---+---+---+---+---+ +                     ;                 +---+---+---+---+---+---+---+---+
-                       ;                 | 0 | 1 | 0 | 0 | W | R | X | B | +                     ;                 | 0 | 1 | 0 | 0 | W | R | X | B |
-                       ;                 +---+---+---+---+---+---+---+---+ +                     ;                 +---+---+---+---+---+---+---+---+
-                       ;                   0   1   0   0   1   0   0   0 +                     ;                   0   1   0   0   1   0   0   0
  
 +;MOD R/M (second byte) is 0x04 for all instructions:
 +                     ;                    MOD R.REG R/M   REG  MOD & R/M
 +                     ;                     00 0.000 100   eax  SIB is present
  
-                                              Scale X.Index B.Base  Scale Index Base +                                           Scale X.Index B.Base Scale Index Base 
-mov rax, [rbx+rcx]     ;0x48, 0x8B, 0x04, 0x0B     00   0.001  0.011     x1   ecx  ebx +mov rax, [rbx+rcx]   ;0x48, 0x8B, 0x04, 0x0B    00   0.001  0.011    x1   rcx  rbx 
-mov rax, [rbx+rcx*2]   ;0x48, 0x8B, 0x04, 0x4B     01   0.001  0.011     x2   ecx  ebx +mov rax, [rbx+rcx*2] ;0x48, 0x8B, 0x04, 0x4B    01   0.001  0.011    x2   rcx  rbx 
-mov rax, [rbx+rcx*4]   ;0x48, 0x8B, 0x04, 0x8B     10   0.001  0.011     x4   ecx  ebx +mov rax, [rbx+rcx*4] ;0x48, 0x8B, 0x04, 0x8B    10   0.001  0.011    x4   rcx  rbx 
-mov rax, [rbx+rcx*8]   ;0x48, 0x8B, 0x04, 0xCB     11   0.001  0.011     x8   ecx  ebx+mov rax, [rbx+rcx*8] ;0x48, 0x8B, 0x04, 0xCB    11   0.001  0.011    x8   rcx  rbx
 </code> </code>
 +
 +If any of the new registers (R8-R15) is used in the instruction, it changes the bits in the REX prefix.
 +
 +<code asm>
 +                     ;                       Scale X.Index B.Base Scale Index Base
 +mov rax, [r10+rcx]   ;0x49, 0x8B, 0x04, 0x0A    00   0.001  1.010    x1   rcx  r10
 +mov rax, [rbx+r11]   ;0x4A, 0x8B, 0x04, 0x1B    00   1.011  0.011    x1   r11  rbx
 +mov r12, [rbx+rcx]   ;0x4C, 0x8B, 0x24, 0x0B    00   0.001  0.011    x1   rcx  rbx
 +
 +                     ;Last instruction has the MOD R/M REG field extended 
 +                     ;by the R bit from the REX prefix.
 +                     ;                    MOD R.REG R/M   REG  MOD & R/M
 +                     ;                     00 1.100 100   r12  SIB is present
 +</code>
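
How the R, X and B bits of the REX prefix extend the 3-bit register fields can be checked with a short decoder. This is a sketch only: the function name ''decode_rex_modrm_sib'' is hypothetical, it assumes a MOD R/M byte that is followed by a SIB byte, and it ignores the special cases of the SIB encoding.

```python
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_rex_modrm_sib(rex, modrm, sib):
    """Return (reg, base, index, scale) using the REX extension bits."""
    if rex & 0xF0 != 0x40:
        raise ValueError("not a REX prefix")
    r = (rex >> 2) & 1   # extends the MOD R/M REG field
    x = (rex >> 1) & 1   # extends the SIB Index field
    b = rex & 1          # extends the SIB Base field
    reg = (r << 3) | ((modrm >> 3) & 0b111)
    index = (x << 3) | ((sib >> 3) & 0b111)
    base = (b << 3) | (sib & 0b111)
    scale = 1 << ((sib >> 6) & 0b11)
    return REG64[reg], REG64[base], REG64[index], scale


if __name__ == "__main__":
    for rex, modrm, sib in [(0x48, 0x04, 0x0B), (0x49, 0x04, 0x0A),
                            (0x4A, 0x04, 0x1B), (0x4C, 0x24, 0x0B)]:
        reg, base, index, scale = decode_rex_modrm_sib(rex, modrm, sib)
        print(f"mov {reg}, [{base}+{index}*{scale}]")
```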
 +
 +Certainly, the presented examples do not exhaust all possible situations. For a more detailed explanation, please refer to the documentation by AMD((https://docs.amd.com/v/u/en-US/40332-PUB_4.08)), Intel((https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html)), OSDev wiki((https://wiki.osdev.org/X86-64)) or other interesting sources mentioned at the bottom of this section.
 =====Displacement===== =====Displacement=====
 Displacement gives the offset for memory operands. Depending on the addressing mode, it can be the direct memory address or an additional offset added to the contents of the base, index register or both. Displacement can be 1, 2, or 4 bytes long. Some instructions allow using an 8-byte displacement. In these instructions, there is no immediate field. Displacement gives the offset for memory operands. Depending on the addressing mode, it can be the direct memory address or an additional offset added to the contents of the base, index register or both. Displacement can be 1, 2, or 4 bytes long. Some instructions allow using an 8-byte displacement. In these instructions, there is no immediate field.
en/multiasm/papc/chapter_6_6.1754029082.txt.gz · Last modified: by ktokarz
CC Attribution-Share Alike 4.0 International