Differences

This shows you the differences between two versions of the page.

en:multiasm:papc:chapter_6_7 [2025/10/22 11:08] – [BMI1 and BMI2 Instructions] ktokarz
en:multiasm:papc:chapter_6_7 [2026/04/01 14:13] (current) – [Instruction Set of x64 - Essentials] ktokarz
Line 1: Line 1:
-====== Instruction Set of x86 - Essentials ======
 +====== Instruction Set of x64 - Essentials ======
 +The x64 processors can execute an extensive number of different instructions. As processors have evolved, the instruction set has expanded from the initial 117 in the 8086 processor to over 1000 in modern 64-bit designs. In this chapter, we present the instruction groups and a description of essential instructions called general-purpose instructions.
 ===== Instruction groups =====
-The x64 processors can execute an extensive number of different instructions. In the documentation of processors, we can find several ways of dividing all instructions into groups. The most general division, according to AMD, defines five groups of instructions:
 +In the documentation of processors, we can find several ways of dividing all instructions into groups. The most general division, according to AMD, defines five groups of instructions:
   * General Purpose instructions
   * System instructions
Line 105: Line 107:
 In the **mov** instruction, the size of the source argument must be the same as the size of the destination argument. Arguments can be stored in registers or in memory addressed directly or indirectly. One of them can be a constant (immediate). Only one memory argument is allowed. This restriction comes from the instruction encoding, which has room for only one direct or indirect memory argument. That's why most instructions, not only **mov**, can operate with one memory argument only. There are some exceptions, for example, string instructions, but such instructions use specific indirect addressing.
 <code asm>
-mov al, 100        ;0xB0, 0x64         copy constant (immediate) of the value 100 (0x64) to al
-mov al, [bx]       ;0x67, 0x8A, 0x07   copy byte from the memory at address stored in bx to al (indirect addressing)
 +mov al, 100        ;0xB0, 0x64
 +                   ;copy constant (immediate) of the value 100 (0x64) to al
 +
 +mov al, [bx]       ;0x67, 0x8A, 0x07
 +                   ;copy byte from the memory at address stored in bx to al
 +                   ;(indirect addressing)
 
 ;Notice the difference between two following instructions
-mov eax, 100       ;0xB8, 0x64, 0x00, 0x00, 0x00   copy constant 100 to eax
-mov eax, [100]     ;0xA1, 0x64, 0x00, 0x00, 0x00   copy value from memory at address 100
 +mov eax, 100       ;0xB8, 0x64, 0x00, 0x00, 0x00
 +                   ;copy constant 100 to eax
 +
 +mov eax, [100]     ;0xA1, 0x64, 0x00, 0x00, 0x00
 +                   ;copy value from memory at address 100
 
 ;It is possible to copy a constant to memory addressed directly or indirectly
-;operand size specifier dword ptr is required to inform the processor about the size of the argument
-mov dword ptr ds:[200], 100   ;0xC7, 0x05, 0xC8, 0x00, 0x00, 0x00, 0x64, 0x00, 0x00, 0x00
-                              ;copy value of 100, encoded as dword (four bytes), 0x64 = 100
-                              ;to memory at address 200, encoded as four bytes,  0xC8 = 200
 +;operand size specifier dword ptr is required
 +;to inform the processor about the size of the argument
 +mov dword ptr ds:[200], 100
 +                   ;0xC7, 0x05, 0xC8, 0x00, 0x00, 0x00, 0x64, 0x00, 0x00, 0x00
 +                   ;copy value of 100, encoded as dword (four bytes), 0x64 = 100
 +                   ;to memory at address 200, encoded as four bytes,  0xC8 = 200
 
-mov dword ptr [ebx], 100      ;0xC7, 0x03, 0x64, 0x00, 0x00, 0x00
-                              ;copy value of 100, encoded as dword (four bytes), 0x64 = 100
-                              ;to memory addressed by ebx
 +mov dword ptr [ebx], 100
 +                   ;0xC7, 0x03, 0x64, 0x00, 0x00, 0x00
 +                   ;copy value of 100, encoded as dword (four bytes), 0x64 = 100
 +                   ;to memory addressed by ebx
 </code>
 ==== Conditional move ====
Line 159: Line 171:
   * **cwde** - converts word in ax to doubleword extended in eax
   * **cdq** - converts doubleword in eax to quadword in edx:eax
 +  * **cdqe** - converts doubleword in eax to quadword in rax
 +  * **cqo** - converts quadword in rax to double quadword in rdx:rax
 
-Sign extension instructions work solely with the accumulator. Fortunately, there are also more universal instructions which copy and extex data at the same time.
 +Sign extension instructions work solely with the accumulator. Fortunately, there are also more universal instructions which copy and extend data at the same time.
   * **movsx** - copies and sign-extends a byte to a word or doubleword or word to doubleword.
   * **movzx** - copies and zero-extends a byte to a word or doubleword or word to doubleword.
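
The difference between sign extension and zero extension can be illustrated with a small model in portable C. This is only a sketch and not part of the original chapter: the function names are hypothetical, and the code imitates the results of **movsx**, **movzx** and **cdq** for one operand size each rather than executing the instructions themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "movsx r32, r/m8": copy a byte and replicate its sign bit (bit 7). */
uint32_t model_movsx_8_to_32(uint8_t src) {
    uint32_t value = src;
    return (value & 0x80u) ? (0xFFFFFF00u | value) : value;
}

/* Model of "movzx r32, r/m8": copy a byte and fill the upper 24 bits with zeros. */
uint32_t model_movzx_8_to_32(uint8_t src) {
    return (uint32_t)src;
}

/* Model of "cdq": the value placed in edx is 32 copies of the sign bit of eax. */
uint32_t model_cdq_edx(uint32_t eax) {
    return (eax & 0x80000000u) ? 0xFFFFFFFFu : 0u;
}

/* Usage: the byte 0x80 is -128 when treated as signed and 128 when unsigned. */
void check_extension_models(void) {
    assert(model_movsx_8_to_32(0x80u) == 0xFFFFFF80u);
    assert(model_movzx_8_to_32(0x80u) == 0x00000080u);
    assert(model_cdq_edx(0xFFFFFFFEu) == 0xFFFFFFFFu); /* eax = -2 */
    assert(model_cdq_edx(2u) == 0u);
}
```

The same bit pattern gives different extended values, so the choice between **movsx** and **movzx** depends on whether the data is interpreted as signed or unsigned.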
Line 283: Line 297:
  
 The **set//cc//** instruction sets the argument to 1 if the chosen condition is met, or clears the argument if the condition is not met. The condition can be freely chosen from the set of conditions available for other instructions, for example, **cmov//cc//**. This instruction is useful to convert the result of the operation into the Boolean representation.
 +
 +The **popcnt** instruction counts the number of bits equal to "1" in its source argument. The applications of this instruction include genome mining, handwriting recognition, digital health workloads, and fast Hamming distance counts((https://patents.google.com/patent/US8214414)).
 +
 +The **crc32** instruction implements the calculation of the cyclic redundancy check in hardware. It uses a fixed polynomial with the value 11EDC6F41h.
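
As a rough illustration of what these two instructions compute, the sketch below models them in portable C. It is not taken from the original chapter and all function names are hypothetical. The byte-wise step imitates **crc32** with an 8-bit source, assuming the bit-reflected form of the algorithm. The all-ones initial value and the final inversion in `crc32c` are a common software convention and not part of the instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of popcnt: count the bits equal to 1 by clearing the lowest set bit
   until no set bits are left. */
unsigned model_popcnt(uint64_t x) {
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
}

/* Hamming distance: number of bit positions in which a and b differ. */
unsigned hamming_distance(uint64_t a, uint64_t b) {
    return model_popcnt(a ^ b);
}

/* Model of "crc32 r32, r/m8": fold one byte into the running CRC.
   0x82F63B78 is the polynomial 11EDC6F41h with the top term dropped and the
   remaining 32 bits reversed, as used by the bit-reflected algorithm. */
uint32_t model_crc32_u8(uint32_t crc, uint8_t data) {
    crc ^= data;
    for (int i = 0; i < 8; i++) {
        crc = (crc & 1u) ? ((crc >> 1) ^ 0x82F63B78u) : (crc >> 1);
    }
    return crc;
}

/* CRC-32C of a buffer, with the conventional initial value and final inversion. */
uint32_t crc32c(const void *buf, size_t len) {
    const uint8_t *p = buf;
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc = model_crc32_u8(crc, p[i]);
    }
    return ~crc;
}

/* Usage: "123456789" is the customary CRC test string. */
void check_popcnt_crc_models(void) {
    assert(hamming_distance(0xBu, 0x2u) == 2);   /* 1011 and 0010 differ in 2 bits */
    assert(crc32c("123456789", 9) == 0xE3069283u);
}
```

The hardware instructions do the same work in a single step per operand, which is why they are attractive for checksum and bit-counting workloads.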
 +
 ===== Control transfer instructions =====
 Before describing the instructions used for control transfer, we will discuss how the destination address can be calculated. The destination address is the address given to the processor to make a jump to.
Line 383: Line 402:
 ==== String compare ====
 Strings can be compared, which means that the element of the destination string is compared with the element of the source string. These instructions set the status flags in the flags register according to the result of the comparison. The elements of both strings remain unchanged.
-The **cmps** instruction compares the element of a source string with the element of the destination string. It requires one argument, which specifies the size of the accumulator and the data element.
 +The **cmps** instruction compares the element of a source string with the element of the destination string. It requires two arguments, which specify the size of the data elements.
 The **cmpsb** instruction compares a byte from the source string with a byte from the destination string.
 The **cmpsw** instruction compares a word from the source string with a word from the destination string.
Line 531: Line 550:
 Other instructions manipulate bits, as the group name states.
 
-LZCNT counts zeros from the most significant bit (MSB), while TZCNT counts from the least significant bit (LSB). For a non-zero inputLZCNT returns the number of zeros before the first 1 from the left, and TZCNT returns the number of zeros before the first 1 from the right.
 +The **lzcnt** instruction counts the number of zeros in an argument starting from the most significant bit. The **tzcnt** instruction counts zeros starting from the least significant bit. For an argument that is not zero, **lzcnt** returns the number of zeros before the first 1 from the left, and **tzcnt** gives the number of zeros before the first 1 from the right.
 +The **bextr** instruction copies a contiguous field of bits from the source argument into the least significant bits of the destination argument, starting at the chosen position. The third argument specifies both the starting bit position and the number of bits: bits 7:0 specify the starting bit position, while bits 15:8 specify the maximum number of bits to extract, as shown in figure {{ref>bextr_instr}}.
 + 
 +<figure bextr_instr> 
 +{{ :en:multiasm:cs:bextr.png?400 |Illustration of bit extraction instruction}} 
 +<caption>Illustration of bit extraction instruction</caption> 
 +</figure> 
 + 
 +The **blsi** instruction extracts the single, lowest bit set to one, as shown in figure {{ref>blsi_instr}}. 
 + 
 +<figure blsi_instr> 
 +{{ :en:multiasm:cs:blsi.png?400 |Illustration of the lowest set bit extraction instruction}} 
 +<caption>Illustration of lowest set bit extraction instruction</caption> 
 +</figure> 
 + 
 +The **blsmsk** instruction sets to 1 all bits from the least significant bit up to and including the lowest bit set to 1 in the source. It is shown in figure {{ref>blsmsk_instr}}.
 + 
 +<figure blsmsk_instr> 
 +{{ :en:multiasm:cs:blsmsk.png?400 |Illustration of the instruction which sets all lower bits below a first bit set to 1.}} 
 +<caption>Illustration of the instruction which sets all lower bits below a first bit set to 1</caption> 
 +</figure> 
 + 
 +The **blsr** instruction resets (clears the bit to zero value) the lowest set bit. It is shown in figure {{ref>blsr_instr}}. 
 + 
 +<figure blsr_instr> 
 +{{ :en:multiasm:cs:blsr.png?400 |Illustration of the instruction which resets a first bit set to 1.}} 
 +<caption>Illustration of the instruction which resets a first bit set to 1</caption> 
 +</figure> 
 + 
 +The **bzhi** instruction resets high bits starting from the specified bit position, as shown in figure {{ref>bzhi_instr}}. 
 + 
 +<figure bzhi_instr> 
 +{{ :en:multiasm:cs:bzhi.png?400 |Illustration of the instruction which resets high bits starting from the specified bit position.}} 
 +<caption>Illustration of the instruction which resets high bits starting from the specified bit position</caption> 
 +</figure> 
 + 
 +The **pdep** instruction performs a parallel deposit of bits using a mask. Its behaviour is shown in figure {{ref>pdep_instr}}. 
 + 
 +<figure pdep_instr> 
 +{{ :en:multiasm:cs:pdep.png?600 |Illustration of the parallel deposit instruction}} 
 +<caption>Illustration of the parallel deposit instruction</caption> 
 +</figure> 
 + 
 +The **pext** instruction performs a parallel extraction of bits using a mask. Its behaviour is shown in figure {{ref>pext_instr}}. 
 + 
 +<figure pext_instr> 
 +{{ :en:multiasm:cs:pext.png?600 |Illustration of the parallel extraction instruction}} 
 +<caption>Illustration of the parallel extraction instruction</caption> 
 +</figure>
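
The behaviour of the bit-manipulation instructions described above can be summarised with a set of 32-bit models in portable C. This is a sketch for illustration only and not part of the original chapter: the function names are hypothetical, and the code reproduces the results of the instructions without using the instructions themselves or modelling their effect on the flags.

```c
#include <assert.h>
#include <stdint.h>

/* lzcnt: zeros above the highest set bit; 32 when x is zero. */
unsigned model_lzcnt(uint32_t x) {
    unsigned n = 0;
    for (uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1) n++;
    return n;
}

/* tzcnt: zeros below the lowest set bit; 32 when x is zero. */
unsigned model_tzcnt(uint32_t x) {
    unsigned n = 0;
    for (uint32_t bit = 1u; bit != 0 && (x & bit) == 0; bit <<= 1) n++;
    return n;
}

/* bextr: control bits 7:0 hold the start position, bits 15:8 the length. */
uint32_t model_bextr(uint32_t src, uint32_t control) {
    unsigned start = control & 0xFFu;
    unsigned len = (control >> 8) & 0xFFu;
    if (start >= 32 || len == 0) return 0;
    uint32_t shifted = src >> start;
    return (len >= 32) ? shifted : (shifted & ((1u << len) - 1u));
}

/* blsi: isolate the lowest set bit. */
uint32_t model_blsi(uint32_t x) { return x & (0u - x); }

/* blsmsk: mask of all bits up to and including the lowest set bit. */
uint32_t model_blsmsk(uint32_t x) { return x ^ (x - 1u); }

/* blsr: clear the lowest set bit. */
uint32_t model_blsr(uint32_t x) { return x & (x - 1u); }

/* bzhi: clear all bits at position index and above. */
uint32_t model_bzhi(uint32_t src, uint32_t index) {
    unsigned n = index & 0xFFu;
    return (n >= 32) ? src : (src & ((1u << n) - 1u));
}

/* pdep: scatter the low bits of src to the positions of the set bits in mask. */
uint32_t model_pdep(uint32_t src, uint32_t mask) {
    uint32_t result = 0;
    for (uint32_t bit = 1u; mask != 0; bit <<= 1) {
        uint32_t lowest = mask & (0u - mask);
        if (src & bit) result |= lowest;
        mask &= mask - 1u;
    }
    return result;
}

/* pext: gather the bits of src selected by mask into the low bits of the result. */
uint32_t model_pext(uint32_t src, uint32_t mask) {
    uint32_t result = 0;
    for (uint32_t bit = 1u; mask != 0; bit <<= 1) {
        uint32_t lowest = mask & (0u - mask);
        if (src & lowest) result |= bit;
        mask &= mask - 1u;
    }
    return result;
}

/* Usage: 0x58 is 0101 1000 in binary, 0x1A is 0001 1010. */
void check_bmi_models(void) {
    assert(model_lzcnt(0x00010000u) == 15 && model_tzcnt(0x00010000u) == 16);
    assert(model_bextr(0x12345678u, 0x0804u) == 0x67u); /* 8 bits from bit 4 */
    assert(model_blsi(0x58u) == 0x08u);
    assert(model_blsmsk(0x58u) == 0x0Fu);
    assert(model_blsr(0x58u) == 0x50u);
    assert(model_bzhi(0xFFFFFFFFu, 8u) == 0xFFu);
    assert(model_pdep(0x5u, 0x1Au) == 0x12u);
    assert(model_pext(0x12u, 0x1Au) == 0x5u);
}
```

In this model **pext** undoes **pdep** for the same mask, which matches the way the two instructions are paired in the description above.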
  
-BEXTR Contiguous bitwise extract. 
-BLSI Extract lowest set bit. 
-BLSMSK Set all lower bits below first set bit to 1. 
-BLSR Reset lowest set bit. 
-BZHI Zero high bits starting from specified bit position. 
-LZCNT Count the number leading zero bits. 
-PDEP Parallel deposit of bits using a mask. 
-PEXT Parallel extraction of bits using a mask. 
-TZCNT Count the number trailing zero bits. 
en/multiasm/papc/chapter_6_7.1761120494.txt.gz · Last modified: by ktokarz
CC Attribution-Share Alike 4.0 International