Differences

This shows you the differences between two versions of the page.

en:multiasm:papc:chapter_6_7 [2025/10/23 12:10] – [Bit and Byte Instructions] ktokarz
en:multiasm:papc:chapter_6_7 [2026/04/01 14:13] (current) – [Instruction Set of x64 - Essentials] ktokarz
Line 1: Line 1:
-====== Instruction Set of x86 - Essentials ======+====== Instruction Set of x64 - Essentials ====== 
 +The x64 processors can execute an extensive number of different instructions. As processors have evolved, the instruction set has expanded from the initial 117 in the 8086 processor to over 1000 in modern 64-bit designs. In this chapter, we present the instruction groups and a description of essential instructions called general-purpose instructions. 
 ===== Instruction groups ===== ===== Instruction groups =====
-The x64 processors can execute an extensive number of different instructions. In the documentation of processors, we can find several ways of dividing all instructions into groups. The most general division, according to AMD, defines five groups of instructions:+In the documentation of processors, we can find several ways of dividing all instructions into groups. The most general division, according to AMD, defines five groups of instructions:
   * General Purpose instructions   * General Purpose instructions
   * System instructions   * System instructions
Line 105: Line 107:
 In the **mov** instruction, the size of the source argument must be the same as the size of the destination argument. Arguments can be stored in registers, in memory addressed directly or indirectly. One of them can be constant (immediate). Only one memory argument is allowed. This comes from instructions encoding. In instructions, there is only one possible direct or indirect argument to be encoded. That's why most instructions, not only **mov**, can operate with one memory argument only. There are some exceptions, for example, string instructions, but such instructions use specific indirect addressing. In the **mov** instruction, the size of the source argument must be the same as the size of the destination argument. Arguments can be stored in registers, in memory addressed directly or indirectly. One of them can be constant (immediate). Only one memory argument is allowed. This comes from instructions encoding. In instructions, there is only one possible direct or indirect argument to be encoded. That's why most instructions, not only **mov**, can operate with one memory argument only. There are some exceptions, for example, string instructions, but such instructions use specific indirect addressing.
 <code asm> <code asm>
-mov al, 100        ;0xB0, 0x64         copy constant (immediate) of the value 100 (0x64) to al +mov al, 100        ;0xB0, 0x64 
-mov al, [bx]       ;0x67, 0x8A, 0x07   copy byte from the memory at address stored in bx to al (indirect addressing)+                   ;copy constant (immediate) of the value 100 (0x64) to al 
 +                    
 +mov al, [bx]       ;0x67, 0x8A, 0x07 
 +                   ;copy byte from the memory at address stored in bx to al  
 +                   ;(indirect addressing)
  
 ;Notice the difference between two following instructions ;Notice the difference between two following instructions
-mov eax, 100       ;0xB8, 0x64, 0x00, 0x00, 0x00   copy constant 100 to eax +mov eax, 100       ;0xB8, 0x64, 0x00, 0x00, 0x00 
-mov eax, [100]     ;0xA1, 0x64, 0x00, 0x00, 0x00   copy value from memory at address 100+                   ;copy constant 100 to eax 
 +                    
 +mov eax, [100]     ;0xA1, 0x64, 0x00, 0x00, 0x00    
 +                   ;copy value from memory at address 100
  
 ;It is possible to copy a constant to memory addressed directly or indirectly ;It is possible to copy a constant to memory addressed directly or indirectly
-;operand size specifier dword ptr is required to inform the processor about the size of the argument +;operand size specifier dword ptr is required  
-mov dword ptr ds:[200], 100   ;0xC7, 0x05, 0xC8, 0x00, 0x00, 0x00, 0x64, 0x00, 0x00, 0x00 +;to inform the processor about the size of the argument 
-                              ;copy value of 100, encoded as dword (four bytes), 0x64 = 100 +mov dword ptr ds:[200], 100    
-                              ;to memory at address 200, encoded as four bytes,  0xC8 = 200+                   ;0xC7, 0x05, 0xC8, 0x00, 0x00, 0x00, 0x64, 0x00, 0x00, 0x00 
 +                   ;copy value of 100, encoded as dword (four bytes), 0x64 = 100 
 +                   ;to memory at address 200, encoded as four bytes,  0xC8 = 200
                                                              
-mov dword ptr [ebx], 100      ;0xC7, 0x03, 0x64, 0x00, 0x00, 0x00 +mov dword ptr [ebx], 100 
-                              ;copy value of 100, encoded as dword (four bytes), 0x64 = 100 +                   ;0xC7, 0x03, 0x64, 0x00, 0x00, 0x00 
-                              ;to memory addressed by ebx +                   ;copy value of 100, encoded as dword (four bytes), 0x64 = 100 
 +                   ;to memory addressed by ebx 
 </code> </code>
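
Because only one memory argument can be encoded, copying a value from one memory location to another is usually split into two **mov** instructions that go through a register. The fragment below is a minimal sketch in the same syntax as the examples above; the choice of esi, edi and eax is only illustrative, and eax is overwritten in the process.
<code asm>
;mov dword ptr [edi], dword ptr [esi]   ;not encodable: two memory arguments

mov eax, dword ptr [esi]    ;load the doubleword from memory into a register
mov dword ptr [edi], eax    ;store it to the second memory location
</code>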
 ==== Conditional move ==== ==== Conditional move ====
Line 159: Line 171:
   * **cwde** - converts word in ax to doubleword extended in eax   * **cwde** - converts word in ax to doubleword extended in eax
   * **cdq** - converts doubleword in eax to quadword in edx:eax   * **cdq** - converts doubleword in eax to quadword in edx:eax
 +  * **cdqe** - converts doubleword in eax to quadword in rax
 +  * **cqo** - converts quadword in rax to double quadword in rdx:rax
  
-Sign extension instructions work solely with the accumulator. Fortunately, there are also more universal instructions which copy and extex data at the same time. +Sign extension instructions work solely with the accumulator. Fortunately, there are also more universal instructions which copy and extend data at the same time. 
   * **movsx** - copies and sign-extends a byte to a word or doubleword or word to doubleword.   * **movsx** - copies and sign-extends a byte to a word or doubleword or word to doubleword.
   * **movzx** - copies and zero-extends a byte to a word or doubleword or word to doubleword.   * **movzx** - copies and zero-extends a byte to a word or doubleword or word to doubleword.
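
The difference between the accumulator-only instructions and the more universal **movsx** and **movzx** can be illustrated with a short fragment. It is a sketch only, assuming 64-bit mode; the registers and values are chosen for illustration.
<code asm>
mov   eax, -2          ;eax = 0FFFFFFFEh, upper half of rax is cleared
cdqe                   ;rax = 0FFFFFFFFFFFFFFFEh, sign bit of eax copied upwards
cqo                    ;rdx = 0FFFFFFFFFFFFFFFFh, rdx:rax holds -2 as 128 bits

mov   bl, 80h          ;-128 as a signed byte, 128 as an unsigned byte
movsx ecx, bl          ;ecx = 0FFFFFF80h, sign-extended, still -128
movzx esi, bl          ;esi = 00000080h, zero-extended, now 128
</code>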
Line 285: Line 299:
  
 The **popcnt** instruction counts the number of bits equal to "1" in an argument. The applications of this instruction include genome mining, handwriting recognition, digital health workloads, and fast Hamming distance counts((https://patents.google.com/patent/US8214414)). The **popcnt** instruction counts the number of bits equal to "1" in an argument. The applications of this instruction include genome mining, handwriting recognition, digital health workloads, and fast Hamming distance counts((https://patents.google.com/patent/US8214414)).
 +
 +The **crc32** instruction implements the calculation of the cyclic redundancy check in hardware. The polynomial is fixed and has the value 11EDC6F41h (the CRC-32C, or Castagnoli, polynomial).
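
The following fragment sketches how these two instructions might be used. The values are illustrative. For **crc32**, the initial value of all ones and the final inversion follow a commonly used CRC-32C convention and are an assumption of this sketch, not something the instruction itself requires; rsi is assumed to point to the data.
<code asm>
;Hamming distance of two values: count the differing bits
mov    eax, 10110100b
mov    edx, 10011100b
xor    eax, edx        ;eax = 00101000b, ones mark the differing bits
popcnt ecx, eax        ;ecx = 2

;crc32 accumulates the checksum in the destination register
mov    eax, 0FFFFFFFFh            ;initial value (assumed convention)
crc32  eax, byte ptr [rsi]        ;process one byte
crc32  eax, dword ptr [rsi+1]     ;process the next four bytes
not    eax                        ;final inversion (assumed convention)
</code>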
  
 ===== Control transfer instructions ===== ===== Control transfer instructions =====
Line 386: Line 402:
 ==== String compare ==== ==== String compare ====
 Strings can be compared, which means that the element of the destination string is compared with the element of the source string. These instructions set the status flags in the flags register according to the result of the comparison. The elements of both strings remain unchanged. Strings can be compared, which means that the element of the destination string is compared with the element of the source string. These instructions set the status flags in the flags register according to the result of the comparison. The elements of both strings remain unchanged.
-The **cmps** instruction compares the element of a source string with the element of the destination string. It requires one argument, which specifies the size of the accumulator and the data element.+The **cmps** instruction compares the element of a source string with the element of the destination string. It requires two arguments, which specify the size of the data elements.
 The **cmpsb** instruction compares a byte from the source string with a byte from the destination string. The **cmpsb** instruction compares a byte from the source string with a byte from the destination string.
 The **cmpsw** instruction compares a word from the source string with a word from the destination string. The **cmpsw** instruction compares a word from the source string with a word from the destination string.
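
A single comparison step can be sketched as follows, assuming 64-bit mode, where the source element is addressed by rsi and the destination element by rdi. The labels ''first'', ''second'' and ''different'' are hypothetical names used only for this illustration.
<code asm>
cld                    ;direction flag = 0, addresses will be incremented
lea   rsi, first       ;address of the source string
lea   rdi, second      ;address of the destination string
cmpsb                  ;set flags as for byte [rsi] - byte [rdi],
                       ;then advance rsi and rdi by 1
jne   different        ;ZF = 0 means that the two bytes differ
</code>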
Line 535: Line 551:
  
 The **lzcnt** instruction counts the number of zeros in an argument starting from the most significant bit. The **tzcnt** counts zeros starting from the least significant bit. For an argument that is not zero, **lzcnt** returns the number of zeros before the first 1 from the left, and **tzcnt** gives the number of zeros before the first 1 from the right.  The **lzcnt** instruction counts the number of zeros in an argument starting from the most significant bit. The **tzcnt** counts zeros starting from the least significant bit. For an argument that is not zero, **lzcnt** returns the number of zeros before the first 1 from the left, and **tzcnt** gives the number of zeros before the first 1 from the right. 
-The **bextr** instruction copies the number of bits from source to destination arguments starting at the chosen position. The third argument specifies the number of bits and the starting bit position. Bits 7:0 of the third operand specify the starting bit position, while bits 15:8 specify the maximum number of bits to extract. +The **bextr** instruction copies the number of bits from source to destination arguments starting at the chosen position. The third argument specifies the number of bits and the starting bit position. Bits 7:0 of the third operand specify the starting bit position, while bits 15:8 specify the maximum number of bits to extract, as shown in figure {{ref>bextr_instr}}.
  
-BEXTR Contiguous bitwise extract. +<figure bextr_instr> 
-BLSI Extract lowest set bit. +{{ :en:multiasm:cs:bextr.png?400 |Illustration of bit extraction instruction}} 
-BLSMSK Set all lower bits below first set bit to 1. +<caption>Illustration of bit extraction instruction</caption> 
-BLSR Reset lowest set bit. +</figure>
-BZHI Zero high bits starting from specified bit position.+
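
The behaviour of **lzcnt**, **tzcnt** and **bextr** can be checked on concrete values. The fragment is a sketch that assumes a processor supporting these extensions (LZCNT and BMI1); registers and constants are illustrative.
<code asm>
mov   eax, 00010000h   ;only bit 16 is set
lzcnt ecx, eax         ;ecx = 15, bits 31..17 are zero
tzcnt edx, eax         ;edx = 16, bits 15..0 are zero

mov   eax, 12345678h   ;source
mov   ecx, 0804h       ;control: start = 4 (bits 7:0), length = 8 (bits 15:8)
bextr edx, eax, ecx    ;edx = 00000067h, bits 11..4 of eax
</code>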
  
-PDEP Parallel deposit of bits using a mask. +The **blsi** instruction extracts the single, lowest bit set to one, as shown in figure {{ref>blsi_instr}}. 
-PEXT Parallel extraction of bits using a mask.+ 
 +<figure blsi_instr> 
 +{{ :en:multiasm:cs:blsi.png?400 |Illustration of the lowest set bit extraction instruction}} 
 +<caption>Illustration of the lowest set bit extraction instruction</caption> 
 +</figure> 
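
A short fragment illustrating **blsi**, as a sketch assuming a processor with the BMI1 extension; the value is chosen for illustration.
<code asm>
mov  eax, 01101000b
blsi ecx, eax          ;ecx = 00001000b, the same result as eax and (-eax)
</code>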
 + 
 +The **blsmsk** instruction creates a mask with all bits set to 1 up to and including the position of the lowest set bit of the source. It is shown in figure {{ref>blsmsk_instr}}.
 + 
 +<figure blsmsk_instr> 
 +{{ :en:multiasm:cs:blsmsk.png?400 |Illustration of the instruction which sets all lower bits below a first bit set to 1.}} 
 +<caption>Illustration of the instruction which sets all lower bits below a first bit set to 1</caption> 
 +</figure> 
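
A short fragment illustrating **blsmsk**, as a sketch assuming a processor with the BMI1 extension. Note that the position of the lowest set bit itself is included in the resulting mask.
<code asm>
mov    eax, 01101000b
blsmsk ecx, eax        ;ecx = 00001111b, the same result as eax xor (eax - 1)
</code>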
 + 
 +The **blsr** instruction resets (clears to zero) the lowest set bit. It is shown in figure {{ref>blsr_instr}}.
 + 
 +<figure blsr_instr> 
 +{{ :en:multiasm:cs:blsr.png?400 |Illustration of the instruction which resets a first bit set to 1.}} 
 +<caption>Illustration of the instruction which resets a first bit set to 1</caption> 
 +</figure> 
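
A short fragment illustrating **blsr**, as a sketch assuming a processor with the BMI1 extension; the value is chosen for illustration.
<code asm>
mov  eax, 01101000b
blsr ecx, eax          ;ecx = 01100000b, the same result as eax and (eax - 1)
</code>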
 + 
 +The **bzhi** instruction resets high bits starting from the specified bit position, as shown in figure {{ref>bzhi_instr}}. 
 + 
 +<figure bzhi_instr> 
 +{{ :en:multiasm:cs:bzhi.png?400 |Illustration of the instruction which resets high bits starting from the specified bit position.}} 
 +<caption>Illustration of the instruction which resets high bits starting from the specified bit position</caption> 
 +</figure> 
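
A short fragment illustrating **bzhi**, as a sketch assuming a processor with the BMI2 extension. The bit position is taken from the third argument; the values are illustrative.
<code asm>
mov  eax, 0FFFFFFFFh   ;source
mov  ecx, 12           ;bit position
bzhi edx, eax, ecx     ;edx = 00000FFFh, bits 31..12 are cleared
</code>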
 + 
 +The **pdep** instruction performs a parallel deposit of bits using a mask. Its behaviour is shown in figure {{ref>pdep_instr}}.
 + 
 +<figure pdep_instr> 
 +{{ :en:multiasm:cs:pdep.png?600 |Illustration of the parallel deposit instruction}} 
 +<caption>Illustration of the parallel deposit instruction</caption> 
 +</figure> 
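
A short fragment illustrating **pdep**, as a sketch assuming a processor with the BMI2 extension. Consecutive low bits of the source are placed at the positions where the mask contains ones; all other bits of the result are zero.
<code asm>
mov  eax, 00000101b    ;source, its low bits are 1, 0, 1 (bit 0 first)
mov  ecx, 01011000b    ;mask with ones at bit positions 3, 4 and 6
pdep edx, eax, ecx     ;edx = 01001000b
                       ;source bit 0 -> bit 3, bit 1 -> bit 4, bit 2 -> bit 6
</code>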
 + 
 +The **pext** instruction performs a parallel extraction of bits using a mask. Its behaviour is shown in figure {{ref>pext_instr}}. 
 + 
 +<figure pext_instr> 
 +{{ :en:multiasm:cs:pext.png?600 |Illustration of the parallel extraction instruction}} 
 +<caption>Illustration of the parallel extraction instruction</caption> 
 +</figure>
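
A short fragment illustrating **pext**, as a sketch assuming a processor with the BMI2 extension. Bits of the source at the positions where the mask contains ones are gathered into the consecutive low bits of the result.
<code asm>
mov  eax, 01001000b    ;source
mov  ecx, 01011000b    ;mask with ones at bit positions 3, 4 and 6
pext edx, eax, ecx     ;edx = 00000101b
                       ;bit 3 -> result bit 0, bit 4 -> bit 1, bit 6 -> bit 2
</code>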
  
CC Attribution-Share Alike 4.0 International