Differences

This shows you the differences between two versions of the page.

en:multiasm:papc:chapter_6_7 [2026/04/01 14:13] – [Instruction Set of x64 - Essentials] ktokarzen:multiasm:papc:chapter_6_7 [2026/06/22 13:06] (current) pczekalski
Line 1: Line 1:
 ====== Instruction Set of x64 - Essentials ====== ====== Instruction Set of x64 - Essentials ======
-The x64 processors can execute an extensive number of different instructions. As processors have evolved, the instruction set has expanded from the initial 117 in the 8086 processor to over 1000 in modern 64-bit designs. In this chapter, we present the instruction groups and a description of essential instructions called general-purpose instructions.+The x64 processors can execute a wide range of instructions. As processors have evolved, the instruction set has expanded from the initial 117 in the 8086 processor to over 1000 in modern 64-bit designs. In this chapter, we present the instruction groups and describe the essential general-purpose instructions.
  
 ===== Instruction groups ===== ===== Instruction groups =====
-In the documentation of processors, we can find several ways of dividing all instructions into groups. The most general division, according to AMD, defines five groups of instructions:+In the processor documentation, we can find several ways to group all instructions. The most general division, according to AMD, defines five groups of instructions:
   * General Purpose instructions   * General Purpose instructions
   * System instructions   * System instructions
Line 24: Line 24:
   * SMX Instructions   * SMX Instructions
  
-There is also a long list of extensions defined, including SSE4.1, SSE4.2, Intel AVX, AMD 3DNow! and many others. For a detailed description of instruction groups, please refer to +There is also a long list of extensions defined, including SSE4.1, SSE4.2, Intel AVX, AMD 3DNow! and many others. For a detailed description of instruction groups, please refer to: 
-  * "AMD64 Architecture Programmer’s Manual"((https://docs.amd.com/v/u/en-US/40332-PUB_4.08)), +  * "AMD64 Architecture Programmer’s Manual" ((https://docs.amd.com/v/u/en-US/40332-PUB_4.08)), 
-  * "Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 1: Basic Architecture"((https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html)).+  * "Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 1: Basic Architecture" ((https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html)).
  
-Details of every instruction you can find in the description of the instruction set  +Details of every instruction can be found in the description of the instruction set: 
-  * "AMD64 Architecture Programmer’s Manual Volume 3: General Purpose and System Instructions"((https://docs.amd.com/v/u/en-US/24594_3.37)),+  * "AMD64 Architecture Programmer’s Manual Volume 3: General Purpose and System Instructions" ((https://docs.amd.com/v/u/en-US/24594_3.37)),
   * "Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2 (2A, 2B, 2C, & 2D): Instruction Set Reference, A-Z" ((https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html)).    * "Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2 (2A, 2B, 2C, & 2D): Instruction Set Reference, A-Z" ((https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html)). 
  
-There are also specialised websites with detailed explanations of instructions that you can use to get a lot of additional information. Among others, you can visit: +There are also specialised websites with detailed explanations of the instructions, where you can find a lot of additional information. Among others, you can visit: 
   * X86 Opcode and Instruction Reference ((http://ref.x86asm.net/index.html)) by MazeGen,    * X86 Opcode and Instruction Reference ((http://ref.x86asm.net/index.html)) by MazeGen, 
   * x86 and amd64 instruction reference ((https://www.felixcloutier.com/x86/)) by Félix Cloutier.   * x86 and amd64 instruction reference ((https://www.felixcloutier.com/x86/)) by Félix Cloutier.
Line 58: Line 58:
  
 ===== Condition Codes ===== ===== Condition Codes =====
-Before describing instructions, let's present the condition codes. The condition code takes the form of a suffix to the instruction and influences its behaviour in such a way that if the condition is met, the instruction is executed; if the condition is not met, the processor moves on to the next instruction in the program. The condition that is checked during the execution of the conditional instruction is based on the current state of the flags in the EFLAGS register. The flags in the EFLAGS register are modified by instructions, mainly arithmetic, logical, shift, or special flag manipulation instructions. It is important to note that flags are not modified when copying data, so to check whether the value just read is zero, you should perform, for example, a comparison. +Before describing instructions, let's present the condition codes. The condition code is a suffix to the instruction and influences its behaviour: if the condition is met, the instruction is executed; if not, the processor moves on to the next instruction in the program. The condition that is checked during the execution of the conditional instruction is based on the current state of the flags in the EFLAGS register. The flags in the EFLAGS register are modified by instructions, mainly arithmetic, logical, shift, or special flag manipulation instructions. It is important to note that flags are not modified when copying data, so to check whether the value just read is zero, you should perform, for example, a comparison. 
-Condition codes together with flags checked are presented in table {{ref>table_condition_codes}}.+Condition codes, together with the flags checked, are presented in table {{ref>table_condition_codes}}.
  
 <table table_condition_codes> <table table_condition_codes>
Line 96: Line 96:
 </table> </table>
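The remark that copying data leaves the flags untouched can be illustrated with a short fragment. This is only a sketch in the Intel syntax used in this chapter; it assumes that rbx holds the address of a doubleword variable and that ecx holds a replacement value, both chosen purely for illustration.
<code asm>
mov eax, [rbx]     ;copying data does not modify any flag
cmp eax, 0         ;comparison sets ZF = 1 if eax is equal to 0
cmovz eax, ecx     ;the copy is performed only if ZF = 1
</code>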
 ===== Data transfer instructions ===== ===== Data transfer instructions =====
-Almost all assembler tutorials start with the presentation of the **mov** instruction, which is used to copy data from the source operand to the destination operand. Our book is not an exception, and we've already shown this instruction in examples presented in previous sections. +Almost all assembler tutorials start with the presentation of the **mov** instruction, which is used to copy data from the source operand to the destination operand. Our book is no exception, and we've already shown this instruction in the examples presented in previous sections. 
 ==== MOV ==== ==== MOV ====
 Let's look at some additional variants. Let's look at some additional variants.
Line 105: Line 105:
 mov rax, rbx       ;copy quadword (eight bytes) from rbx to rax  mov rax, rbx       ;copy quadword (eight bytes) from rbx to rax 
 </code> </code>
-In the **mov** instruction, the size of the source argument must be the same as the size of the destination argument. Arguments can be stored in registersin memory addressed directly or indirectly. One of them can be constant (immediate). Only one memory argument is allowed. This comes from instructions encoding. In instructions, there is only one possible direct or indirect argument to be encoded. That's why most instructions, not only **mov**, can operate with one memory argument only. There are some exceptions, for example, string instructions, but such instructions use specific indirect addressing.+In the **mov** instruction, the size of the source argument must be the same as the size of the destination argument. Arguments can be stored in registers or in memory, addressed directly or indirectly. One of them can be a constant (immediate). Only one memory argument is allowed. This comes from the instruction encoding: only one direct or indirect memory argument can be encoded in an instruction. That's why most instructions, not only **mov**, can operate on a single memory argument only. There are some exceptions, for example, string instructions, but such instructions use specific indirect addressing.
 <code asm> <code asm>
 mov al, 100        ;0xB0, 0x64 mov al, 100        ;0xB0, 0x64
Line 135: Line 135:
 </code> </code>
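Because only one memory argument can be encoded, copying a value from one memory location to another requires an intermediate register. The fragment below is a minimal sketch; it assumes that rsi and rdi hold the addresses of the source and destination quadwords, which is an assumption made only for this illustration.
<code asm>
;mov [rdi], [rsi]  ;not possible: two memory arguments
mov rax, [rsi]     ;load quadword from memory into rax
mov [rdi], rax     ;store quadword from rax into memory
</code>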
 ==== Conditional move ==== ==== Conditional move ====
-Starting from the P6 machines, the conditional move instruction **cmov//cc//** was introduced. This works similarly to **mov**, but copies data if the specified condition is true. The condition code is one of the codes presented in the section "Condition Codes". If the condition is false, the instruction simply passes through without modifying the arguments. Conditional move instructions can be used to avoid conditional jumps.+Starting from the P6 machines, the conditional move instruction **cmov//cc//** was introduced. This works similarly to **mov**, but copies data if the specified condition is true. The condition code is one of the codes listed in the "Condition Codes" section. If the condition is false, the instruction simply passes through without modifying the arguments. Conditional move instructions can be used to avoid conditional jumps.
 For example, to copy data from ebx to ecx only if the result of the previous operation is negative, we can write the following instruction. For example, to copy data from ebx to ecx only if the result of the previous operation is negative, we can write the following instruction.
 <code asm> <code asm>
Line 158: Line 158:
 </code> </code>
  
-It is visible that to preserve the original value, the upper bits must be filled with ones, not zeros.+It is clear that, to preserve the original value, the upper bits must be set to ones, not zeros.
 <code asm> <code asm>
               ;   ah      al               ;   ah      al
Line 166: Line 166:
 </code> </code>
  
-There are special instructions which perform automatic sign extension, copying the sign bit to all higher bit positions. They can be considered as type conversion instructions. These instructions do not have any arguments as they operate on the accumulator only.+There are special instructions which perform automatic sign extension, copying the sign bit to all higher bit positions. They can be considered as type conversion instructions. These instructions have no arguments, as they operate on the accumulator only.
   * **cbw** - converts byte in al to word in ax   * **cbw** - converts byte in al to word in ax
   * **cwd** - converts word in ax to doubleword in dx:ax   * **cwd** - converts word in ax to doubleword in dx:ax
Line 174: Line 174:
   * **cqo** - convert quadword in rax to double quadword in rdx:rax   * **cqo** - convert quadword in rax to double quadword in rdx:rax
  
-Sign extension instructions work solely with the accumulator. Fortunately, there are also more universal instructions which copy and extend data at the same time+Sign-extension instructions operate solely on the accumulator. Fortunately, there are also more general instructions that copy and extend data simultaneously:
   * **movsx** - copies and sign-extends a byte to a word or doubleword or word to doubleword.   * **movsx** - copies and sign-extends a byte to a word or doubleword or word to doubleword.
   * **movzx** - copies and zero-extends a byte to a word or doubleword or word to doubleword.   * **movzx** - copies and zero-extends a byte to a word or doubleword or word to doubleword.
Line 180: Line 180:
  
 ==== Exchange instructions ==== ==== Exchange instructions ====
-The exchange instructions swap the values of operands. A single exchange instruction can replace three mov instructions while swapping the contents of two arguments, so they can be useful in optimising some algorithms. They are helpful in the implementation of semaphores, even in multiprocessor systems. +The exchange instructions swap the values of their operands. A single exchange instruction can replace three mov instructions when swapping the contents of two arguments, so it can be useful in optimising some algorithms. Exchange instructions also help implement semaphores, even in multiprocessor systems. 
-The **xchg** instruction swaps the values of two arguments. If one of the arguments is in memory, the instruction behaves as with the LOCK prefix, allowing for semaphore implementation. +The **xchg** instruction swaps the values of two arguments. If one of the arguments is in memory, the instruction behaves as if the LOCK prefix were present, allowing for semaphore implementation. 
-The **cmpxchg** has three arguments: source, destination and accumulator. It compares the destination argument with the accumulator; if they are equal, the destination argument value is replaced with the value from the source operand. It is used to test and modify semaphores. Its operation is presented in fig {{ref>instr_cmpxchg}}. In newer machines, the eight- and sixteen-byte versions were added: **cmpxchg8b** and **cmpxch16b**. They always use ECX:EBX or RCX:RBX as the source argument and EDX:EAX or RDX:RAX as the accumulator. The destination argument is in the memory.+The **cmpxchg** has three arguments: source, destination and accumulator. It compares the destination argument with the accumulator; if they are equal, the destination argument value is replaced with the value from the source operand. It is used to test and modify semaphores. Its operation is presented in figure {{ref>instr_cmpxchg}}. In newer machines, the eight- and sixteen-byte versions were added: **cmpxchg8b** and **cmpxchg16b**. They always use ECX:EBX or RCX:RBX as the source register pair and EDX:EAX or RDX:RAX as the accumulator pair. The destination argument is in memory.
 <figure instr_cmpxchg> <figure instr_cmpxchg>
 {{ :en:multiasm:cs:cmpxchg.png?500 |Illustration of cmpxchg instruction}} {{ :en:multiasm:cs:cmpxchg.png?500 |Illustration of cmpxchg instruction}}
 <caption>Explanation of cmpxchg instruction</caption> <caption>Explanation of cmpxchg instruction</caption>
 </figure> </figure>
-The **xadd** instruction exchanges two arguments, adds them, and stores the sum in a destination argument. Together with a LOCK prefix, it can be used to implement a DO loop executed by more than one processor simultaneously.+The **xadd** instruction exchanges two arguments, adds them, and stores the sum in a destination argument. When combined with a LOCK prefix, it can be used to implement a DO loop executed by multiple processors simultaneously.
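
The use of **cmpxchg** and **xadd** for synchronisation can be sketched as follows. This is only an illustrative fragment, not a complete locking implementation: it assumes that rbx holds the address of a doubleword lock variable (0 means free, 1 means taken) and that rdx holds the address of a shared doubleword counter. The label name is hypothetical.
<code asm>
acquire:
xor eax, eax             ;accumulator = expected value 0 (free)
mov ecx, 1               ;source = new value 1 (taken)
lock cmpxchg [rbx], ecx  ;if [rbx] = eax then [rbx] = ecx and ZF = 1
jnz acquire              ;ZF = 0: the lock was taken, try again

mov eax, 1
lock xadd [rdx], eax     ;eax = old counter value, [rdx] = old value + 1
</code>
The expected value is reloaded inside the loop because a failed **cmpxchg** overwrites the accumulator with the current content of the destination.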
  
 The **bswap** instruction is a single-argument instruction; it changes the order of bytes in a 32- or 64-bit register. It can be used to convert little-endian data to big-endian representation and vice versa, as shown in figure {{ref>instr_bswap}}. The **bswap** instruction is a single-argument instruction; it changes the order of bytes in a 32- or 64-bit register. It can be used to convert little-endian data to big-endian representation and vice versa, as shown in figure {{ref>instr_bswap}}.
Line 195: Line 195:
 </figure> </figure>
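As a short illustration of the byte order change performed by **bswap**, consider the following fragment (the value is chosen arbitrarily):
<code asm>
mov eax, 12345678h  ;bytes from the most significant: 12 34 56 78
bswap eax           ;eax = 78563412h
</code>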
 ==== Stack instructions ==== ==== Stack instructions ====
-A stack is a special structure in the memory that automatically stores the return address (address of the next instruction) while procedure calling (it is described in detail in the section about the **call** instruction). It is also possible to use the stack for local variables in functions, to pass arguments to procedures, and for temporal data storage. In x86 architecture, the stack is supported by hardware with the special stack pointer register. Instructions operating on the stack automatically modify the stack pointer in a way that it always points to the top of the stack. The **push** instruction decrements the stack pointer and places the data onto the stack. As a result, the stack pointer points to the last data on the stack. It is shown in figure {{ref>instr_push}}.+A stack is a special structure in memory that automatically stores the return address (the address of the next instruction) during a procedure call (it is described in detail in the section about the **call** instruction). It is also possible to use the stack for local variables in functions, for passing arguments to procedures, and for storing temporary data. In the x86 architecture, the stack is supported by hardware with the special stack pointer register. Instructions that operate on the stack automatically modify the stack pointer so that it always points to the top of the stack. The **push** instruction decrements the stack pointer and places the data onto the stack. As a result, the stack pointer points to the last data on the stack. It is shown in figure {{ref>instr_push}}.
 <figure instr_push> <figure instr_push>
 {{ :en:multiasm:cs:push.png?500 |Illustration of push instruction}} {{ :en:multiasm:cs:push.png?500 |Illustration of push instruction}}
Line 201: Line 201:
 </figure> </figure>
  
-The **pop** instruction takes data off the stack, copies it into the destination argument, and increments the stack pointer. After its execution, the stack pointer points to the previous data stored on the stack. It is shown in figure {{ref>instr_pop}}.+The **pop** instruction removes data from the stack, copies it into the destination argument, and increments the stack pointer. After its execution, the stack pointer points to the previous data stored on the stack. It is shown in figure {{ref>instr_pop}}.
 <figure instr_pop> <figure instr_pop>
 {{ :en:multiasm:cs:pop.png?500 |Illustration of pop instruction}} {{ :en:multiasm:cs:pop.png?500 |Illustration of pop instruction}}
 <caption>Explanation of pop instruction</caption> <caption>Explanation of pop instruction</caption>
 </figure> </figure>
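The behaviour of **push** and **pop** described above can be summarised in a short sketch for 64-bit mode, where each operation moves the stack pointer by eight bytes. The choice of registers is arbitrary.
<code asm>
push rax           ;rsp = rsp - 8, rax stored at [rsp]
push rbx           ;rsp = rsp - 8, rbx stored at [rsp]
pop rbx            ;rbx restored, rsp = rsp + 8
pop rax            ;rax restored, rsp = rsp + 8
</code>
Values are popped in the reverse order of pushing, because the last value pushed is always on the top of the stack.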
-There are also instructions that push or pop all eight general-purpose registers (including the stack pointer). The 16-bit registers are pushed with **pusha** and popped with **popa** instructions. For 32-bit registers, the **pushad** and **popad** instructions can be used, respectively. The order of registers on the stack is shown in figure {{ref>instr_pushadpopad}}. These instructions are not supported in 64-bit mode.+There are also instructions that push or pop all eight general-purpose registers (including the stack pointer). The 16-bit registers are pushed with the **pusha** instruction and popped with the **popa** instruction. For 32-bit registers, the **pushad** and **popad** instructions can be used, respectively. The order of registers on the stack is shown in figure {{ref>instr_pushadpopad}}. These instructions are not supported in 64-bit mode.
 <figure instr_pushadpopad> <figure instr_pushadpopad>
 {{ :en:multiasm:cs:pushadpopad.png?500 |Illustration of pushad and popad instructions}} {{ :en:multiasm:cs:pushadpopad.png?500 |Illustration of pushad and popad instructions}}
Line 213: Line 213:
  
 ===== Arithmetic instructions ===== ===== Arithmetic instructions =====
-Arithmetic instructions perform calculations on binary encoded data. It is worth noting that the processor does not distinguish between unsigned and signed values; it is the responsibility of the programming engineer to provide correct input values and properly interpret the results obtained.+Arithmetic instructions perform calculations on binary encoded data. It is worth noting that the processor does not distinguish between unsigned and signed values; it is the responsibility of the programming engineer to provide correct input values and to interpret the results properly.
 <note> <note>
-There are instructions which support decimal arithmetic, but due to the rare use of BCD numbers in modern software, they are not available in x64 mode.+Some instructions support decimal arithmetic, but because BCD numbers are rarely used in modern software, they are not available in x64 mode.
 </note> </note>
 ==== Addition and subtraction ==== ==== Addition and subtraction ====
-There are two adding instructions. The **add** adds two values from the destination and source arguments and stores the result in the destination argument. It modifies the flags in the EFLAG register according to the result. The **adc** instruction additionally adds "1" if the carry flag (CF) is set. It allows the processor to calculate the sum of the values bigger than can be encoded in a register (for example, 128-bit integers in a 64-bit processor). +There are two adding instructions. The **add** adds two values from the destination and source arguments and stores the result in the destination argument. It modifies the flags in the EFLAGS register according to the result. The **adc** instruction additionally adds "1" if the carry flag (CF) is set. It allows the processor to calculate the sum of values larger than can be encoded in a register (for example, 128-bit integers on a 64-bit processor). 
-Similarly, there are two subtraction instructions. The **sub** subtracts the source argument from the destination argument, stores the result in the destination, and modifies the flags according to the result. The **sbb** instruction calculates the difference of arguments minus "1" if the CF flag is set (here, CF plays the role of the borrow flag).+Similarly, there are two subtraction instructions. The **sub** subtracts the source argument from the destination argument, stores the result in the destination, and sets the flags accordingly. The **sbb** instruction calculates the difference of arguments minus "1" if the CF flag is set (here, CF plays the role of the borrow flag).
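
The role of the carry flag in multi-word arithmetic can be sketched as below. The fragment assumes, for illustration only, that one 128-bit value is held in the register pair rdx:rax and the other in rcx:rbx.
<code asm>
add rax, rbx       ;add lower halves, CF = carry out of bit 63
adc rdx, rcx       ;add upper halves plus CF
                   ;rdx:rax = rdx:rax + rcx:rbx

sub rax, rbx       ;subtract lower halves, CF = borrow
sbb rdx, rcx       ;subtract upper halves minus CF
                   ;rdx:rax = rdx:rax - rcx:rbx
</code>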
  
 ==== Incrementation and decrementation ==== ==== Incrementation and decrementation ====
-The **inc** instruction adds "1" to, and **dec** instruction subtracts "1" from the argument. The argument is treated as an unsigned integer.+The **inc** instruction adds "1" to, and the **dec** instruction subtracts "1" from the argument. The argument is treated as an unsigned integer.
  
 ==== Multiply ==== ==== Multiply ====
-Two multiply instructions are implemented. The **mul** is a one-argument instruction. It multiplies the content of the argument and the accumulator, treated as unsigned numbers. The size of the accumulator corresponds to the size of the argument. The result is stored in the accumulator. As the multiplication can give the result even twice as big as the input values, it is stored in a bigger accumulator size, as shown in the table {{ref>table_mul}}.+Two multiply instructions are implemented. The **mul** is a one-argument instruction. It multiplies the contents of the argument and the accumulator, treating them as unsigned numbers. The size of the accumulator corresponds to the size of the argument. The result is stored in the accumulator. Since the product can be twice as wide as the input values, it is stored in an accumulator twice the size of the argument, as shown in the table {{ref>table_mul}}.
  
 <table table_mul> <table table_mul>
Line 236: Line 236:
 </table> </table>
  
-The **imul** instruction implements the signed multiply. It can have one, two or three arguments. The single-argument version behaves the same way as the **mul** instruction. The two-argument version multiplies the 16-, 32-, or 64-bit register as the destination operand by the argument of the same size. The three-argument version multiplies the content of the source argument by the immediate and stores the result in the destination of the same size as the arguments. The destination must be the register.+The **imul** instruction performs signed multiplication. It can have one, two or three arguments. The single-argument version behaves the same way as the **mul** instruction, except that the operands are treated as signed numbers. The two-argument version multiplies the 16-, 32-, or 64-bit register as the destination operand by the argument of the same size. The three-argument version multiplies the source argument by the immediate and stores the result in the destination, which is the same size as the arguments. The destination must be a register.
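
The forms of the multiply instructions described above can be compared in a short sketch. The values and registers are chosen arbitrarily for illustration.
<code asm>
mov eax, 100000
mov ebx, 100000
mul ebx            ;edx:eax = 10000000000 = 2540BE400h
                   ;edx = 2, eax = 540BE400h

mov eax, -3
mov ebx, 7
imul ebx           ;one argument: edx:eax = eax * ebx = -21
imul ecx, ebx      ;two arguments: ecx = ecx * ebx
imul ecx, ebx, 10  ;three arguments: ecx = ebx * 10 = 70
</code>
Note that only the one-argument forms keep the full double-size result; the two- and three-argument forms store a result of the same size as the arguments.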
  
 ==== Divide ==== ==== Divide ====
-Two divide instructions are implemented. The **div** is a one-argument instruction. It divides the content of the accumulator by the argument, treated as unsigned numbers. The size of the accumulator is twice as big as the size of the argument. The result is stored as two integer values of the same size as the argument. The quotient is placed in the lower half of the accumulator, and the remainder in the higher half of the accumulator. Depending on the size of the argument, the accumulator is understood as a pair of registers DX:AX, EDX:EAX or RDX:RAX, as shown in the table {{ref>table_div}}.+Two divide instructions are implemented. The **div** is a one-argument instruction. It divides the accumulator's contents by the argument, treating both as unsigned numbers. The size of the accumulator is twice as big as the size of the argument. The result is stored as two integer values of the same size as the argument. The quotient is placed in the lower half of the accumulator, and the remainder in the higher half of the accumulator. Depending on the size of the argument, the accumulator is understood as a pair of registers DX:AX, EDX:EAX or RDX:RAX, as shown in the table {{ref>table_div}}.
  
 <table table_div> <table table_div>
Line 250: Line 250:
 </table> </table>
  
-The **idiv** instruction implements the signed divide. It behaves the same way as the **div** instruction except for the type of numbers.+The **idiv** instruction performs signed division. It behaves the same way as the **div** instruction except for the type of numbers.
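
Because the dividend is twice as wide as the divisor, its upper half must be prepared before the division. The sketch below shows one common way to do this for 32-bit divisors; the values are illustrative only.
<code asm>
mov eax, 17
xor edx, edx       ;unsigned: upper half of edx:eax cleared
mov ecx, 5
div ecx            ;eax = 3 (quotient), edx = 2 (remainder)

mov eax, -17
cdq                ;signed: eax sign-extended into edx:eax
idiv ecx           ;eax = -3 (quotient), edx = -2 (remainder)
</code>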
 ===== Logical instructions ===== ===== Logical instructions =====
 The set of logical instructions contains **and**, **or**, **xor** and **not** instructions. All of them perform bitwise Boolean operations corresponding to their names. The **not** is a single-argument instruction; others have two arguments. The set of logical instructions contains **and**, **or**, **xor** and **not** instructions. All of them perform bitwise Boolean operations corresponding to their names. The **not** is a single-argument instruction; others have two arguments.
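
Logical instructions are often used to clear, set or invert selected bits. A minimal sketch with arbitrarily chosen values:
<code asm>
mov al, 0B5h       ;al = 10110101b
and al, 0Fh        ;al = 00000101b, upper four bits cleared
or  al, 80h        ;al = 10000101b, most significant bit set
xor al, 0FFh       ;al = 01111010b, all bits inverted
not al             ;al = 10000101b, all bits inverted again
</code>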
  
 ===== Shift and rotate instructions ===== ===== Shift and rotate instructions =====
-Shift and rotate instructions treat the argument as the shift register. Each bit of the argument is moved to the neighbour position on the left or right, depending on the shift direction. The number of bit positions for the shift can be specified as a constant or in the CX register. Shift instructions can be used for multiplying (shift left) and dividing (shift right) by a power of two.+Shift and rotate instructions treat the argument as a shift register. Each bit of the argument is moved to the neighbouring position on the left or right, depending on the direction of the shift. The number of bit positions for the shift can be specified as a constant or in the CL register. Shift instructions can be used to multiply (shift left) and divide (shift right) by powers of two.
 Shift instructions have two versions: logical and arithmetical. Logical shift left **shl** and arithmetical shift left **sal** behave the same, filling the empty bits (at the LSB position) with zeros. Logical shift right **shr** fills the empty bits (at the MSB position) with zeros, while the arithmetical shift right **sar** makes a copy of the most significant bit, preserving the sign of a value. It is shown in figure {{ref>instr_shift}}. Shift instructions have two versions: logical and arithmetical. Logical shift left **shl** and arithmetical shift left **sal** behave the same, filling the empty bits (at the LSB position) with zeros. Logical shift right **shr** fills the empty bits (at the MSB position) with zeros, while the arithmetical shift right **sar** makes a copy of the most significant bit, preserving the sign of a value. It is shown in figure {{ref>instr_shift}}.
  
Line 263: Line 263:
 </figure> </figure>
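The difference between the logical and arithmetical shift right can be seen when shifting a negative value. The fragment below is a sketch with arbitrarily chosen values.
<code asm>
mov eax, 5
shl eax, 3         ;eax = 40, multiplication by 8

mov eax, -40       ;eax = 0FFFFFFD8h
sar eax, 2         ;eax = -10, sign preserved

mov eax, -40       ;eax = 0FFFFFFD8h
shr eax, 2         ;eax = 3FFFFFF6h, zeros shifted in

mov cl, 4
shl ebx, cl        ;shift count taken from cl
</code>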
  
-There are two double shift instructions which move bits from the source argument to the destination argument. The number of bits is specified as the third argument. Shift double right has **shrd** mnemonic, while shift double left has **shld** mnemonic. The operation of shift double instructions is presented in figure {{ref>instr_shiftdouble}}.+There are two double-shift instructions that move bits from the source argument to the destination argument. The number of bits is specified as the third argument. Shift double right has **shrd** mnemonic, while shift double left has **shld** mnemonic. The operation of shift double instructions is presented in figure {{ref>instr_shiftdouble}}.
  
 <figure instr_shiftdouble> <figure instr_shiftdouble>
Line 284: Line 284:
 </figure> </figure>
 ===== Bit and Byte Instructions ===== ===== Bit and Byte Instructions =====
-Bit test instruction **bt** makes a copy of the selected bit in the carry flag. The bit for testing is specified by a combination of two arguments. The first argument, named the bit base operand, holds the bit. It can be a register or a memory location. The second operand is the bit offset, which specifies the position of the bit operand. It can be a register or an immediate value. It starts counting from 0, so the least significant bit has the position 0. An example of the behaviour of the **bt** instruction is shown in figure {{ref>instr_bt}}.+Bit test instruction **bt** makes a copy of the selected bit in the carry flag. A combination of two arguments specifies the bit for testing. The first argument, called the bit-base operand, holds the bit. It can be a register or a memory location. The second operand is the bit offset, which specifies the position of the bit operand. It can be a register or an immediate value. It starts counting from 0, so the least significant bit has the position 0. An example of the behaviour of the **bt** instruction is shown in figure {{ref>instr_bt}}.
 <figure instr_bt> <figure instr_bt>
 {{ :en:multiasm:cs:bt14.png?600 |Illustration of bit test instruction}} {{ :en:multiasm:cs:bt14.png?600 |Illustration of bit test instruction}}
Line 292: Line 292:
 Bit test and modify instructions first make a copy of the selected bit, and next modify the original bit value with the one specified by the instruction. The **bts** sets the bit to one, **btr** clears the bit (resets to zero value), **btc** changes the state of the bit to the opposite (complements). Bit test and modify instructions first make a copy of the selected bit, and next modify the original bit value with the one specified by the instruction. The **bts** sets the bit to one, **btr** clears the bit (resets to zero value), **btc** changes the state of the bit to the opposite (complements).
  
-The bit scan instructions search for the first occurrence of the bit of the value 1. The bit scan forward **bsf** scans starting from the least significant bit towards higher bits, bit scan reverse **bsr** starts from the most significant bit towards lower bits. Both instructions return the index of the found bit in the destination register. If there is no bit of the value 1, the zero flag is set, and the destination register value is undefined.+The bit scan instructions search for the first occurrence of a bit set to 1. The bit scan forward **bsf** scans from the least significant bit towards higher bits, and bit scan reverse **bsr** scans from the most significant bit towards lower bits. Both instructions return the index of the found bit in the destination register. If there is no bit of the value 1, the zero flag is set, and the destination register value is undefined.
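
The bit test, bit modify and bit scan instructions can be followed step by step in a short sketch. The value is chosen arbitrarily; the comments show the state after each instruction.
<code asm>
mov eax, 00101000b ;bits 3 and 5 are set
bt  eax, 3         ;CF = 1 (bit 3 is set), eax unchanged
btr eax, 3         ;CF = 1 (old bit 3), eax = 00100000b
bts eax, 0         ;CF = 0 (old bit 0), eax = 00100001b
bsf ecx, eax       ;ecx = 0, index of the lowest bit set
bsr edx, eax       ;edx = 5, index of the highest bit set
</code>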
  
-The **test** instruction performs the logical AND function without storing the result. It just modifies flags according to the result of the AND operation.+The **test** instruction performs the logical AND function without storing the result. It just modifies the flags based on the result of the AND operation.
  
-The **set//cc//** instruction sets the argument to 1 if the chosen condition is met, or clears the argument if the condition is not met. The condition can be freely chosen from the set of conditions available for other instructions, for example, **cmov//cc//**. This instruction is useful to convert the result of the operation into the Boolean representation.+The **set//cc//** instruction sets the argument to 1 if the chosen condition is met, or clears the argument if the condition is not met. The condition can be freely chosen from the set of conditions available for other instructions, for example, **cmov//cc//**. This instruction is useful for converting the result of the operation into Boolean representation.
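
A combination of **test** or **cmp** with **set//cc//** converts a condition into a Boolean value without any jump. A minimal sketch, with registers and values chosen for illustration:
<code asm>
mov eax, 6
test eax, 1        ;6 AND 1 = 0, so ZF = 1, eax unchanged
setz bl            ;bl = 1, the value is even

cmp eax, 10
setl cl            ;cl = 1, because 6 is less than 10 (signed)
</code>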
  
 The **popcnt** instruction counts the number of bits equal to "1" in the data. The applications of this instruction include genome mining, handwriting recognition, digital health workloads, and fast Hamming distance counts((https://patents.google.com/patent/US8214414)). The **popcnt** instruction counts the number of bits equal to "1" in the data. The applications of this instruction include genome mining, handwriting recognition, digital health workloads, and fast Hamming distance counts((https://patents.google.com/patent/US8214414)).
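
As an illustration of the Hamming distance use case, the sketch below counts the bit positions in which two 64-bit values differ. The variables ''value_a'' and ''value_b'' are hypothetical and assumed to be defined as quadwords elsewhere in the program.
<code asm>
        mov    rax, value_a   ; load the first 64-bit value (hypothetical variable)
        xor    rax, value_b   ; bits that differ between the two values become 1
        popcnt rax, rax       ; RAX = number of differing bits (Hamming distance)
</code>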
  
-The **crc32** instruction implements the calculation of the cyclic redundancy check in hardware. The polynomial of the value 11EDC6F41h is fixed.+The **crc32** instruction implements cyclic redundancy check (CRC) computation in hardware. The polynomial of the value 11EDC6F41h is fixed.
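
A possible way to accumulate a checksum byte by byte is sketched below. The names ''buffer'' and ''buffer_len'' are hypothetical, the length is assumed to be a non-zero constant, and the initial value and final inversion follow a commonly used CRC-32C convention rather than anything required by the instruction itself.
<code asm>
        mov   eax, 0FFFFFFFFh       ; assumed initial CRC value
        lea   rsi, buffer           ; address of the data (hypothetical buffer)
        mov   ecx, buffer_len       ; number of bytes (hypothetical, non-zero)
next_byte:
        crc32 eax, byte ptr [rsi]   ; accumulate one byte into the CRC in EAX
        inc   rsi                   ; advance to the next byte
        dec   ecx
        jnz   next_byte
        not   eax                   ; assumed final inversion of the result
</code>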
  
 ===== Control transfer instructions ===== ===== Control transfer instructions =====
-Before describing the instructions used for control transfer, we will discuss how the destination address can be calculated. The destination address is the address given to the processor to make a jump to. +Before describing the instructions used for control transfer, we will discuss how to calculate the destination address. The destination address is the address the processor jumps to. 
 ==== Near and far transfer ==== ==== Near and far transfer ====
-While the segmentation is enabled, the destination address can be given as the offset only or in full logical form. If there is an offset only, the instruction modifies solely the instruction pointer, the jump is performed within the current segment and is called **near**. If the address is provided in full logical form, containing segment and offset parts, the CS and IP registers are modified. Such an instruction can perform a jump between segments and is called **far**.+While segmentation is enabled, the destination address can be specified either as an offset or in full logical form. If there is an offset only, the instruction modifies solely the instruction pointer, the jump is performed within the current segment and is called **near**. If the address is provided in full logical form, containing segment and offset parts, the CS and IP registers are modified. Such an instruction can perform a jump between segments and is called **far**.
 ==== Absolute and relative address ==== ==== Absolute and relative address ====
 An **absolute address** is given as a value specifying the destination address as the number of the byte counted from the beginning of the memory, or, if segmentation is enabled, as the offset from the beginning of the segment. A **relative address** is calculated as the difference between the current value of the instruction pointer and the absolute destination address. It is provided in the instructions as the signed number representing the distance between the current and destination addresses. If it is possible to encode the difference as an 8-bit signed value, the jump is called **short**. Usually, an assembler automatically chooses the shortest possible encoding. An **absolute address** is given as a value specifying the destination address as the number of the byte counted from the beginning of the memory, or, if segmentation is enabled, as the offset from the beginning of the segment. A **relative address** is calculated as the difference between the current value of the instruction pointer and the absolute destination address. It is provided in the instructions as the signed number representing the distance between the current and destination addresses. If it is possible to encode the difference as an 8-bit signed value, the jump is called **short**. Usually, an assembler automatically chooses the shortest possible encoding.
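
The relative address of a short jump can be worked out by hand, as in the sketch below. The starting address 1000h is an assumption chosen only for illustration, and the processor adds the displacement to the instruction pointer that already points to the next instruction.
<code asm>
; Assume the jmp instruction is placed at address 1000h.
; It is 2 bytes long, so the next instruction starts at 1002h.
; The target is 14 bytes further, at 1010h.
; relative address = 1010h - 1002h = 0Eh, which fits in an 8-bit signed value
        jmp   short target    ; encoded as EB 0E
        db    14 dup (90h)    ; 14 filler bytes (nop) that are jumped over
target:
</code>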
 ==== Conditional and unconditional control transfer ==== ==== Conditional and unconditional control transfer ====
-Conditional transfer instructions check the state of chosen flags in the Flags register and perform the jump to the specified address if the condition gives a true result. If the condition results in false, the processor goes to the next instruction in the instruction stream. Conditions are specified the same way as in **cmov//cc//** instruction as the suffix to the main mnemonic. Unconditional transfer instructions are always executed the same way. They jump to the specified address without any condition checking.+Conditional transfer instructions check the state of the chosen flags in the Flags register and perform a jump to the specified address if the condition evaluates to true. If the condition evaluates to false, the processor proceeds to the next instruction in the instruction stream. Conditions are specified the same way as in the **cmov//cc//** instruction, as a suffix to the main mnemonic. Unconditional transfer instructions are always executed the same way. They jump to the specified address without any condition checking.
 ==== Unconditional control transfer instructions ==== ==== Unconditional control transfer instructions ====
-Unconditional control transfer instructions perform the jump to the new address to change the program flow.  +Unconditional control-transfer instructions perform a jump to a new address, changing the program flow.  
-The **jmp** instruction jumps to a destination address by putting the destination address in the instruction pointer register. If segmentation is enabled and the destination address is placed in another segment than the current one, it also modifies the CS register. +The **jmp** instruction jumps to a destination address by putting the destination address in the instruction pointer register. If segmentation is enabled and the destination address is in a different segment than the current one, it also modifies the CS register. 
-The **call** instruction is designed to handle subroutines. It also jumps to a destination address, but before putting the new value into the instruction pointer, it pushes the returning address onto the stack. The returning address is the address of the next instruction after the call. This allows the processor to use the returning address later to get back from the subroutine to the main program. +The **call** instruction is designed to handle subroutines. It also jumps to a destination address, but before setting the instruction pointer to the new value, it pushes the return address onto the stack. The return address is the address of the next instruction after the call. This allows the processor to use the return address later to return from the subroutine to the main program. 
-The **ret** instruction forms a pair with the **call**. It uses the information stored on the stack to return from a subroutine.+The **ret** instruction pairs with the **call**. It uses the information stored on the stack to return from a subroutine.
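
A minimal sketch of a **call** and **ret** pair is given below. The procedure name ''double_it'' and the use of RCX for the argument and EAX for the result are assumptions made only for this illustration.
<code asm>
        mov   ecx, 5              ; argument for the subroutine
        call  double_it           ; pushes the address of the next instruction and jumps
        ; execution resumes here after ret, with EAX = 10

double_it proc
        lea   eax, [rcx + rcx]    ; EAX = 2 * argument
        ret                       ; pops the return address into the instruction pointer
double_it endp
</code>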
 The process of calling a procedure and returning to the main program is shown in figure {{ref>procedure_call}}. The process of calling a procedure and returning to the main program is shown in figure {{ref>procedure_call}}.
 <figure procedure_call> <figure procedure_call>
Line 324: Line 324:
 </note> </note>
 ==== Interrupts ==== ==== Interrupts ====
-An interrupt mechanism in x86 works with hardware-signalled interrupts or with special interrupt instructions. Return from an interrupt is performed by executing the **iret** instruction. In 32 and 64-bit architectures, the mnemonic for this instruction is **iretd**. The **iret** instruction differs from the **ret** instruction with popping of the stack not only the return address but also the content of the Flags register. This keeps the content of this register unmodified after return, and additionally prevents unintentional blocking following interrupts. +An interrupt mechanism in x86 works with hardware-signalled interrupts or with special interrupt instructions. Return from an interrupt is performed by executing the **iret** instruction. With a 32-bit operand size, the mnemonic for this instruction is **iretd**, and with a 64-bit operand size, it is **iretq**. The **iret** instruction differs from the **ret** instruction in that it pops not only the return address but also the contents of the Flags register from the stack. This keeps the content of this register unmodified upon return and prevents unintentional blocking of subsequent interrupts. 
-The process of interrupt handler calling and returning to the main program is shown in figure {{ref>interrupt_x86}}.+The process of an interrupt handler being called and returning to the main program is shown in figure {{ref>interrupt_x86}}.
 <figure interrupt_x86> <figure interrupt_x86>
 {{ :en:multiasm:cs:interrupt_x86.png?550 |Illustration of interrupt signalling and return from the handler}} {{ :en:multiasm:cs:interrupt_x86.png?550 |Illustration of interrupt signalling and return from the handler}}
 <caption>Illustration of interrupt signalling and return from the handler</caption> <caption>Illustration of interrupt signalling and return from the handler</caption>
 </figure> </figure>
-Software interrupts are handled the same way as signalled by the hardware. The **int** instruction signals the interrupt of a given number. There are also some special interrupt instructions. The **int1** and **int3** are one-byte special machine codes used for debugging, **into** signals a software overflow exception if the OF flag is set, and **bound** raises the bound range exceeded exception (int 5) when the tested value is over or under the defined bounds. The last two instructions are not valid in 64-bit mode.+Software interrupts are handled the same way as hardware-signalled interrupts. The **int** instruction signals the interrupt of a given number. There are also some special interrupt instructions. The **int1** and **int3** are one-byte special machine codes used for debugging, **into** signals a software overflow exception if the OF flag is set, and **bound** raises the bound range exceeded exception (int 5) when the tested value is over or under the defined bounds. The last two instructions are not valid in 64-bit mode.
  
 <note> <note>
-In 32 and 64-bit operating systems, the interrupts are handled by the OS and called through the interrupt descriptors, called gates.+In 32- and 64-bit operating systems, interrupts are handled by the OS and accessed through interrupt descriptors, also called gates.
 </note> </note>
  
 ==== Conditional control transfer instructions ==== ==== Conditional control transfer instructions ====
-The **j//cc//** instructions are used to test the state of flags and perform the jump to the destination address if the condition is met. In modern pipelined processors, it is recommended to avoid using conditional jumps if possible, ensuring that the program flows continuously, without the need to invalidate the pipeline. It is important to remember that flags are modified as a result of executing the arithmetic or logic instruction, but not the **mov** instruction. For example, if we need to test if some variable is zero, we can write such code:+The **j//cc//** instructions are used to test the state of the flags and to perform a jump to the destination address if the condition is met. In modern pipelined processors, it is recommended to avoid conditional jumps whenever possible to ensure the program flows continuously without invalidating the pipeline. It is important to remember that flags are modified as a result of executing an arithmetic or logic instruction, but not the **mov** instruction. For example, if we need to test whether a variable is zero, we can write the following code:
 <code asm> <code asm>
 cmp var1, 0     ;compare variable cmp var1, 0     ;compare variable
Line 353: Line 353:
 ==== Loop instructions ==== ==== Loop instructions ====
 The **loop** instruction is used to implement a loop, which is executed a known number of times. The number of iterations should be set before a loop in the counter register (CX/ECX/RCX). The **loop** instruction automatically decrements the counter register, checks if it reaches zero and if not jumps to the address, which is the argument of the instruction and is assumed as the beginning address of a loop. If the counter reaches zero, the **loop** instruction goes further to the next instruction in a stream. The **loop** instruction is used to implement a loop, which is executed a known number of times. The number of iterations should be set before a loop in the counter register (CX/ECX/RCX). The **loop** instruction automatically decrements the counter register, checks if it reaches zero and if not jumps to the address, which is the argument of the instruction and is assumed as the beginning address of a loop. If the counter reaches zero, the **loop** instruction goes further to the next instruction in a stream.
-There are also conditional versions of the **loop** instruction, which allow finishing the iteration process before the counter reaches zero. The **loope** or **loopz** instructions continue the iteration if the counter is above zero and the zero flag (ZF) is set. The **loopne** or **loopnz** continue iteration if the counter is above zero and the zero flag (ZF) is cleared. +There are also conditional versions of the **loop** instruction that allow the iteration to finish before the counter reaches zero. The **loope** or **loopz** instructions continue the iteration if the counter is above zero and the zero flag (ZF) is set. The **loopne** or **loopnz** continue the iteration if the counter is greater than zero and the zero flag (ZF) is cleared. 
-The **loop** instruction can cause the system to iterate many times if the counter register is zero before entering the loop. As the first step is the decrementing of the counter, it will result in a value composed of all "1". For CX, the loop will be executed 65536 times, for ECX more than 4 billion times and for RCX 184 quintillion 466 quadrillion 744 trillion 73 billion 709 million 551 thousand and 616 times! Understandably, we should avoid such a situation. The **jcxz**, **jecxz** and **jrcxz** instructions can help to jump over the entire loop if the counter register is zero at the beginning, as in the following code.+The **loop** instruction can cause the system to iterate many times if the counter register is zero before entering the loop. Since the first step is to decrement the counter, the result will be a value composed entirely of "1" bits. For CX, the loop will be executed 65536 times; for ECX, more than 4 billion times; and for RCX, 18 quintillion 446 quadrillion 744 trillion 73 billion 709 million 551 thousand and 616 times! Understandably, we should avoid such a situation. The **jcxz**, **jecxz** and **jrcxz** instructions can help to jump over the entire loop if the counter register is zero at the beginning, as in the following code.
  
 <code asm> <code asm>
Line 369: Line 369:
  
 <note> <note>
-According to the information found on the Internet, the **loop** instructions are not optimised for modern pipelined processors, and are often replaced with compare and conditional jump instructions.+The **loop** instructions are reported to be slow on many modern pipelined processors and are often replaced with a decrement or compare instruction followed by a conditional jump.
 </note> </note>
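
A common replacement for the **loop** instruction is sketched below. The labels ''again'' and ''finished'' and the iteration count are assumptions made only for illustration.
<code asm>
        mov   ecx, 10         ; number of iterations
        test  ecx, ecx        ; guard against a zero counter
        jz    finished        ; skip the loop entirely if the counter is zero
again:
        ; loop body goes here
        dec   ecx             ; decrement the counter, sets ZF when it reaches zero
        jnz   again           ; repeat while the counter is not zero
finished:
</code>
Unlike **loop**, this form works with any general-purpose register as the counter, not only CX/ECX/RCX.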
  
 ===== String Instructions ===== ===== String Instructions =====
-String instructions are developed to perform operations on elements of data tables, including text strings. These instructions can access two elements in memory source and destination. If segmentation is enabled, the source operand is identified with SI/ESI and placed always in the data segment (DS), the destination operand is identified with DI/EDI and stored in the extended data segment (ES). In 64-bit mode, the source operand is identified with RSI, and the destination operand is identified with RDI. They can operate on bytes, words, doublewords or quadwords. The size of the element is specified as the suffix of the instruction or derived from the size of the arguments specified in the instruction. +String instructions are developed to perform operations on elements of data tables, including text strings. These instructions can access two memory locations: the source and the destination. If segmentation is enabled, the source operand is identified with SI/ESI and is always placed in the data segment (DS), while the destination operand is identified with DI/EDI and is stored in the extra segment (ES). In 64-bit mode, the source operand is identified with RSI, and the destination operand is identified with RDI. They can operate on bytes, words, doublewords, or quadwords. The size of the element is specified as the suffix of the instruction or derived from the size of the arguments specified in the instruction. 
  
 ==== String copy ==== ==== String copy ====
-The **movs** instruction copies the element of the source string to the destination string. It requires two arguments of the size of bytes, words, doublewords or quadwords.+The **movs** instruction copies the element of the source string to the destination string. It requires two arguments of the same size: bytes, words, doublewords, or quadwords.
 The **movsb** instruction copies a byte from the source string to the destination string. The **movsb** instruction copies a byte from the source string to the destination string.
 The **movsw** instruction copies a word from the source string to the destination string. The **movsw** instruction copies a word from the source string to the destination string.
Line 382: Line 382:
 The **movsq** instruction copies a quadword from the source string to the destination string. The **movsq** instruction copies a quadword from the source string to the destination string.
 <note> <note>
-The locations of the source and destination operands are always accessed with the use of the source and destination index registers, which must be loaded correctly before the string instruction is executed. Arguments, if present, are used to determine the size of the element only.+The locations of the source and destination operands are always accessed using the source and destination index registers, which must be loaded correctly before the string instruction is executed. Arguments, if present, are used to determine the size of the element only.
 </note> </note>
  
 ==== Store string ==== ==== Store string ====
-These instructions store the content of the accumulator to the destination operand.+These instructions store the accumulator's contents into the destination operand.
 The **stos** instruction copies the content of the accumulator to the destination string. It requires one argument of the size of byte, word, doubleword or quadword. The **stos** instruction copies the content of the accumulator to the destination string. It requires one argument of the size of byte, word, doubleword or quadword.
 The **stosb** instruction copies a byte from the AL to the destination string. The **stosb** instruction copies a byte from the AL to the destination string.
Line 402: Line 402:
 ==== String compare ==== ==== String compare ====
 Strings can be compared, which means that the element of the destination string is compared with the element of the source string. These instructions set the status flags in the flags register according to the result of the comparison. The elements of both strings remain unchanged. Strings can be compared, which means that the element of the destination string is compared with the element of the source string. These instructions set the status flags in the flags register according to the result of the comparison. The elements of both strings remain unchanged.
-The **cmps** instruction compares the element of a source string with the element of the destination string. It requires two arguments, which specify the size of the data elements.+The **cmps** instruction compares the element of a source string with the element of the destination string. It requires two arguments that specify the sizes of the data elements.
 The **cmpsb** instruction compares a byte from the source string with a byte from the destination string. The **cmpsb** instruction compares a byte from the source string with a byte from the destination string.
 The **cmpsw** instruction compares a word from the source string with a word from the destination string. The **cmpsw** instruction compares a word from the source string with a word from the destination string.
Line 417: Line 417:
  
 ==== Repeated string instructions ==== ==== Repeated string instructions ====
-All string instructions can be preceded by the repetition prefix to automate the processing of multiple-element tables. Use of the prefix enables the instructions to automatically repeat the instruction execution according to the content of the counter register and modify the source and destination addresses in index registers, accordingly to the size of the element. Index registers can be incremented or decremented depending on the direction flag (DF) state. If DF is "0", the addresses are incremented; if DF is "1" addresses are decremented. While the string element's size is a byte, the addresses are modified by 1. For words, the addresses are modified by 2, for doublewords by 4, and for quadwords by 8.+All string instructions can be preceded by the repetition prefix to automate the processing of multiple-element tables. Use of the prefix enables the instruction to automatically repeat execution according to the content of the counter register and modify the source and destination addresses in the index registers according to the element size. Index registers can be incremented or decremented depending on the direction flag (DF) state. If DF is "0", the addresses are incremented; if DF is "1", the addresses are decremented. While the string element's size is a byte, the addresses are modified by 1. For words, the addresses are modified by 2, for doublewords by 4, and for quadwords by 8.
 The **rep** prefix allows block copying, storing and loading of an entire string rather than a single element. The **rep** prefix allows block copying, storing and loading of an entire string rather than a single element.
-The use of repeated string instructions enables copying the entire string from one place in memory to another, or filling up the memory regions with a pattern.+The use of repeated string instructions enables copying an entire string from one place in memory to another or filling memory regions with a pattern.
  
 The **repe** or **repz** prefixes additionally test the zero flag and continue only while it is "1", which ends a string scan or comparison early at the first mismatch.  The **repe** or **repz** prefixes additionally test the zero flag and continue only while it is "1", which ends a string scan or comparison early at the first mismatch. 
-The **repne** or **repnz** prefixes test if the zero flag is "0" to stop the iteration throughout the string. +The **repne** or **repnz** prefixes continue only while the zero flag is "0", which stops the iteration at the first match. 
 The conditional prefixes are intended to be used with **scas** or **cmps** instructions. The conditional prefixes are intended to be used with **scas** or **cmps** instructions.
-The use of repeated string instructions with conditional prefixes enables string comparison for equality or differences, or to find the element in a string.+The use of repeated string instructions with conditional prefixes enables comparing strings for equality or difference, or finding an element in a string.
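
The two typical uses of the repetition prefixes can be sketched as follows, assuming 64-bit mode. The buffers ''source_buf'' and ''dest_buf'', their length of 16 bytes, and the label ''differ'' are hypothetical and introduced only for illustration.
<code asm>
        ; Block copy of 16 bytes
        cld                       ; DF = 0, index registers are incremented
        lea   rsi, source_buf     ; source address (hypothetical buffer)
        lea   rdi, dest_buf       ; destination address (hypothetical buffer)
        mov   rcx, 16             ; number of elements to process
        rep   movsb               ; copy bytes until RCX reaches zero

        ; Comparison of the same two 16-byte blocks
        cld
        lea   rsi, source_buf
        lea   rdi, dest_buf
        mov   rcx, 16
        repe  cmpsb               ; continue while the compared bytes are equal
        jne   differ              ; ZF = 0 means a mismatch was found
        ; the blocks are equal here
differ:
</code>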
  
 To properly use the repeated string instructions, follow these steps: To properly use the repeated string instructions, follow these steps:
Line 434: Line 434:
 ===== I/O Instructions ===== ===== I/O Instructions =====
 These instructions allow the processor to transfer data between the accumulator register and a peripheral device. These instructions allow the processor to transfer data between the accumulator register and a peripheral device.
-A peripheral device can be addressed directly or indirectly. Direct addressing uses an 8-bit constant as the peripheral address (named in x86 I/O port), and it accesses only the first 256 port addresses. Indirect addressing uses the DX register as the address register, enabling access to the entire I/O address space of 65536 addresses. +A peripheral device can be addressed directly or indirectly. Direct addressing uses an 8-bit constant as the peripheral address (also called an I/O port in x86), and it accesses only the first 256 port addresses. Indirect addressing uses the DX register as the address register, enabling access to the entire I/O address space of 65536 addresses. 
-The **in** instruction reads data from a port to the accumulator. The **out** instruction writes the data from the accumulator to the port. The size of the accumulator determines the size of the data to be transferred. It can be AL, AX or EAX.+The **in** instruction reads data from a port to the accumulator. The **out** instruction writes the data from the accumulator to the port. The accumulator size determines the amount of data to be transferred. It can be AL, AX or EAX.
 The I/O instructions also have string versions. Instructions to read the port to a string are **ins**, **insb**, **insw**, and **insd**. Instructions to write a string to a port are **outs**, **outsb**, **outsw**, and **outsd**. In all string I/O instructions, the port is addressed with the DX register. Rules for addressing the memory are the same as in string instructions. The I/O instructions also have string versions. Instructions to read the port to a string are **ins**, **insb**, **insw**, and **insd**. Instructions to write a string to a port are **outs**, **outsb**, **outsw**, and **outsd**. In all string I/O instructions, the port is addressed with the DX register. Rules for addressing the memory are the same as in string instructions.
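
Both addressing forms are shown in the short sketch below. The port numbers 60h and 3F8h are used only as illustrative values, and we assume the code runs at a privilege level that permits I/O, which is usually not the case for application code under a modern operating system.
<code asm>
        in    al, 60h         ; direct addressing, 8-bit port number in the instruction
        mov   dx, 3F8h        ; port number above 255 must be placed in DX
        out   dx, al          ; indirect addressing, write AL to the port given by DX
</code>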
 ===== Enter and Leave Instructions ===== ===== Enter and Leave Instructions =====
-Enter instruction creates the stack frame for the function. The stack frame is a place on the stack reserved for the function to store arguments and local variables. Traditionally, we access the stack frame with the use of the RBP register, but we need to preserve its content before use. The **enter** instruction can be nested or non-nested. Not-nested saves the RBP on the stack, copies the stack pointer value to RBP, and adjusts the stack pointer with the constant value, which is the first operand of the instruction. After these steps, the RSP points to the top of the stack frame, and the RBP points to the stack base. The nested version creates the path to the higher-level functions' stack frames by adding their momentary value of RBP. The **leave** instruction reverses what **enter** did at the end of the function. The **enter** should be placed at the very beginning of the function, while the **leave** just before **ret**.+The **enter** instruction creates the stack frame for the function. The stack frame is a region of the stack reserved for the function to store arguments and local variables. Traditionally, we access the stack frame using the RBP register, but we need to preserve its contents before use. The **enter** instruction can be nested or non-nested. The non-nested version saves the RBP on the stack, copies the stack pointer value to RBP, and decreases the stack pointer by the constant value, which is the first operand of the instruction. After these steps, the RSP points to the top of the stack frame, and the RBP points to the frame base. The nested version creates the path to the higher-level functions' stack frames by copying their frame pointers (RBP values) into the new frame. The **leave** instruction reverses what **enter** did at the end of the function. The **enter** should be placed at the very beginning of the function, while the **leave** just before **ret**.
 <note> <note>
-According to the information on compiler behaviour, the **enter** instruction is never used by compilers, while the **leave** instruction is rarely, but sometimes used.+According to information on compiler behaviour, the **enter** instruction is never used by compilers, while the **leave** instruction is rarely used.
 </note> </note>
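
The sequence typically used in place of the non-nested **enter** can be sketched as follows. The procedure name ''my_func'' and the frame size of 32 bytes are assumptions made only for illustration.
<code asm>
my_func proc
        push  rbp             ; preserve the caller's frame pointer
        mov   rbp, rsp        ; RBP now points to the frame base
        sub   rsp, 32         ; reserve 32 bytes; the three steps match enter 32, 0
        ; function body, local variables are addressed relative to RBP
        leave                 ; equivalent to mov rsp, rbp followed by pop rbp
        ret
my_func endp
</code>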
 ===== Flag Control Instructions ===== ===== Flag Control Instructions =====
-Flag control instructions are typically used to set or clear the chosen flag in the RFLAGS register. We can only control three flags directly. The carry (CF) flag can be used in conjunction with the rotate-with-carry instructions to convert the series of bits into a binary-encoded value. The direction (DF) flag determines the direction of modification of index registers RSI and RDI when executing string instructions. If the DF flag is clear, the index registers are incremented; if the DF flag is set, the registers are decremented after each iteration of a string instruction. The interrupt (IF) flag enables or disables hardware interrupts. If the IF flag is set, the hardware interrupts are enabled; if the IF flag is clear, hardware interrupts are masked.+Flag control instructions are typically used to set or clear the chosen flag in the RFLAGS register. We can only control three flags directly. The carry (CF) flag can be used in conjunction with the rotate-with-carry instructions to convert the series of bits into a binary-encoded value. The direction (DF) flag determines the direction in which index registers RSI and RDI are modified when executing string instructions. If the DF flag is clear, the index registers are incremented; if the DF flag is set, the registers are decremented after each iteration of a string instruction. The interrupt (IF) flag enables or disables hardware interrupts. If the IF flag is set, the hardware interrupts are enabled; if the IF flag is clear, hardware interrupts are masked.
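
As a sketch of the carry flag use mentioned above, the fragment below builds the binary value 101 in EAX, one bit at a time. In a real program, the carry flag would usually come from a previous instruction rather than from **stc** and **clc**.
<code asm>
        xor   eax, eax        ; EAX = 0
        stc                   ; CF = 1
        rcl   eax, 1          ; shift CF into bit 0, EAX = 1
        clc                   ; CF = 0
        rcl   eax, 1          ; EAX = 2 (binary 10)
        stc                   ; CF = 1
        rcl   eax, 1          ; EAX = 5 (binary 101)
</code>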
 The summary of instructions is shown in the table {{ref>table_flags_instructions}}. The summary of instructions is shown in the table {{ref>table_flags_instructions}}.
 <table table_flags_instructions> <table table_flags_instructions>
Line 463: Line 463:
  
 ===== Segment Register Instructions ===== ===== Segment Register Instructions =====
-Segment register instructions are used to load a far pointer to a pair of registers. One of the pair is the segment, which is determined by the instruction; another is the offset and appears as the destination argument. The source argument is the far pointer stored in the memory. These instructions include **lds** – load far pointer using DS, **les** – load far pointer using ES, **lfs** – load far pointer using FS, **lgs** – load far pointer using GS, and **lss** – load far pointer using SS.  +Segment register instructions load a far pointer into a pair of registers. One of the pair is the segment, which is determined by the instruction; the other is the offset, which appears as the destination argument. The source argument is the far pointer stored in the memory. These instructions include **lds** – load far pointer using DS, **les** – load far pointer using ES, **lfs** – load far pointer using FS, **lgs** – load far pointer using GS, and **lss** – load far pointer using SS.  
-The following example shows loading far pointer in 16-bit mode.+The following example shows how to load a far pointer in 16-bit mode.
 <code asm> <code asm>
 ; Load far pointer to DS:BX ; Load far pointer to DS:BX
Line 481: Line 481:
 ===== Miscellaneous instructions ===== ===== Miscellaneous instructions =====
 ==== No operation ==== ==== No operation ====
-The **nop** instruction performs no operation. The only result is incrementaion of the instruction pointer. In real, it is an alias to the instruction **xchg eax, eax**.+The **nop** instruction performs no operation. The only result is the incrementation of the instruction pointer. In reality, it is an alias to the instruction **xchg eax, eax**.
 <code asm> <code asm>
 nop             ;encoded as 0x90 nop             ;encoded as 0x90
Line 488: Line 488:
  
 ==== Load effective address ==== ==== Load effective address ====
-The **lea** instruction calculates the effective address as the result of the proper address expression and stores the result in a destination operand. We can store the effective address in a single register to avoid complex address calculation inside a loop, like in the following example.+The **lea** instruction calculates the effective address as the result of the proper address expression and stores the result in a destination operand. We can store the effective address in a single register to avoid complex address calculations within a loop, as in the following example.
 <code asm> <code asm>
 ; Load effective address to BX ; Load effective address to BX
Line 503: Line 503:
 </code> </code>
 <note> <note>
-Because the **lea** instruction adds source arguments, it is sometimes used instead of the **add** instruction.+Because the **lea** instruction computes its result by adding the components of the address expression, without accessing memory or changing the flags, it is sometimes used instead of the **add** instruction.
 </note> </note>
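
As a minimal sketch of the note above, the fragment below uses **lea** as a three-operand addition. The register choice is arbitrary and only illustrative; the point is that the sum lands in a separate destination register and the flags are left untouched.
<code asm>
; Illustrative fragment, registers chosen arbitrarily
mov  ebx, 100               ; base value
mov  ecx, 5                 ; index value
lea  eax, [ebx + ecx*4 + 8] ; EAX = 100 + 5*4 + 8 = 128, no memory access, flags unchanged
</code>
An equivalent sequence with **add** would need a copy, a shift and two additions, and it would overwrite the flags.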
  
 ==== Undefined instructions ==== ==== Undefined instructions ====
-The undefined instructions can be used to test the behaviour of the system software in case of the appearance of an unknown opcode in the instruction stream. The **ud** and **ud1** instructions can have a source operand (register or memory address) and a destination operand (register). Operands are not used. The **ud2** instruction does not have an operand. Executing any undefined instruction results in an invalid opcode exception (#UD) throw.+The undefined instructions can be used to test the behaviour of the system software when an unknown opcode appears in the instruction stream. The **ud0** and **ud1** instructions can have a source operand (register or memory address) and a destination operand (register). The operands are not used. The **ud2** instruction has no operand. Executing any undefined instruction raises an invalid opcode exception (#UD).
  
 ==== Table lookup ==== ==== Table lookup ====
-The **xlatb** instruction copies the byte from a table into the AL register. The byte is addressed as the sum of the BX/EX/RBX and AL registers. There is also an **xlat** version, which enables specifying the address in the memory as the argument. It can be somewhat misleading because the argument is never used by the processor. This instruction can be used to implement the conversion from a 4-digit binary value into a hexadecimal digit, as in the following code.+The **xlatb** instruction copies a byte from a table into the AL register. The byte is addressed as the sum of the BX/EBX/RBX and AL registers. There is also an **xlat** version, which allows specifying the memory address as the argument. It can be somewhat misleading because the argument is never used by the processor. This instruction can be used to convert a 4-bit binary value to a hexadecimal digit, as shown in the following code.
  
 <code asm> <code asm>
Line 524: Line 524:
 </code> </code>
 ==== Processor identification ==== ==== Processor identification ====
-The **cpuid** instruction provides processor identification information. It operates similarly to the function, with the input value sent via an accumulator (EAX). Depending on the EAX value gives different information about the processor. The requested information is returned in processor registers. For example, if EAX is zero, it returns the vendor information string"GenuineIntel" for Intel processors"AuthenticAMD" for AMD models in ECX, EDX and EBX registers. It is shown in figure {{ref>cpuid_vendor}}.+The **cpuid** instruction provides processor identification information. It operates similarly to a function call, with the input value passed in the accumulator (EAX). Depending on the EAX value, the processor provides different information. The requested information is returned in processor registers. For example, if EAX is zero, it returns the vendor information string, "GenuineIntel" for Intel processors and "AuthenticAMD" for AMD models, in the EBX, EDX, and ECX registers (in that order). It is shown in figure {{ref>cpuid_vendor}}.
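
The vendor string query described above can be sketched as follows. The label ''vendor'' is a hypothetical 12-byte buffer introduced only for this illustration; the string is assembled from EBX, EDX and ECX, in that order. Note that **cpuid** overwrites EAX, EBX, ECX and EDX, so a caller that needs those values has to preserve them.
<code asm>
; Illustrative fragment: read the 12-character vendor string
; "vendor" is an assumed 12-byte buffer defined elsewhere
xor   eax, eax                     ; EAX = 0 selects the vendor string
cpuid                              ; EBX, EDX, ECX now hold the text
mov   dword ptr [vendor], ebx      ; characters 0-3, e.g. "Genu"
mov   dword ptr [vendor + 4], edx  ; characters 4-7, e.g. "ineI"
mov   dword ptr [vendor + 8], ecx  ; characters 8-11, e.g. "ntel"
</code>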
  
 <figure cpuid_vendor> <figure cpuid_vendor>
Line 535: Line 535:
  
 ==== Cache manipulating instructions ==== ==== Cache manipulating instructions ====
-Cache memory is managed by the processor, and usually, its decisions keep the performance of software execution at a good level. However, the processor offers instructions that allow the programmer to send hints to the cache management mechanism and prefetch data in advance of using it (**prefetchw**, **prefetchwt1**) and to synchronise the cache and memory and flush the cache line to make it available for other data (**clflush**, **clflushopt**). There are also additional instructions implemented for cache management introduced together with multimedia and vector extensions.+Cache memory is managed by the processor, and its decisions usually keep software execution performance at a good level. However, the processor offers instructions that allow the programmer to send hints to the cache management mechanism and prefetch data before it is used (**prefetchw**, **prefetchwt1**), and to synchronise the cache with memory and flush a cache line to make it available for other data (**clflush**, **clflushopt**). There are also additional instructions for cache management, introduced together with the multimedia and vector extensions.
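
A minimal sketch of how these hints might be combined is shown below. The label ''buffer'' is a hypothetical data area, and the fragment assumes a 64-bit processor that supports **prefetchw**; whether such hints actually improve performance depends on the processor and the access pattern.
<code asm>
; Illustrative fragment, "buffer" is an assumed data area
lea        rsi, [buffer]       ; address of the data
prefetchw  byte ptr [rsi]      ; hint: this cache line will be written soon
; ... other work while the line is being fetched ...
mov        dword ptr [rsi], 1  ; the store can now hit the cache
clflush    byte ptr [rsi]      ; write the line back to memory and invalidate it
mfence                         ; order the flush with respect to later memory accesses
</code>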
 ===== User Mode Extended State Save/Restore Instructions ===== ===== User Mode Extended State Save/Restore Instructions =====
-Some instructions allow for saving and restoring the state of several units of the processor. They are intended to help processors in fast context switching between processes and to be used instead of saving each register separately at the beginning of a subroutine and restoring it at the end. The content of registers is stored in memory pointed by EDX:EAX registers. Instructions for saving the state are **xsave**, **xsavec**, and **xsaveopt**. Instructions for restoring the state are **xrstor** and **xgetbv**.+Some instructions allow saving and restoring the state of several processor units. They are intended to help with fast context switching between processes and to replace saving each register separately at the beginning of a subroutine and restoring it at the end. The register contents are stored in a memory area given as the instruction operand; the EDX:EAX register pair holds a bit mask that selects which state components are saved or restored. Instructions for saving the state are **xsave**, **xsavec**, and **xsaveopt**. The instruction for restoring the state is **xrstor**. The related **xgetbv** instruction reads an extended control register, such as XCR0, which reports the state components enabled by the operating system.
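
The following fragment is a sketch of a save and restore sequence, not a complete context switch. It assumes that the operating system has enabled these instructions and that ''save_area'' (a hypothetical label) is a zero-initialised, 64-byte-aligned memory block large enough for all enabled components. In these instructions, EDX:EAX carries the component bit mask, and the memory area is the explicit operand.
<code asm>
; Illustrative fragment, "save_area" is an assumed zeroed, 64-byte-aligned block
xor     ecx, ecx       ; ECX = 0 selects the XCR0 register
xgetbv                 ; EDX:EAX = XCR0, the mask of enabled state components
xsave   [save_area]    ; save every component selected by EDX:EAX
; ... code that modifies the saved state ...
xor     ecx, ecx
xgetbv                 ; rebuild the same mask in EDX:EAX
xrstor  [save_area]    ; restore the selected components
</code>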
  
 ===== Random Number Generator Instructions ===== ===== Random Number Generator Instructions =====
Line 543: Line 543:
  
 ===== BMI1 and BMI2 Instructions ===== ===== BMI1 and BMI2 Instructions =====
-The abbreviation BMI comes from Bit Manipulation Instructions. These instructions are designed for some specific manipulation of bits in the arguments, enabling programmers to use a single instruction instead of a few.+The abbreviation BMI comes from Bit Manipulation Instructions. These instructions are designed for specific bit manipulation in the arguments, enabling programmers to use a single instruction instead of several.
 The **andn** instruction extends the group of logical instructions. It performs a bitwise AND of the inverted first source operand with the second source operand. The **andn** instruction extends the group of logical instructions. It performs a bitwise AND of the inverted first source operand with the second source operand.
-There are additional shift and rotate instructions that do not affect flags, which allows for more predictable execution without dependency on flag changes from previous operations. +There are additional shift and rotate instructions that do not affect flags, allowing for more predictable execution without relying on flag changes from previous operations. 
 These instructions are **rorx** - rotate right, **sarx** - shift arithmetic right, **shlx** - shift logical left, and **shrx** - shift logical right. These instructions are **rorx** - rotate right, **sarx** - shift arithmetic right, **shlx** - shift logical left, and **shrx** - shift logical right.
 Also, unsigned multiplication without affecting flags, **mulx**, was introduced.  Also, unsigned multiplication without affecting flags, **mulx**, was introduced. 
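
The instructions mentioned above can be illustrated with the short fragment below. It is only a sketch with arbitrarily chosen registers and values, and it assumes a processor with BMI1 and BMI2 support. Note that **andn** inverts its first source operand, and that **mulx** takes one factor implicitly from EDX.
<code asm>
; Illustrative fragment, registers and values chosen arbitrarily
mov   eax, 0F0F0h
mov   ebx, 0FF00h
andn  ecx, eax, ebx    ; ECX = (NOT EAX) AND EBX = 00000F00h
mov   esi, 4           ; shift count
shlx  edi, ebx, esi    ; EDI = EBX shifted left by 4 = 000FF000h, flags unchanged
rorx  r8d, ebx, 8      ; R8D = EBX rotated right by 8 = 000000FFh, flags unchanged
mov   edx, 10          ; implicit factor for mulx
mulx  r9d, r10d, esi   ; R9D:R10D = EDX * ESI = 40, flags unchanged
</code>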
Line 593: Line 593:
 </figure> </figure>
  
-The **pext** instruction performs parallel extraction of bits using a mask. Its behaviour is shown in figure {{ref>pext_instr}}.+The **pext** instruction performs parallel bit extraction using a mask. Its behaviour is shown in figure {{ref>pext_instr}}.
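
As a small worked example of the extraction described above, the fragment below applies two masks to the same source value. The values are chosen only for illustration: bits of the source at positions where the mask contains ones are packed into the low bits of the destination, starting from the least significant mask bit.
<code asm>
; Illustrative fragment, values chosen arbitrarily
mov   eax, 0B2h        ; source = 1011 0010b
mov   ebx, 0F0h        ; mask   = 1111 0000b, selects bits 7..4
pext  ecx, eax, ebx    ; ECX = 0000 1011b = 0Bh
mov   ebx, 0AAh        ; mask   = 1010 1010b, selects bits 7, 5, 3 and 1
pext  edx, eax, ebx    ; EDX = 0000 1101b = 0Dh
</code>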
  
 <figure pext_instr> <figure pext_instr>
en/multiasm/papc/chapter_6_7.txt · Last modified: by pczekalski
CC Attribution-Share Alike 4.0 International