--- en:multiasm:exercisebook:pc:sut:scenarios_standalone [2026/05/20 14:59] – created ktokarz
+++ en:multiasm:exercisebook:pc:sut:scenarios_standalone [2026/05/20 15:57] (current) – [Implementation of calculation functions] ktokarz
@@ Line 1: / Line 1: @@
+====== Scenarios ======
+===== Converting integers to hexadecimal text =====
+In our first scenario, we will extend the conversion library with another function that converts an integer into its hexadecimal representation. We can copy the int_to_ascii function and introduce a few simple modifications. First, we need to divide the input value by 16, not by 10.
+<code asm>
+   mov rbx, 16
+</code>
+After each division, we obtain a remainder in the range 0-15. We can't convert it into an ASCII character the same way as in the decimal case, because the digits 0-9 and the letters A-F do not form a contiguous range in the ASCII table. We can deal with this in different ways. One approach is to check whether dl is greater than 9 and, if so, shift it to point to the letter characters.
+<code asm>
+    cmp dl, 9          ; test if dl > 9
+    jna zero_to_nine   ; if not jump over adjustment
+    add dl, "A"-"9"-1  ; adjust dl with the distance between A and 9
+zero_to_nine:
+    add dl, "0"        ; convert to ASCII
+</code>
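+As a worked check of the adjustment constant, the comments below trace three remainders through the code above. The values follow directly from the ASCII codes of "0" (30h), "9" (39h) and "A" (41h).
+<code asm>
+; "A"-"9"-1 = 41h - 39h - 1 = 7
+; dl = 9  : adjustment skipped,  9 + 30h = 39h = "9"
+; dl = 10 : 10 + 7 = 17,        17 + 30h = 41h = "A"
+; dl = 15 : 15 + 7 = 22,        22 + 30h = 46h = "F"
+</code>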
+Another approach is to define a table of characters (a lookup table) in the data section, containing all digits and letters, and to pick the correct character using the **xlatb** instruction or the **mov** instruction with a proper indirect addressing mode.
+<code asm>
+.data
+hex_digits db "0123456789ABCDEF"
+.code
+...
+    lea rcx, hex_digits        ; load address of lookup table
+    and rdx, 0000000Fh         ; limit the index to the range 0-15
+    mov dl, [rcx+rdx]          ; convert remainder into ASCII
+...
+</code>
+In the second approach, we use indirect addressing in which the effective address is the sum of the rcx and rdx registers. The base address of the table must be loaded into rcx with the **lea** instruction; it cannot be used as a constant. This is because an instruction we could use in 32-bit mode:
+<code asm>
+    mov dl, hex_digits[rdx]   ; This instruction is NOT VALID in 64-bit mode
+</code>
+will signal an error when used in 64-bit long mode. The address of the lookup table is a 64-bit number, but the displacement encoded in this form of the **mov** instruction can't exceed 32 bits.
+To use the **xlatb** instruction mentioned above, we have to preserve rax before the conversion, because **xlatb** takes the index from al and returns the translated character in al. We will store rax temporarily in rcx. The rbx register also needs different handling, as **xlatb** expects the table address in it: in each iteration, we set rbx to 16 before dividing and to the lookup table address before **xlatb**.
+<code asm>
+.code
+...
+    mov rbx, 16          ; prepare divisor
+    xor rdx, rdx         ; prepare to divide rdx:rax by rbx
+    div rbx              ; rax / 16 → remainder in rdx
+    mov rcx, rax         ; store temporarily rax
+    lea rbx, hex_digits  ; load address of lookup table
+    and rdx, 0000000Fh   ; limit the range to 15
+    mov al, dl           ; prepare index in al
+    xlatb                ; convert remainder into ASCII
+    mov [rdi], al        ; put character to resulting table
+    mov rax, rcx         ; restore rax
+</code>
+To improve the performance of our code, in the case of hexadecimal numbers, it is possible to replace the time-consuming division instruction with an instruction to shift the number by four bit positions right. We leave the implementation of this optimisation to the reader.
+===== Converting floating point values to text =====
+As the second scenario, we will add to our library a function for displaying floating-point values. This function will allow us to display the results of calculations we implement in further scenarios. According to x64 Windows ABI rules, floating-point values should be passed through XMM registers. We will display a single value, so we'll use the XMM0 register.
+Displaying a floating-point number is a much more complex task than displaying an integer. We will split it into a conversion of the fractional part and a conversion of the integer part. First, we'll copy the argument from XMM0 into XMM1 to keep the original value unchanged.
+Let's start with a check to see if the value is positive or negative. Floating-point numbers are stored as absolute values, with the sign bit in the most significant position, so the encodings of a positive and a negative number with the same absolute value differ only in the sign bit. To test whether a number is negative, we can use the **movmskps** instruction, which copies the sign bits of all elements of a vector into the lowest bits of the destination register. As our argument is a scalar, the bit we're interested in is at the lowest position. Rotating the register one position to the right with **rcr** moves this bit into the carry flag, on which we can execute a conditional jump. If the argument is negative, we'll change it into a positive one by clearing the sign bit. The **andps** instruction with the **clear_sign_bit** variable clears this one bit in the XMM1 register.
+<code asm>
+.data
+align 16   ; andps requires a 16-byte aligned memory operand
+clear_sign_bit dword 07FFFFFFFh, 0FFFFFFFFh, 0FFFFFFFFh, 0FFFFFFFFh
+...
+.code
+...
+; test if the number is positive or negative
+    movq xmm1, xmm0
+    movmskps rax, xmm1
+    rcr rax, 1
+    jnc float_positive
+; negative value: clear the sign bit of the scalar
+    andps xmm1, xmmword ptr clear_sign_bit
+; positive value: nothing to change
+float_positive:
+</code>
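+As a worked example of the sign test and the masking, the comments below follow the value -2.5 through the code above. The single-precision encodings are standard IEEE 754 values; the choice of -2.5 is only an illustration.
+<code asm>
+; -2.5 is encoded as 0C0200000h (sign bit = 1)
+;  2.5 is encoded as 040200000h (sign bit = 0)
+; movmskps for -2.5 in the lowest element sets bit 0 of the destination
+; rcr by 1 moves that bit into CF, so jnc does not jump
+; andps with the mask: 0C0200000h and 07FFFFFFFh = 040200000h, which is 2.5
+</code>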
+We will start the conversion from the least significant digit of the fractional part, limiting precision to thousandths. We obtain the fractional part by subtracting the integer part from the original argument. The integer part is obtained with the **cvttss2si** instruction, which simply truncates the fractional part of a number. We store the result in rcx for further use.
+<code asm>
+.data
+const1000 real4 1000.0
+.code
+...
+; convert fractional part
+    cvttss2si rax, xmm1 ; convert float to int with truncation
+    mov rcx, rax        ; store for conversion of an integer part
+    cvtsi2ss xmm2, rax  ; convert back into float
+    subss xmm1, xmm2    ; subtract integer part
+    mulss xmm1, const1000 ; we want three fractional digits
+    cvttss2si rax, xmm1 ; fractional part as an integer 0-999
+    mov rbx, 10
+    mov r8, 3           ; always emit three digits, so leading zeros are kept (r8 is assumed to be unused here)
+convert_fraction:
+    dec rdi             ; starting from the end of the text (least significant)
+    xor rdx, rdx        ; prepare to divide rdx:rax by rbx
+    div rbx             ; rax / 10 → remainder in rdx
+    add dl, "0"         ; convert remainder into ASCII
+    mov [rdi], dl       ; write character to buffer
+    dec r8              ; count the digit just written
+    jnz convert_fraction
+</code>
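+To illustrate the steps above, the comments below trace the absolute value 2.625, chosen because every intermediate result is exactly representable in single precision. The value is an assumption made only for this example.
+<code asm>
+; xmm1 = 2.625
+; cvttss2si rax, xmm1      -> rax = 2, copied to rcx
+; cvtsi2ss + subss         -> xmm1 = 0.625
+; mulss xmm1, const1000    -> xmm1 = 625.0
+; cvttss2si rax, xmm1      -> rax = 625
+; 1st iteration: 625 / 10  -> rax = 62, rdx = 5, writes "5"
+; 2nd iteration:  62 / 10  -> rax = 6,  rdx = 2, writes "2"
+; 3rd iteration:   6 / 10  -> rax = 0,  rdx = 6, writes "6"
+; the end of the buffer now holds "625"
+</code>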
+We separate the fractional and integer parts with a dot.
+<code asm>
+; add dot
+    dec rdi
+    mov byte ptr [rdi], '.'
+</code>
+The integer part is converted with the same algorithm as the fractional part, but first we restore its value from rcx.
+<code asm>
+; convert integer part
+    mov rax, rcx        ; restore integer part
+convert_integer:
+    dec rdi		; starting from the end of the text (least significant)
+    xor rdx, rdx	; prepare to divide rdx:rax by rbx
+    div rbx	        ; rax / 10 → remainder in rdx
+    add dl, "0"		; convert remainder into ASCII
+    mov [rdi], dl	; write character to buffer
+    test rax, rax	; test if there is still a value for conversion
+    jne convert_integer
+</code>
+After converting the integer part, we need to add a minus sign for a negative value. We test the sign again with the same method as at the beginning. This is why we kept the original argument in XMM0.
+<code asm>
+; test if the number is positive or negative
+    movmskps rax, xmm0
+    rcr rax, 1
+    jnc end_float
+; add minus if needed
+    dec rdi             ; add minus character
+    mov byte ptr [rdi], '-'
+end_float:
+</code>
+The final part, calculating the string length, is the same as in the conversion of integers.
 ===== Implementation of calculation functions =====
 In another scenario, we will create a library with functions performing simple calculations on integers and floating-point numbers. We will write functions for adding six integers and six floating-point values. This example will present argument passing through the registers and also through the stack, showing the order and addresses of stack-allocated arguments.
@@ Line 35: / Line 183: @@
 sum_6_int endp
 </code>
-The stack from a function perspective looks like in a fig.
+From the function's perspective, the stack looks as shown in figure {{ref>ex_stack_simple}}.
+<figure ex_stack_simple>
+{{ :en:multiasm:exercisebook:pc:ex_stack_simple.png?400 |Stack view inside the function with 6 arguments}}
+<caption>Stack view inside the function with 6 arguments</caption>
+</figure>
+The caller passes arguments according to the Windows x64 ABI. The first four arguments are passed through the RCX, RDX, R8 and R9 registers. Further arguments are placed onto the stack. The caller is also responsible for reserving the shadow space before the call: 32 bytes that give each of the four register arguments an 8-byte slot on the stack, even though these arguments are passed through registers. That's why 32 bytes are reserved before the return address is automatically placed on the stack by the **call** instruction.
+Please note the order of arguments. They are placed onto the stack in reverse order, so the last argument is placed on the stack first. That's why the 6th argument is at the highest address, below it is the 5th argument, and below that is the shadow space for arguments 1-4. From the perspective of the function, the first argument (or rather its shadow) is just after the return address. As the return address occupies 8 bytes, the shadow space for the first argument is at address RSP+8.
+How do we call such a function? Putting the first four parameters into registers is quite simple. To place the remaining arguments onto the stack, it is possible to use the **push** instruction.
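+As a sketch of this layout, the comments below list where each slot lies relative to RSP at function entry, assuming the function has not yet executed any **push** or **sub rsp** instruction (either would shift the offsets). Loading the stack arguments into rax and r10 is only an illustration, not the sum_6_int code.
+<code asm>
+; [rsp]      return address
+; [rsp+08h]  shadow space for the 1st argument (value in rcx)
+; [rsp+10h]  shadow space for the 2nd argument (value in rdx)
+; [rsp+18h]  shadow space for the 3rd argument (value in r8)
+; [rsp+20h]  shadow space for the 4th argument (value in r9)
+; [rsp+28h]  5th argument
+; [rsp+30h]  6th argument
+    mov rax, [rsp+28h]   ; read the 5th argument
+    mov r10, [rsp+30h]   ; read the 6th argument
+</code>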
+<code asm>
+; call the function summing 6 integers
+    mov rcx, 1     ; 1st argument
+    mov rdx, 2     ; 2nd argument
+    mov r8, 3      ; 3rd argument
+    mov r9, 4      ; 4th argument
+    mov r11, 6
+    push r11       ; 6th argument
+    mov r10, 5
+    push r10       ; 5th argument
+    sub rsp, 20h   ; 32 bytes of the shadow space
+    call sum_6_int ; function call
+    add rsp, 30h   ; stack cleanup
+    mov rcx, rax   ; result in rax
+</code>
+Figure {{ref>ex_stack_caller_push}} shows the stack organisation from the caller's perspective. First, the 6th argument is pushed onto the stack. Next, the 5th argument is pushed. Then the 32 bytes of the shadow space are reserved with the subtraction instruction **sub rsp, 20h**. Finally, the return address is pushed by the **call** instruction. The arrows point to the addresses (where RSP points) after the specified instructions.
+<figure ex_stack_caller_push>
+{{ :en:multiasm:exercisebook:pc:ex_stack_caller_push.png?400 |Stack view from caller function}}
+<caption>Stack view from caller function</caption>
+</figure>
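+As an alternative sketch, not taken from the text above, the same stack layout can be built without **push**: the caller reserves the whole area once and stores the stack arguments with **mov**. It assumes RSP is 16-byte aligned before the sequence, as the Windows x64 ABI requires at the point of the call.
+<code asm>
+; call sum_6_int with the stack arguments stored by mov
+    sub rsp, 30h                 ; 32 bytes of shadow space + 16 bytes for two arguments
+    mov rcx, 1                   ; 1st argument
+    mov rdx, 2                   ; 2nd argument
+    mov r8, 3                    ; 3rd argument
+    mov r9, 4                    ; 4th argument
+    mov qword ptr [rsp+20h], 5   ; 5th argument, just above the shadow space
+    mov qword ptr [rsp+28h], 6   ; 6th argument
+    call sum_6_int               ; function call
+    add rsp, 30h                 ; stack cleanup
+</code>
+Inside the function, the arguments end up at the same offsets as in the **push** version, because the **call** instruction adds the 8-byte return address below the reserved area in both cases.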
-The caller passes arguments according to the Windows x64 ABI. The responsibility of the caller is also to reserve the shadow space for all arguments before the call.

en/multiasm/exercisebook/pc/sut/scenarios_standalone.1779278347.txt.gz · Last modified: 2026/05/20 14:59 by ktokarz