Programming in Assembler for x64

In this section, we show examples of programs written purely in assembler or combined with other programming languages, including C++ and C#. We assume that the reader is familiar with the coursebook and with the instructions and directives used to write assembler programs. We describe how to use the integrated development environment (Visual Studio) and how to assemble programs from the command line only. We also show how to create static and dynamic libraries written in assembler for use in assembler programs or in programs built with other compilers.

Introduction to the x64 Assembler programming in MASM - Microsoft Visual Studio Community Edition

In the following chapter, we explain how to write, assemble, link and execute programs written in assembly language for x64 processors. We assume that the reader is familiar with the most important processor instructions and MASM directives.

Creating a project in VS with MASM source. Assembling, debugging, disassembly window, register view, memory view - data section

[piotr] TO BE DONE

Standalone assembly

It is possible to use command-line MASM tools to assemble, link, and create libraries written in assembly language. You can use any editor to create the assembler source code and then translate it into machine code with these tools. The required tools are part of the Visual Studio Community installation when the option “Desktop development with C++” is selected. For the default VS installation, you can find them in the following folder (the version numbers in the path may differ).

C:\Program Files\Microsoft Visual Studio\18\Community\VC\Tools\MSVC\14.50.35717\bin\Hostx64\x64

To use statically linked Windows libraries, you need their lib files. The essential library is kernel32.lib, but other Windows functions require additional libraries. All of them are available in the following folder (the version numbers in the path may differ).

C:\Program Files (x86)\Windows Kits\10\Lib\10.0.26100.0\um\x64

The ML64.exe program is used to assemble the source file. This program has many options, which you can list by executing:

ML64.exe /?

After assembling, ML64 can call the linker automatically. An example command that assembles and links the file named source.asm can look like this:

ml64 /Fl /Zi /Zd source.asm /link /entry:main

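Explanation of the options used:
/Fl - generates a listing file
/Zi - adds symbolic debug information
/Zd - adds line-number debug information
/link - passes all the options that follow to the linker
/entry:main - sets the main procedure as the entry point of the program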

If you prefer a name other than “main” for the entry point of your console program, you also need to specify the subsystem for the resulting code. For a console application, add the /SUBSYSTEM:CONSOLE linker option.
The easiest way is to put all required files in the same folder on the disk. In more complex projects this is usually not the case, so file names should be preceded by their full paths.

Unsurprisingly, the first code example is “Hello world!”. This program uses three system functions: GetStdHandle, WriteConsoleA and ExitProcess.

The functions are made available through the library file kernel32.lib, which is statically linked. We use the “includelib” directive to tell the linker where to search for the functions. To tell the assembler the names of the functions, we declare them with a set of “extern” directives. Each statement of the program is explained in the comments.

option casemap:none             ; make identifiers case-sensitive
 
includelib kernel32.lib         ; statically linked library with system functions
 
EXTERN GetStdHandle:PROC        ; declaration of system functions for use
EXTERN WriteConsoleA:PROC
EXTERN ExitProcess:PROC
 
STD_OUTPUT_HANDLE equ -11       ; STD_OUTPUT_HANDLE constant
 
; In the data section of our program, there is a string to be displayed
.data
    message db "Hello, World!", 13, 10
    msgLen  equ $ - message     ; constant holding the string length
 
; In the code section of our program, there are instructions for execution
.code
main PROC                       ; main function - entry point
    sub rsp, 28h                ; shadow space + align
 
; HANDLE hConsole = GetStdHandle(STD_OUTPUT_HANDLE)
    mov ecx, STD_OUTPUT_HANDLE
    call GetStdHandle           ; this function returns the handle of the console window
 
; WriteConsoleA(hConsole, message, msgLen, &written, NULL)
    mov rcx, rax                ; console window handle
    lea rdx, message            ; pointer to the buffer
    mov r8d, msgLen             ; length
    lea r9, written             ; pointer to a variable receiving the number of chars written
    mov qword ptr [rsp+20h], 0  ; 5th argument (lpReserved = NULL)
    call WriteConsoleA          ; this function displays text in the console
 
; ExitProcess(0)
    xor ecx, ecx                ; value to be returned
    call ExitProcess            ; return to operating system
main ENDP                       ; end of the main function
 
; In the uninitialised data section of our program, there is a "written" variable
.data?
    written dq ?                ; variable which holds the number of written chars
 
END                             ; end of source file

Creating static libraries

To create a static library, the assembler module should not define the main procedure. All other procedures are made available to other programs by default. To hide a procedure from other modules, mark it as PRIVATE. The first step is to assemble the source file with MASM; the /c option assembles the file without linking.

ml64 /c source.asm

The second step is to create the lib file with the lib tool.

lib source.obj

This will create the source.lib file, which can be linked into any program that uses the procedures it makes available.
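
The PRIVATE visibility mentioned above can be sketched as follows. This is a minimal illustration, not part of the original library: the procedure names clamp_to_byte and to_byte are hypothetical, and the sketch assumes that ml64 accepts the PRIVATE keyword directly after PROC.

```masm
option casemap:none

.code
; clamp_to_byte is a hypothetical helper;
; PRIVATE keeps its name local to this module
clamp_to_byte PROC PRIVATE
    mov eax, 255              ; upper limit (also clears the upper half of rax)
    cmp rcx, rax              ; compare the argument with the limit
    cmovb rax, rcx            ; argument below 255 (unsigned) → return the argument
    ret
clamp_to_byte ENDP

; to_byte is a hypothetical procedure, public by default,
; so programs that link the library can call it
to_byte PROC
    sub rsp, 28h              ; shadow space + align
    call clamp_to_byte        ; calls inside the module still work
    add rsp, 28h
    ret
to_byte ENDP

END
```

With this module in a library, a program that declares to_byte with EXTERN should link normally, while a declaration of clamp_to_byte is expected to end with an unresolved external symbol reported by the linker.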

As an example, the library will contain the function “int_to_ascii”, which converts an integer into its text representation. Let's begin with the function itself. The function accepts two arguments: the number to be converted, passed in RCX, and the pointer to the buffer for the resulting text, passed in RDX. The buffer must be at least 32 bytes long, because the function places the terminator at offset 31 and builds the digits backwards from there. It converts a signed 64-bit number and returns the pointer to the first character of the text in RDX and the length of the resulting string in RAX. We can pass these results to the WinAPI function WriteConsoleA to display the ASCII representation of a number in the console.

Please note that the code of the library module does not have the “main” function, which in an executable program file serves as an entry point.

option casemap:none
 
.code
; ----------------------------------------
; int_to_ascii
; input:   RCX = signed 64-bit number
;          RDX = pointer to a buffer of at least 32 bytes
; output:  RDX = pointer to the first character of the string
;          RAX = length of the resulting string
; ----------------------------------------
int_to_ascii PROC
    push rbx              ; rbx is nonvolatile
    push rdi              ; rdi is nonvolatile
    sub rsp, 24           ; local storage (keeps rsp 16-byte aligned)
    mov [rsp+8],  rcx     ; save the input number
    mov [rsp+16], rdx     ; save the pointer to the buffer
    mov rax, rcx          ; move input number to rax
 
; point rdi into the buffer end
    mov rdi, rdx          ; pointer to a string
    add rdi, 31
    mov byte ptr [rdi], 0 ; mark string end with terminator
 
    mov rbx, 10
 
; test if the number is negative
    xor r8d, r8d          ; r8 = 0 → non-negative
    test rax, rax         ; test the sign
    jge convert           ; jump if rax is not negative
 
    neg rax               ; change the sign of rax
    mov r8d, 1            ; r8 = 1 → negative flag
 
; conversion loop
convert:
    dec rdi               ; starting from the end of the text (least significant digit)
    xor rdx, rdx          ; prepare to divide rdx:rax by rbx
 
    div rbx               ; rax / 10 → remainder in rdx
    add dl, "0"           ; convert remainder into ASCII
    mov [rdi], dl         ; write character of a digit to buffer
    test rax, rax         ; test if there is still a value for conversion
    jne convert
 
; add minus if needed
    cmp r8d, 0            ; r8 = 1 → negative flag
    je write
    dec rdi               ; add minus character
    mov byte ptr [rdi], '-'
 
write:
; calculate length of the text (end – rdi)
    mov rax, [rsp+16]     ; get pointer to an original buffer
    add rax, 31
    sub rax, rdi          ; resulting number length in rax
    mov rdx, rdi          ; adjusted pointer to string in a buffer
 
    add rsp, 24           ; restore stack pointer 
    pop rdi
    pop rbx
    ret
int_to_ascii ENDP
 
END

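This library can be imported into an assembler program or a program written in another programming language. The example below assumes that the library module was saved as convert.asm, so that the lib tool produced convert.lib. The assembly program can look as follows: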

option casemap:none
 
; include the system library and our convert library
includelib kernel32.lib
includelib convert.lib
 
; declare the functions we use in our program
EXTERN GetStdHandle:PROC
EXTERN WriteConsoleA:PROC
EXTERN ExitProcess:PROC
EXTERN int_to_ascii:PROC
 
; constant required by the GetStdHandle system function
STD_OUTPUT_HANDLE equ -11
 
; data section
.data
    buffer db 32 dup(0) ; buffer for a string
    hOut   dq ?         ; placeholder for console handle
    dummy  dq ?         ; place for dummy parameter
 
; code section
.code
 
; -------------------------------------------
; main function of the program - entry point
; -------------------------------------------
main PROC
; shadow space
    sub rsp, 40
 
; get the handle of stdout
    mov ecx, STD_OUTPUT_HANDLE ; console output
    call GetStdHandle
    mov hOut, rax       ; store the handle
 
; call conversion function
    mov rcx, 33550336   ; number for displaying
    lea rdx, buffer     ; pointer to a buffer
    call int_to_ascii
 
; prepare arguments for WriteConsoleA(hOut, rdx, len, ...)
    mov rcx, hOut       ; console handle 
                        ; pointer to the beginning of a string is in rdx
    mov r8, rax         ; nNumberOfCharsToWrite is in rax  
    lea r9, dummy       ; dummy for lpNumberOfCharsWritten
    mov qword ptr [rsp+20h], 0  ; lpReserved (must be NULL)
 
    call WriteConsoleA  ; displaying function
 
    xor ecx, ecx        ; return value of a program
    call ExitProcess    ; go back to Windows OS
main ENDP
 
END

Introduction to Linux assembly programming

NASM

Scenarios

Displaying integers in hex

In our first scenario, we will modify the conversion library, adding another function which should convert integer input into a hexadecimal representation. We can copy the int_to_ascii function and introduce some simple modifications. First, we need to divide the input value by 16, not by 10.

   mov rbx, 16

After each division operation, we obtain a remainder in the range 0-15. We can't convert it into an ASCII character the same way as in decimal, because in the ASCII table the digits 0-9 and the letters A-F do not form a continuous range. We can deal with this situation in different ways. One approach is to check if dl is greater than 9 and, if so, shift it so that it points to the letter characters.

    cmp dl, 9          ; test if dl > 9
    jna zero_to_nine   ; if not jump over adjustment
    add dl, "A"-"9"-1  ; adjust dl with the distance between A and 9
zero_to_nine:
    add dl, "0"        ; convert to ASCII

Another approach is to define the table of characters (lookup table) in the data section containing all digits and letters, and pick the correct character using the xlatb instruction or the mov with proper indirect addressing mode.

.data
hex_digits db "0123456789ABCDEF"
 
.code
...
    lea rcx, hex_digits        ; load address of lookup table
    and rdx, 0000000Fh         ; limit the index to the range 0-15
    mov dl, byte ptr [rcx+rdx] ; convert remainder into ASCII
...

In the second approach, we use indirect addressing with the sum of the rcx and rdx registers. The base address of the table must be loaded into rcx with the lea instruction rather than used as a constant. This is because an instruction we could use in 32-bit mode:

    mov dl, hex_digits[rdx]

results in an error when used in 64-bit long mode. The address of the lookup table is a 64-bit number, but the constant encoded in this form of the mov instruction can't exceed 32 bits.

The xlatb instruction takes the table address from rbx and the index from al, and returns the selected character in al. To use it, we have to preserve rax before the conversion, which we do by storing it temporarily in rcx. We also need to handle rbx differently: in each iteration, it is set to 16 before the division and to the lookup table address before xlatb.

.code
...
    mov rbx, 16          ; prepare divisor
    div rbx              ; rax / 16 → remainder in rdx
    mov rcx, rax         ; store temporarily rax
    lea rbx, hex_digits  ; load address of lookup table
    and rdx, 0000000Fh   ; limit the range to 15
    mov al, dl           ; prepare index in al
    xlatb                ; convert remainder into ASCII
    mov [rdi], al        ; put character to resulting table
    mov rax, rcx         ; restore rax

To improve the performance of our code, in the case of hexadecimal numbers, it is possible to replace the time-consuming division instruction with an instruction that shifts the number four bit positions to the right. The remainder is then simply the four lowest bits of the number taken before the shift. We leave the implementation of this optimisation to the reader.
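
For readers who want to compare their solution, the shift-based variant could be sketched as below. This is only one possible approach under stated assumptions: the name int_to_hex is hypothetical, the input is treated as an unsigned 64-bit number (no minus sign is produced), and the buffer convention (at least 32 bytes, result pointer in RDX, length in RAX) is copied from int_to_ascii. The sketch uses only volatile registers, so nothing needs to be saved on the stack.

```masm
option casemap:none

.data
hex_digits db "0123456789ABCDEF"

.code
; ----------------------------------------
; int_to_hex (hypothetical name)
; input:   RCX = 64-bit number, treated as unsigned (assumption)
;          RDX = pointer to a buffer of at least 32 bytes
; output:  RDX = pointer to the first character of the string
;          RAX = length of the resulting string
; ----------------------------------------
int_to_hex PROC
    mov rax, rcx              ; value to convert
    lea r8, [rdx+31]          ; r8 = position of the terminator
    mov byte ptr [r8], 0      ; mark string end with terminator
    mov r10, r8               ; r10 = write pointer, moves towards the buffer start
    lea r9, hex_digits        ; load address of lookup table

next_digit:
    dec r10                   ; next position (least significant digit first)
    mov rcx, rax
    and rcx, 0Fh              ; four lowest bits = remainder of division by 16
    mov cl, byte ptr [r9+rcx] ; pick the character from the lookup table
    mov [r10], cl             ; write character of a digit to buffer
    shr rax, 4                ; quotient of division by 16, sets ZF when it is zero
    jnz next_digit            ; repeat while there is still a value to convert

    mov rax, r8
    sub rax, r10              ; length of the text (end - start)
    mov rdx, r10              ; pointer to the first character
    ret
int_to_hex ENDP

END
```

The loop runs at most 16 times for a 64-bit value, so the 32-byte buffer is large enough. The results can be passed to WriteConsoleA in the same way as those of int_to_ascii.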

Implementation of calculation functions

As the second scenario, we will create another library with functions performing the simple calculations on integer and floating