| Both sides previous revisionPrevious revisionNext revision | Previous revision |
| en:multiasm:papc:chapter_6_9 [2026/01/09 13:16] – [Compatibility with HLL Compilers (C++, C#) and Operating Systems] pczekalski | en:multiasm:papc:chapter_6_9 [2026/02/27 02:50] (current) – [Merging of the High-Level Languages and Assembler Code] jtokarz |
|---|
| |
| <figure staticlinking> | <figure staticlinking> |
| {{ :en:multiasm:papc:static_linking.png?400 |Static merging (linking) of the assembler code and high-level application}} | {{ :en:multiasm:papc:static_linking.png?600 |Static merging (linking) of the assembler code and high-level application}} |
| <caption>Static merging (linking) of the assembler code and high-level application</caption> | <caption>Static merging (linking) of the assembler code and high-level application</caption> |
| </figure> | </figure> |
| <figure staticlinking> | <figure dynamiclinking> |
| {{ :en:multiasm:papc:dynamic_linking.png?400 |Dynamic merging (loading) of the assembler code and high-level application}} | {{ :en:multiasm:papc:dynamic_linking.png?600 |Dynamic merging (loading) of the assembler code and high-level application}} |
| <caption>Dynamic merging (loading) of the assembler code and high-level application</caption> | <caption>Dynamic merging (loading) of the assembler code and high-level application</caption> |
| </figure> | </figure> |
| Dynamic loading of code is considered an advantage because the original application does not contain the assembler binary executable; it is kept in a separate file and loaded on demand, so it can be compiled and exchanged independently. On the other hand, it raises a number of challenges, such as versioning, compatibility, and the time required to load the library from the file system before the first call to the contents. | Dynamic code loading is considered an advantage because the original application does not contain the assembler binary executable; it is kept in a separate file and loaded on demand, allowing it to be compiled and exchanged independently. On the other hand, it raises several challenges, such as versioning, compatibility, and the time required to load the library from the file system before the first call to its contents. |
| ===== Programming in Assembler for Windows ===== | ===== Programming in Assembler for Windows ===== |
| Windows OS has historically supported unmanaged code written primarily in C++. This kind of code runs directly on the CPU, but divergence in hardware platforms, such as the introduction of ARM-core-based platforms running Windows, causes incompatibility issues. Since the introduction of the .NET framework, Windows has provided developers with a safer way to execute their code, called "managed code". The difference is that managed code, typically written in C#, is compiled to an intermediate language and executed under the control of the .NET runtime rather than being compiled directly into machine code ahead of time, as unmanaged code is. The use of managed code brings multiple advantages for developers, including automated memory management and code isolation from the operating system. This, however, raises several challenges when integrating managed code and assembly code. In any case, the integration model is common: the assembler implements functions (usually stateless) that are later called from the high-level language and return data to it (figure {{ref>masmintegration1}}). | Windows OS has historically supported unmanaged code written primarily in C++. This kind of code runs directly on the CPU, but divergence in hardware platforms, such as the introduction of ARM-core-based platforms running Windows, causes incompatibility issues. Since the introduction of the .NET framework, Windows has provided developers with a safer way to execute their code, called "managed code". The difference is that managed code, typically written in C#, is compiled to an intermediate language and executed under the control of the .NET runtime rather than being compiled directly into machine code ahead of time, as unmanaged code is. The use of managed code brings multiple advantages for developers, including automated memory management and code isolation from the operating system. This, however, raises several challenges when integrating managed code and assembly code. 
In any case, the integration model is common: the assembler implements functions (usually stateless) that are later called from the high-level language and return data to it (figure {{ref>masmintegration1}}). |
| |
| ==== Dynamic memory management considerations ==== | ==== Dynamic memory management considerations ==== |
| Using dynamic memory management at the level of the assembler code is troublesome: allocating and releasing memory require calls to the hosting operating system. It is possible, but complex. Moreover, there is no dynamic, automated memory management, as in .NET, Java, and Python, so the developer is on their own, similar to programming in C++. For this reason, it is common to allocate adequate memory resources in the high-level code (e.g., the GUI front-end) and pass them to the assembler code as pointers. Note, however, that for some higher-level languages, such as C#, it is necessary to follow a strict pattern to ensure correct and persistent memory allocation, as described in the following sections. | Using dynamic memory management at the assembler level is troublesome: allocating and releasing memory require calls to the host operating system. It is possible, but complex. Moreover, there is no dynamic, automated memory management, as in .NET, Java, and Python, so the developer is on their own, much like in C++. For this reason, it is common to allocate adequate memory resources in the high-level code (e.g., the GUI front-end) and pass them to the assembler code as pointers (figure {{ref>dynamicmemory}}). Note, however, that for some higher-level languages, such as C#, it is necessary to follow a strict pattern to ensure correct and persistent memory allocation, as described in the following sections. |
| |
| <note tip>Using dynamic memory management at the level of the assembler code is troublesome. Common practice is to dynamically allocate memory resources in the scope of the calling (high-level) application and pass them to the assembler code via pointers.</note> | <note tip>Using dynamic memory management at the level of the assembler code is troublesome. Common practice is to dynamically allocate memory resources in the scope of the calling (high-level) application and pass them to the assembler code via pointers.</note> |
| |
| | <figure dynamicmemory> |
| | {{ :en:multiasm:papc:hll_and_assembler-dynamic_memory_allocation.drawio.png?600 | Dynamic Memory Allocation Model for Assembler Code Integration}} |
| | <caption>Dynamic Memory Allocation Model for Assembler Code Integration</caption> |
| | </figure> |
| ==== Pure Assembler Applications for Windows CMD ==== | ==== Pure Assembler Applications for Windows CMD ==== |
| It is possible to write an application for Windows solely in assembler. While the practical benefit of doing so is doubtful, some hints presented below, such as calling system functions, may be helpful. | It is possible to write an application for Windows solely in assembler. While the practical benefit of doing so is doubtful, some hints presented below, such as calling system functions, may be helpful. |
| int main() | int main() |
| { | { |
| dllHandle = LoadLibrary(TEXT("AssemblerDll.dll")); | dllHandle = LoadLibrary(TEXT("AssemblerDll.dll")); |
| if (!dllHandle) | if (!dllHandle) |
| { | { |
| std::cerr << "Failed to load DLL library\n"; | std::cerr << "Failed to load DLL library\n"; |
| return 1; | return 1; |
| } | } |
| MyProc myAsmProcedure = (MyProc)GetProcAddress(dllHandle, "MyAsmProc"); | MyProc myAsmProcedure = (MyProc)GetProcAddress(dllHandle, "MyAsmProc"); |
| if (!myAsmProcedure) | if (!myAsmProcedure) |
| { | { |
| std::cerr << "Failed to find assembler procedure\n"; | std::cerr << "Failed to find assembler procedure\n"; |
| FreeLibrary(dllHandle); | FreeLibrary(dllHandle); |
| return 2; | return 2; |
| } | } |
| std::cout << myAsmProcedure(); | std::cout << myAsmProcedure(); |
| FreeLibrary(dllHandle); | FreeLibrary(dllHandle); |
| return 0; | return 0; |
| } | } |
| |
| |
| ===== Programming in Assembler for Linux ===== | ===== Programming in Assembler for Linux ===== |
| | The principles for combining assembler code and a high-level language into a single application on Linux are similar to those on Windows, but dynamic loading is more complex, so only static linking is considered here. C++ is the most common choice for the high-level application, although other options, such as Python, are possible. |
| | |
| | The x64 calling convention used on Linux passes more integer parameters via registers (up to 6) than the Windows one (only up to 4). Refer to the chapter [[en:multiasm:papc:chapter_6_8|]] for details. |
| | |
| | A common scenario is to use the [[https://man7.org/linux/man-pages/man1/g++.1.html|g++]] compiler to compile the high-level application and [[https://www.nasm.us/|nasm]] to assemble the assembler code. It is also common to build such a heterogeneous project with a ''Makefile'', as presented below. |
| | |
| | The sample project consists of three files: ''main.cpp'' (the high-level C++ application), ''asmfunc.asm'' (the assembler source code), and the aforementioned ''Makefile''. |
| | |
| | The ''Makefile'' defines how to compile and link the **main** application, along with the cleanup commands (the ''clean'' target): |
| | <code ini Makefile> |
| | all: main |
| | |
| | main: main.o asmfunc.o |
| | g++ -o main main.o asmfunc.o |
| | |
| | main.o: main.cpp |
| | g++ -c -g main.cpp |
| | |
| | asmfunc.o: asmfunc.asm |
| | nasm -g -f elf64 -F dwarf asmfunc.asm -l asmfunc.lst |
| | |
| | clean: |
| | rm -f ./main || true |
| | rm -f ./main.o || true |
| | rm -f ./asmfunc.o || true |
| | rm -f ./asmfunc.lst || true |
| | </code> |
| | |
| | <note important>It is essential to remember that the command lines of each rule in a ''Makefile'' must be indented with a TAB character rather than SPACEs.</note> |
| | |
| | Assembler code exposes functions to the linker using the ''global'' directive. Without it, assembler functions remain "private" and cannot be called, so linking fails if the high-level part of the code references such a function. The following code presents a dummy function that performs integer addition of two arguments, received in the ''rdi'' and ''rsi'' registers, and returns the result in ''rax''. The ''section'' directives are optional in this example. |
| | |
| | <code assembler asmfunc.asm> |
| | section .data |
| | section .bss |
| | section .text |
| | |
| | global addInAsm |
| | |
| | addInAsm: |
| | nop |
| | mov rax, rsi |
| | add rax, rdi |
| | ret |
| | </code> |
| | |
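| | Finally, the calling side (C++ application) uses an ''extern "C"'' declaration to inform the compiler and linker about the external function written in assembler. The ''"C"'' linkage disables C++ name mangling, so the symbol name matches the one exported by the assembler code. |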
| | <code cpp main.cpp> |
| | #include <iostream> |
| | |
| | extern "C" {long long int addInAsm(long long, long long);} |
| | |
| | long long a=10; |
| | long long b=7; |
| | long long returnValue; |
| | |
| | int main() { |
| | std::cout << "Hello, Assembler!" << std::endl; |
| | returnValue = addInAsm(a,b); |
| | std::cout << "Sum of " << a << " and " << b << " is " << returnValue |
| | << std::endl; |
| | return 0; |
| | } |
| | </code> |
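
The same ''extern "C"'' pattern carries over to the caller-allocates model described in the dynamic memory section. The following is a minimal sketch, not part of the sample project: ''sumArrayReference'' and ''sumWithCallerOwnedBuffer'' are hypothetical names, and the routine is written in C++ only as a stand-in for an assembler implementation, so that the calling side can be exercised on its own. Under the x64 calling convention used on Linux, an assembler version with this signature would receive the pointer in ''rdi'' and the element count in ''rsi'', and would return the result in ''rax''.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an assembler routine; the name and signature are
// assumptions made for this sketch. With C linkage the symbol is not mangled,
// so an assembler file could export the same name using `global`.
extern "C" long long sumArrayReference(const long long* data, std::size_t count) {
    long long sum = 0;
    for (std::size_t i = 0; i < count; ++i) {
        sum += data[i];
    }
    return sum;
}

// The caller owns the buffer: std::vector allocates and releases the memory,
// while the routine only receives a raw pointer and an element count.
long long sumWithCallerOwnedBuffer() {
    std::vector<long long> values = {10, 7, 25};
    return sumArrayReference(values.data(), values.size());
}
```

Keeping allocation on the C++ side means the routine never has to call the operating system for memory. A stand-in like this can also serve as a reference against which the results of the assembler routine are compared.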