This shows you the differences between two versions of the page.
| Both sides previous revisionPrevious revisionNext revision | Previous revision | ||
| en:multiasm:paarm:chapter_5_3 [2025/12/03 02:31] – [CPU Configuration] eriks.klavins | en:multiasm:paarm:chapter_5_3 [2026/02/27 16:04] (current) – jtokarz | ||
|---|---|---|---|
| Line 5: | Line 5: | ||
| The term ‘Rn’ refers to architectural registers, not the registers to be used in the assembler code. | The term ‘Rn’ refers to architectural registers, not the registers to be used in the assembler code. | ||
| - | {{ : | + | <figure registersizes> |
| + | {{ : | ||
| + | < | ||
| + | </ | ||
| Note that accessing the W0 or W1 register does not allow access to the remaining 32 most significant bits. Also, when a W register is written, the top 32 bits of the corresponding 64-bit X register are zeroed. There are no registers named R0 or R1: a 64-bit value is addressed as X0, X1 and so on up to X30, and a 32-bit value is addressed as W0, W1 and so on up to W30. | Note that accessing the W0 or W1 register does not allow access to the remaining 32 most significant bits. Also, when a W register is written, the top 32 bits of the corresponding 64-bit X register are zeroed. There are no registers named R0 or R1: a 64-bit value is addressed as X0, X1 and so on up to X30, and a 32-bit value is addressed as W0, W1 and so on up to W30. | ||
| These examples perform single 32-bit arithmetic operations: | These examples perform single 32-bit arithmetic operations: | ||
| - | {{ : | + | <figure arithmetic32bit> |
| + | {{ : | ||
| + | < | ||
| + | </ | ||
| These examples perform single 64-bit arithmetic operations: | These examples perform single 64-bit arithmetic operations: | ||
| - | {{ : | + | <figure arithmetic64bit> |
| + | {{ : | ||
| + | < | ||
| + | </ | ||
| Special registers like the Stack Pointer (SP), Link Register (LR), and Program Counter (PC) are available. The X30 register serves as the Link Register (LR). The Stack Pointer SP and the Program Counter PC registers are no longer available as regular general-purpose registers. The SP register can still be used with a limited set of data-processing instructions, and its 32-bit view is named WSP. Unlike ARMv7, the PC register is no longer accessible via data-processing instructions. The PC register can be read by ‘ADR’ instruction, | Special registers like the Stack Pointer (SP), Link Register (LR), and Program Counter (PC) are available. The X30 register serves as the Link Register (LR). The Stack Pointer SP and the Program Counter PC registers are no longer available as regular general-purpose registers. The SP register can still be used with a limited set of data-processing instructions, and its 32-bit view is named WSP. Unlike ARMv7, the PC register is no longer accessible via data-processing instructions. The PC register can be read by ‘ADR’ instruction, | ||
| Line 30: | Line 39: | ||
| Example adding two registers together with different register notation. | Example adding two registers together with different register notation. | ||
| - | {{ : | + | <figure add64_32> |
| + | {{ : | ||
| + | < | ||
| + | </ | ||
| Line 36: | Line 48: | ||
| ARMv8 has an additional set of 32 registers for floating-point and vector operations. These registers are 128 bits wide and, like general-purpose registers, can be accessed in several ways. The letters for these registers identify byte (Bx), half-word (Hx), single-word (Sx), double-word (Dx) and quad-word (Qx) access. | ARMv8 has an additional set of 32 registers for floating-point and vector operations. These registers are 128 bits wide and, like general-purpose registers, can be accessed in several ways. The letters for these registers identify byte (Bx), half-word (Hx), single-word (Sx), double-word (Dx) and quad-word (Qx) access. | ||
| - | {{ : | + | <figure vectorreg> |
| + | {{ : | ||
| - | {{ : | + | {{ : |
| + | < | ||
| + | </ | ||
| - | More information on these registers and operations performed with floating-point is described in the following section, “Advanced Assembly Programming”. | + | More information on these registers and operations performed with floating-point is described in the following section: [[en: |
| ===== CPU Configuration===== | ===== CPU Configuration===== | ||
| The Raspberry Pi 5 has an ARM Cortex-A76 processor with 4 CPU cores. Each core has its own stack pointers, status registers and other registers. Before looking at CPU registers, some specifics must be explained. Each core has several execution levels: EL0, EL1, EL2, and EL3. In datasheets, these execution levels are called Exception Levels: the levels at which the processor resources are managed. EL0 is the lowest level; all user applications are executed at this level. EL1 is meant for operating systems; EL2 is intended for a Hypervisor application to control resources for the OS and the lower exception levels; EL3 is the highest level and is used by secure firmware. The CPU's general-purpose registers are independent of Exception levels, but it is essential to understand which Exception Level executes the code. This is called “System configuration” because the processor has multiple cores, and each core has multiple exception levels. To configure the system and access the system registers, the MRS and MSR instructions must be used. Note that registers with the suffix “_ELn” have a separate, banked copy in some or all of the levels, except for EL0. The suffix also gives the lowest exception level that can access the particular system register. Only a few system registers are accessible from EL0; the Cache Type Register (CTR_EL0) is one of them. | The Raspberry Pi 5 has an ARM Cortex-A76 processor with 4 CPU cores. Each core has its own stack pointers, status registers and other registers. Before looking at CPU registers, some specifics must be explained. Each core has several execution levels: EL0, EL1, EL2, and EL3. In datasheets, these execution levels are called Exception Levels: the levels at which the processor resources are managed. EL0 is the lowest level; all user applications are executed at this level. EL1 is meant for operating systems; EL2 is intended for a Hypervisor application to control resources for the OS and the lower exception levels; EL3 is the highest level and is used by secure firmware. 
The CPU's general-purpose registers are independent of Exception levels, but it is essential to understand which Exception Level executes the code. This is called “System configuration” because the processor has multiple cores, and each core has multiple exception levels. To configure the system and access the system registers, the MRS and MSR instructions must be used. Note that registers with the suffix “_ELn” have a separate, banked copy in some or all of the levels, except for EL0. The suffix also gives the lowest exception level that can access the particular system register. Only a few system registers are accessible from EL0; the Cache Type Register (CTR_EL0) is one of them. | ||
| - | ''< | + | ''< |
| ''< | ''< | ||
| - | {{ : | + | <figure exceptionlevels> |
| + | {{ : | ||
| + | < | ||
| + | </ | ||
| + | |||
| + | In Fig. {{ref> | ||
| + | The Green region is a Secure State where only special secure applications and operating systems are executed. This may be used in system duplication, | ||
| We will look only at AArch64 registers to narrow the number of registers. | We will look only at AArch64 registers to narrow the number of registers. | ||
| - | There are many registers dedicated to the CPU. Specialised registers will be left aside again to narrow the amount of information, | + | There are many registers dedicated to the CPU. Specialised registers will be left aside again to narrow the amount of information, |
| <table tab_label> | <table tab_label> | ||
| < | < | ||
| - | ^ Register ^ description^ | + | ^ Register |
| - | | AFSR0_EL1..3 | + | | AFSR0_EL1..3 |
| - | | | | | + | | DBGAUTHSTATUS_EL1 |
| - | | | | | + | | DISR_EL1 |
| - | | | | | + | | DSPSR_EL0 |
| - | | | | | + | | ERXGSR_EL1 |
| - | | | | | + | | ERXSTATUS_EL1 |
| - | | | | | + | | **FPSR** |
| - | | | | | + | | ICH_EISR_EL2 |
| - | | | | | + | | ICH_ELRSR_EL2 |
| - | | | | | + | | IFSR32_EL2 |
| - | | | | | + | | ISR_EL1 |
| - | | | | | + | | MDCCSR_EL0 |
| - | | | | | + | | OSLSR_EL1 |
| - | | | | | + | | **SPSR_EL1..3** |
| - | | | | | + | | **SPSR_abt** |
| + | **SPSR_fiq** | The Saved Program Status Register (FIQ mode) holds the saved process state when an exception is taken to FIQ mode. | ||
| + | **SPSR_irq** | The Saved Program Status Register (IRQ mode) holds the saved process state when an exception is taken to IRQ mode. | ||
| + | **SPSR_und** | The Saved Program Status Register (Undefined mode) holds the saved process state when an exception is taken to Undefined mode. | ||
| + | TFSRE0_EL1 | The Tag Fault Status Register (EL0) holds accumulated Tag Check Faults occurring in EL0 that are not taken precisely. | ||
| + | TFSR_EL1..3 | The Tag Fault Status Register (EL1..EL3) holds accumulated Tag Check Faults occurring in EL1, EL2 or EL3 that are not taken precisely. | ||
| + | | TRCAUTHSTATUS | The Trace Authentication Status Register provides information about the authentication interface' | ||
| + | TRCOSLSR | The Trace OS Lock Status Register returns the status of the Trace OS Lock. | ||
| + | | TRCRSR | The Trace Resources Status Register is used to set or read the status of the resources. | | ||
| + | | TRCSSCSR< | ||
| + | TRCSTATR | The Trace Status Register returns the trace unit status. | ||
| + | VDISR_EL2..3 | The Virtual Deferred Interrupt Status Register (EL2..EL3) records that an SError exception has been consumed by an ESB instruction executed at EL1 or EL2. | ||
| </ | </ | ||
| + | The table shows how many status registers this processor has. Not all of them are used during program execution. Many registers are related to debugging and resource management. On the Raspberry Pi, the OS and bootloader have already configured all CPU cores and their registers. Trace registers are used only when hardware debugging, such as JTAG or TRACE32, is enabled. | ||
| + | Summarising: | ||
| + | |||
| + | There are rules for creating a Linux OS kernel module: it must contain a function that initialises the module and a function that is called when the module exits. The skeleton for the kernel module is given below in C. The code is compiled with GCC, and inline assembly can be written directly in the C source. Even after the Exception level changes from EL0 to EL1, only some system instructions are allowed to execute. | ||
| + | |||
| + | < | ||
| + | < | ||
| + | < | ||
| + | // mymod.c | ||
| + | #include < | ||
| + | #include < | ||
| + | |||
| + | static int __init mymod_init(void) | ||
| + | { | ||
| + | asm volatile( | ||
| + | "mrs x0, CurrentEL\n" | ||
| + | "lsr x0, x0, # | ||
| + | // x0 now contains current EL (expect 1) | ||
| + | ); | ||
| + | pr_info(" | ||
| + | return 0; | ||
| + | } | ||
| + | |||
| + | static void __exit mymod_exit(void) | ||
| + | { | ||
| + | pr_info(" | ||
| + | } | ||
| + | module_init(mymod_init); | ||
| + | module_exit(mymod_exit); | ||
| + | MODULE_LICENSE(" | ||
| + | </ | ||
| + | </ | ||
| + | There are restrictions on the use of privileged instructions in the code. In EL0, privileged instruction execution will trap into the kernel. Note that switching between EL0 and EL1 is allowed only in the kernel and firmware. The firmware code will require access to the whole chip documentation, | ||
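
The value that the module reads with MRS from CurrentEL still has to be decoded: the exception level is held in bits [3:2] of the register, which is why the result is shifted right before use. The following is a minimal sketch of that decoding step in plain C, so it can be compiled and tried on any machine without kernel privileges. The function names `decode_current_el` and `el_role` are hypothetical helpers introduced only for this illustration; they are not part of the Linux kernel API or of the module above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: extract the exception level from a raw CurrentEL value.
 * The level is stored in bits [3:2]; all other bits read as zero. */
static unsigned decode_current_el(uint64_t raw)
{
    return (unsigned)((raw >> 2) & 0x3u);
}

/* Hypothetical helper: describe the role of each level as outlined in the text. */
static const char *el_role(unsigned el)
{
    static const char *const roles[] = {
        "EL0: user applications",
        "EL1: operating system kernel",
        "EL2: hypervisor",
        "EL3: secure firmware",
    };
    assert(el <= 3); /* only EL0..EL3 exist */
    return roles[el];
}
```

In a kernel module running at EL1, the raw value would be expected to be 0x4, so `decode_current_el(0x4)` yields 1 and `el_role(1)` returns the EL1 description. Keeping the decoding separate from the privileged MRS read makes it possible to check the bit manipulation in user space, where the read itself would trap.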