Differences

This shows you the differences between two versions of the page.

en:multiasm:papc:chapter_6_2 [2025/04/25 09:14] – [Addressing in x64 processors] ktokarz
en:multiasm:papc:chapter_6_2 [2026/05/27 09:58] (current) – [Segmented addressing in protected mode] ktokarz
Line 3: Line 3:
  
 ===== Segmented addressing in real mode =====
-The 8086 can address the memory in so-called real mode only. In this mode, the address is calculated with two 16-bit elements: segment and offset. The 8086 implements four special registers to store the segment part of the address: CS, DS, ES, and SS. During program execution, all addresses are calculated relative to one of these registers. The program is divided into three segments containing the main elements. The code segment contains processor instructions and their immediate operands. The instructions address are related to the CS register. The data segment is related to the DS register. It contains data allocated by the program. The stack segment contains the program stack and is related to the SS register. If needed, it is possible to use an extra segment related to the ES register. It is by default used by string instructions.
+The 8086 can address the memory in so-called real mode only. In this mode, the address is calculated with two 16-bit elements: segment and offset. The 8086 implements four special registers to store the segment part of the address: CS, DS, ES, and SS. During program execution, all addresses are calculated relative to one of these registers. The program is divided into three segments containing the main elements. The code segment contains processor instructions and their immediate operands. The instructions' addresses are related to the CS register. The data segment is related to the DS register. It contains data allocated by the program. The stack segment contains the program stack and is related to the SS register. If needed, it is possible to use an extra segment related to the ES register. It is used by default by string instructions.
 <figure realsegments>
 {{ :en:multiasm:cs:Real_segments.png?600 |Illustration of assignment of segments and segment registers in real mode}}
 <caption>Segments and segment registers in real mode</caption>
-</figure>
+</figure>'
  
 Although the 8086 processor has only four segment registers, there can be many segments defined in the program. The limitation is that the processor can access only four of them at the same time, as presented in Fig {{ref>realsegments}}. To access other segments, it must change the content of the segment register.
  
-The address, which consists of two elements, the segment and the offset, is named a logical address. Both numbers which form a logical address are 16-bit numbers. So, how to calculate a 20-bit address with two 16-bit values? It is done in the following way. The segment part, taken always from the chosen segment register, is shifted four bit positions left. Four bits at the right side are filled with zeros, forming a 20-bit value. The offset value is added to the result of the shift. The result of the calculations is named the linear address. It is presented the Fig {{ref>realcalc}}. In the 8086 processor, the linear address equals the physical address, which is provided via the address bus to the memory of the computer.
+The address, which consists of two elements, the segment and the offset, is named a logical address. Both numbers which form a logical address are 16-bit numbers. So, how to calculate a 20-bit address with two 16-bit values? It is done in the following way. The segment part, taken always from the chosen segment register, is shifted four bit positions left. Four bits on the right side are filled with zeros, forming a 20-bit value. The offset value is added to the result of the shift. The result of the calculations is named the linear address. It is presented in the Fig {{ref>realcalc}}. In the 8086 processor, the linear address equals the physical address, which is provided via the address bus to the memory of the computer.
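
The shift-and-add calculation described above can be sketched in a few lines of Python. This is only an illustration, not part of the original page: the function name ''real_mode_linear_address'' is hypothetical, and the final mask models the 20 address lines of the 8086, where a carry out of bit 19 is lost.

```python
def real_mode_linear_address(segment: int, offset: int) -> int:
    """Return the 20-bit linear address for a 16-bit segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Shift the segment four bit positions left (the low four bits become zero),
    # then add the offset.
    linear = (segment << 4) + offset
    # Keep only 20 bits: the 8086 has 20 address lines, so any carry is dropped.
    return linear & 0xFFFFF


if __name__ == "__main__":
    # Logical address 1234h:0010h maps to linear address 12350h.
    print(hex(real_mode_linear_address(0x1234, 0x0010)))
```

Note that different logical addresses can give the same linear address, for example 1234h:0010h and 1235h:0000h.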
  
 <figure realcalc> <figure realcalc>
Line 36: Line 36:
 </figure> </figure>
  
-Although segmentation allows for advanced memory management and the implementation of memory protection, none of the popular operating systems, including Windows, Linux or MacOS, ever used it. In Windows, all segment registers, via descriptors, point to the zero base address and the maximal limit, resulting in the flat memory model. In this approach, the segmentation de facto does not function. Memory protection is implemented at the paging level. The flat memory model is shown in the Fig {{ref>ia32flat}}.
+Although segmentation allows for advanced memory management and the implementation of memory protection, none of the popular operating systems, including Windows, Linux-based, or MacOS, has ever used it. In Windows, all segment registers, via descriptors, point to the zero base address and the maximal limit, resulting in the flat memory model. In this approach, the segmentation de facto does not function. Memory protection is implemented at the paging level. The flat memory model is shown in the Fig {{ref>ia32flat}}.
  
 <figure ia32flat> <figure ia32flat>
Line 45: Line 45:
 ===== Paging =====
  
 +Paging is a mechanism for translating linear addresses to physical addresses. Operating systems extensively use it to implement memory protection and virtual memory mechanisms. The virtual memory mechanism allows the operating system to use the entire address space with less physical memory installed in the computer. The paging is supported by the memory management unit using a structure of tables, stored in memory.
 +In 32-bit mode, the tables are organised in a two-level structure, with a single page directory table and a set of page tables. The pages can be 4kB or 4MB in size. The 1-bit information about the page size (named PS) is stored in the page directory entry.
 +
 +  * 4kB pages
 +Each entry in the page directory holds a pointer to the page table, and every page table stores the pointers to the pages. 
 +The linear address is divided into three parts. The first (highest) part is a 10-bit index of the entry in the page directory table, the middle 10 bits form an index of the entry in the page table, and finally, the last (least significant) 12 bits are the offset within the page in memory. This mechanism is shown in Fig {{ref>paging324kB}}.
 +
 +<figure paging324kB>
 +{{ :en:multiasm:cs:paging_32_4kB.png?600 |Illustration of paging in 32-bit mode with 4kB pages}}
 +<caption>Paging in 32-bit mode with 4kB pages</caption>
 +</figure>
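
As an illustration of the 10/10/12-bit split described above, the following Python sketch extracts the three fields from a 32-bit linear address. The function name and the sample address are assumptions made for this example only.

```python
def split_linear_4k(linear: int) -> tuple:
    """Split a 32-bit linear address for paging with 4kB pages."""
    if not 0 <= linear <= 0xFFFFFFFF:
        raise ValueError("linear address must fit in 32 bits")
    directory_index = (linear >> 22) & 0x3FF  # bits 31..22: entry in the page directory
    table_index = (linear >> 12) & 0x3FF      # bits 21..12: entry in the page table
    page_offset = linear & 0xFFF              # bits 11..0: offset within the 4kB page
    return directory_index, table_index, page_offset


if __name__ == "__main__":
    # 00403025h selects directory entry 1, page table entry 3, offset 025h.
    print(split_linear_4k(0x00403025))
```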
 +
 +  * 4MB pages
 +For pages of 4MB in size, the page table level is omitted. The page directory entry holds a direct pointer to the page in memory. In this situation, the entry in the page directory is indexed with 10 bits, and the offset within the page is 22 bits long. It is shown in the Fig {{ref>paging324MB}}.
 +
 +<figure paging324MB>
 +{{ :en:multiasm:cs:paging_32_4MB.png?600 |Illustration of paging in 32-bit mode with 4MB pages}}
 +<caption>Paging in 32-bit mode with 4MB pages</caption>
 +</figure>
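
For 4MB pages the same kind of sketch needs only two fields. Again, the function name and the sample address are hypothetical and serve only to illustrate the 10/22-bit split.

```python
def split_linear_4m(linear: int) -> tuple:
    """Split a 32-bit linear address for paging with 4MB pages."""
    if not 0 <= linear <= 0xFFFFFFFF:
        raise ValueError("linear address must fit in 32 bits")
    directory_index = (linear >> 22) & 0x3FF  # bits 31..22: entry in the page directory
    page_offset = linear & 0x3FFFFF           # bits 21..0: offset within the 4MB page
    return directory_index, page_offset


if __name__ == "__main__":
    # 00C12345h selects directory entry 3 and offset 012345h within the page.
    print(split_linear_4m(0x00C12345))
```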
 +
 +In 64-bit mode, the tables are organised in a four- or five-level structure. In a 4-level structure, the highest level is a single page map level 4 table, followed by page directory pointer tables, page directory tables, and page tables. With 4kB pages, the linear address is divided into six parts: the unused upper 16 bits, four 9-bit table indexes, and a 12-bit offset within the page. Each table is indexed with 9 bits and can store up to 512 entries. It is shown in the Fig {{ref>paging644kB}}. The pages can be 4kB, 2MB, or 1GB in size. For 2MB pages, the page tables level is omitted, while for 1GB pages, the page directory level is additionally omitted.
 +
 +<figure paging644kB>
 +{{ :en:multiasm:cs:paging_64_4kB.png?600 |Illustration of paging in 64-bit mode with 4kB pages}}
 +<caption>Paging in 64-bit mode with 4kB pages</caption>
 +</figure>
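
The 4-level split with 4kB pages can be sketched in the same way. The function name is an assumption for this example; the field widths follow the text above (four 9-bit indexes and a 12-bit offset), and the upper 16 bits are simply ignored here.

```python
def split_linear_4level(linear: int) -> tuple:
    """Split a 64-bit linear address for 4-level paging with 4kB pages."""
    if not 0 <= linear <= 0xFFFFFFFFFFFFFFFF:
        raise ValueError("linear address must fit in 64 bits")
    pml4_index = (linear >> 39) & 0x1FF  # bits 47..39: page map level 4 table
    pdpt_index = (linear >> 30) & 0x1FF  # bits 38..30: page directory pointer table
    pd_index = (linear >> 21) & 0x1FF    # bits 29..21: page directory table
    pt_index = (linear >> 12) & 0x1FF    # bits 20..12: page table
    page_offset = linear & 0xFFF         # bits 11..0: offset within the 4kB page
    return pml4_index, pdpt_index, pd_index, pt_index, page_offset


if __name__ == "__main__":
    # Build an address from known indexes and split it again.
    address = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
    print(split_linear_4level(address))
```

For 2MB pages the last index and the offset would merge into a 21-bit offset, and for 1GB pages the last two indexes and the offset would merge into a 30-bit offset.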
 +
 +As currently built computers have significantly less physical memory than can be theoretically addressed with a 64-bit linear address, producers decided to limit the usable address space. That's why the 4-level paging mechanism recognises only 48 bits, leaving the upper bits unused. In 5-level paging, 57 bits are recognised. All the unused upper bits must be copies of the highest recognised bit (bit 47 in 4-level paging, bit 56 in 5-level paging). As the most significant bit of a number represents its sign, duplicating this bit is known as sign extension.
 +
 +Sign-extended addresses having 48 or 57 recognised bits are known as canonical addresses, while others are non-canonical. It is presented in Fig {{ref>canonical}}. In current machines, only canonical addresses are valid.
 +
 +<figure canonical>
 +{{ :en:multiasm:cs:canonical.png?600 |Illustration of canonical addresses in 64-bit mode}}
 +<caption>Canonical addresses in 64-bit mode</caption>
 +</figure>
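
A minimal sketch of the canonical address rule follows. The function names and the default of 48 recognised bits are assumptions chosen for this example; passing 57 corresponds to 5-level paging.

```python
def is_canonical(address: int, recognised_bits: int = 48) -> bool:
    """Check that all upper bits repeat the highest recognised bit."""
    if not 0 <= address <= 0xFFFFFFFFFFFFFFFF:
        raise ValueError("address must fit in 64 bits")
    # Take the highest recognised bit together with every bit above it.
    upper = address >> (recognised_bits - 1)
    all_ones = (1 << (64 - recognised_bits + 1)) - 1
    return upper == 0 or upper == all_ones


def sign_extend(address: int, recognised_bits: int = 48) -> int:
    """Copy the highest recognised bit into all upper bits of a 64-bit value."""
    low_mask = (1 << recognised_bits) - 1
    value = address & low_mask
    if value >> (recognised_bits - 1):
        # The highest recognised bit is 1, so fill the upper bits with ones.
        value |= 0xFFFFFFFFFFFFFFFF ^ low_mask
    return value


if __name__ == "__main__":
    print(is_canonical(0x00007FFFFFFFFFFF))        # top of the lower half
    print(is_canonical(0x0000800000000000))        # bit 47 set, upper bits clear
    print(hex(sign_extend(0x0000800000000000)))    # becomes the canonical form
```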
 ===== Addressing in x64 processors =====
 Because the segmentation wasn't used by operating systems software producers, AMD and Intel decided to abandon segmentation in 64-bit processors. For backwards compatibility, modern processors can run 32-bit mode with segmentation, but the newest versions of operating systems use 64-bit addressing, named long mode and referred to as x64. The only possible addressing is a flat memory model with segment registers and descriptors set to zero address as the beginning of the linear memory space.
Line 50: Line 85:
 In x64 bit mode, some instructions can use segment registers FS and GS as base registers. Operating systems' kernels also use them.
 </note> </note>
-The 64-bit mode theoretically allows the processor to address a vast memory of 16 exabytes in size. It is expected that such a big memory will not be installed in currently built computers, so the processors limit the available address space.
+The 64-bit mode theoretically allows the processor to address a vast memory of 16 exabytes in size. It is expected that such a big memory will not be installed in currently built computers, so the processors limit the available address space, not only at the paging level but also physically, having a limited number of address bus lines.
en/multiasm/papc/chapter_6_2.1745561683.txt.gz · Last modified: by ktokarz
CC Attribution-Share Alike 4.0 International