This shows you the differences between two versions of the page.
| en:multiasm:papc:chapter_6_16 [2025/11/22 14:17] – [Cache support instructions] ktokarz | en:multiasm:papc:chapter_6_16 [Unknown date] (current) – external edit (Unknown date) 127.0.0.1 | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | ====== Optimisation | + | ====== Optimisation ====== |
| Optimisation strongly depends on the microarchitecture of the processor. Some optimisation recommendations change with new versions of processors. Manufacturers usually publish the most up-to-date recommendations. The latest release of the Intel documentation is " | Optimisation strongly depends on the microarchitecture of the processor. Some optimisation recommendations change with new versions of processors. Manufacturers usually publish the most up-to-date recommendations. The latest release of the Intel documentation is " | ||
| A selection of specific optimisation recommendations is described in this section. | A selection of specific optimisation recommendations is described in this section. | ||
| Line 20: | Line 20: | ||
| - | ===== Cache support instructions | + | ===== Cache utilisation |
| In modern microarchitectures, | In modern microarchitectures, | ||
| The cache works on two main principles: | The cache works on two main principles: | ||
| Line 32: | Line 32: | ||
| * While processing big multidimensional data arrays, keep in mind their placement in memory (row-wise or column-wise), | * While processing big multidimensional data arrays, keep in mind their placement in memory (row-wise or column-wise), | ||
| * Object-oriented programming helps to utilise the cache because the members of a class are stored together in memory. | * Object-oriented programming helps to utilise the cache because the members of a class are stored together in memory. | ||
| + | ===== Cache temporal locality ===== | ||
| + | This feature helps improve performance in situations where the program uses the same variables repeatedly, e.g. in a loop. | ||
| + | In a situation where the data processed exceeds half the size of a level 1 cache, it is recommended to use the non-temporal data move instructions **movntq** and **movntdq** to store data from registers to memory. These instructions are hints to the processor to omit the cache if possible. It doesn' | ||
| + | ===== Cache support instructions ===== | ||
| There are also instructions which allow the programmer to support the processor with cache utilisation. | There are also instructions which allow the programmer to support the processor with cache utilisation. | ||
| * **movntq** stores the contents of an MMX register to memory, bypassing the cache | * **movntq** stores the contents of an MMX register to memory, bypassing the cache | ||
| Line 52: | Line 56: | ||
| - | ===== Cache temporal locality ===== | + | |
| - | This feature helps improve performance in situations where the program uses the same variables repeatedly, e.g. in a loop. | + | |
| - | In a situation where the data processed exceeds half the size of a level 1 cache, it is recommended to use the non-temporal data move instructions **movntq** and **movntdq** to store data from registers to memory. These instructions are hints to the processor to omit the cache if possible. It doesn' | + | |
| ===== Further reading ===== | ===== Further reading ===== | ||
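
As an illustration of the non-temporal stores described under "Cache temporal locality", the following is a minimal sketch in C rather than assembly. It assumes an x86 compiler that provides the SSE2 intrinsics in `emmintrin.h`, where `_mm_stream_si128` is the intrinsic that corresponds to **movntdq**. The function name `stream_copy` and its preconditions (a 16-byte aligned destination and a length that is a multiple of 16) are hypothetical choices made for this example, not something the page prescribes. On targets without SSE2 the sketch falls back to `memcpy`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* _mm_loadu_si128, _mm_stream_si128, _mm_sfence */
#endif

/* Hypothetical helper: copy `bytes` bytes from src to dst.
 * Assumptions for this sketch: dst is 16-byte aligned and
 * bytes is a multiple of 16. */
void stream_copy(void *dst, const void *src, size_t bytes)
{
    assert(((uintptr_t)dst % 16) == 0);
    assert(bytes % 16 == 0);
#if defined(__SSE2__)
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;
    for (size_t i = 0; i < bytes / 16; ++i) {
        __m128i v = _mm_loadu_si128(s + i); /* ordinary load through the cache */
        _mm_stream_si128(d + i, v);         /* non-temporal store (movntdq) */
    }
    _mm_sfence(); /* order the weakly ordered stores before later stores */
#else
    memcpy(dst, src, bytes); /* portable fallback without the cache hint */
#endif
}
```

A call such as `stream_copy(out, in, n)` only makes sense when `out` will not be read again soon. As the text notes, the instruction is a hint, so whether it actually helps should be checked by measurement on the target processor.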