| Previous revision | Current revision |
| en:multiasm:papc:chapter_6_11 [2026/02/27 02:32] – jtokarz | en:multiasm:papc:chapter_6_11 [2026/02/27 02:40] (current) – jtokarz |
|---|---|
| The main idea of vector data processing is shown in figure {{ref>mmxprocessing}}. It shows an example of an operation performed on packed word vector data. | The main idea of vector data processing is shown in figure {{ref>mmxprocessing}}. It shows an example of an operation performed on packed word vector data. |
| <figure mmxprocessing> | <figure mmxprocessing> |
| {{ :en:multiasm:cs:mmxprocessing.png?600 |Illustration of the idea of vector data processing}} | {{ :en:multiasm:cs:mmxprocessing.png?550 |Illustration of the idea of vector data processing}} |
| <caption>The idea of vector data processing</caption> | <caption>The idea of vector data processing</caption> |
| </figure> | </figure> |
| |
| <figure mmxpaddsw> | <figure mmxpaddsw> |
| {{ :en:multiasm:cs:mmxpaddsw.png?500 |Illustration of packed word addition with signed saturation}} | {{ :en:multiasm:cs:mmxpaddsw.png?450 |Illustration of packed word addition with signed saturation}} |
| <caption>The illustration of packed word addition with signed saturation</caption> | <caption>The illustration of packed word addition with signed saturation</caption> |
| </figure> | </figure> |
| |
| <figure mmxpaddusw> | <figure mmxpaddusw> |
| {{ :en:multiasm:cs:mmxpaddusw.png?500 |Illustration of packed word addition with unsigned saturation}} | {{ :en:multiasm:cs:mmxpaddusw.png?450 |Illustration of packed word addition with unsigned saturation}} |
| <caption>The illustration of packed word addition with unsigned saturation</caption> | <caption>The illustration of packed word addition with unsigned saturation</caption> |
| </figure> | </figure> |
| |
| <figure mmxmultiplyandunpck> | <figure mmxmultiplyandunpck> |
| {{ :en:multiasm:cs:mmxmultiplyandunpck.png?600 |Illustration of packed word multiplication and unpacking results to doublewords}} | {{ :en:multiasm:cs:mmxmultiplyandunpck.png?550 |Illustration of packed word multiplication and unpacking results to doublewords}} |
| <caption>The illustration of packed word multiplication and unpacking results to doublewords</caption> | <caption>The illustration of packed word multiplication and unpacking results to doublewords</caption> |
| </figure> | </figure> |
| |
| <figure mmxpmaddw> | <figure mmxpmaddw> |
| {{ :en:multiasm:cs:mmxpmaddw.png?600 |Illustration of packed word multiplication and sum to doublewords}} | {{ :en:multiasm:cs:mmxpmaddw.png?550 |Illustration of packed word multiplication and sum to doublewords}} |
| <caption>The illustration of packed word multiplication and sum to doublewords</caption> | <caption>The illustration of packed word multiplication and sum to doublewords</caption> |
| </figure> | </figure> |
| An example of an instruction that compares two vectors of words for equality is shown in figure {{ref>mmxcompare}}. | An example of an instruction that compares two vectors of words for equality is shown in figure {{ref>mmxcompare}}. |
| <figure mmxcompare> | <figure mmxcompare> |
| {{ :en:multiasm:cs:mmxcompare.png?500 |Illustration of vector data comparison}} | {{ :en:multiasm:cs:mmxcompare.png?450 |Illustration of vector data comparison}} |
| <caption>Vector data comparison</caption> | <caption>Vector data comparison</caption> |
| </figure> | </figure> |
| |
| <figure mmxpunpckhbw> | <figure mmxpunpckhbw> |
| {{ :en:multiasm:cs:mmxpunpkhbw.png?600 |Illustration of unpacking high-order bytes to words}} | {{ :en:multiasm:cs:mmxpunpkhbw.png?550 |Illustration of unpacking high-order bytes to words}} |
| <caption>The illustration of unpacking high-order bytes to words</caption> | <caption>The illustration of unpacking high-order bytes to words</caption> |
| </figure> | </figure> |
| |
| <figure mmxpunpcklbw> | <figure mmxpunpcklbw> |
| {{ :en:multiasm:cs:mmxpunpklbw.png?600 |Illustration of unpacking low-order bytes to words}} | {{ :en:multiasm:cs:mmxpunpklbw.png?550 |Illustration of unpacking low-order bytes to words}} |
| <caption>The illustration of unpacking low-order bytes to words</caption> | <caption>The illustration of unpacking low-order bytes to words</caption> |
| </figure> | </figure> |
| |
| <figure mmxpack> | <figure mmxpack> |
| {{ :en:multiasm:cs:mmxpacksswd.png?600 |Illustration of packing doublewords to words}} | {{ :en:multiasm:cs:mmxpacksswd.png?550 |Illustration of packing doublewords to words}} |
| <caption>The illustration of packing doublewords to words</caption> | <caption>The illustration of packing doublewords to words</caption> |
| </figure> | </figure> |
| The ideas of vector and scalar operations are shown in figure {{ref>sse1vector}} and figure {{ref>sse1scalar}}, respectively. | The ideas of vector and scalar operations are shown in figure {{ref>sse1vector}} and figure {{ref>sse1scalar}}, respectively. |
| <figure sse1vector> | <figure sse1vector> |
| {{ :en:multiasm:cs:sse1vector.png?600 |Illustration of the idea of SSE vector data processing}} | {{ :en:multiasm:cs:sse1vector.png?550 |Illustration of the idea of SSE vector data processing}} |
| <caption>The idea of vector data processing in SSE</caption> | <caption>The idea of vector data processing in SSE</caption> |
| </figure> | </figure> |
| <figure sse1scalar> | <figure sse1scalar> |
| {{ :en:multiasm:cs:sse1scalar.png?600 |Illustration of the idea of SSE scalar data processing}} | {{ :en:multiasm:cs:sse1scalar.png?550 |Illustration of the idea of SSE scalar data processing}} |
| <caption>The idea of scalar data processing in SSE</caption> | <caption>The idea of scalar data processing in SSE</caption> |
| </figure> | </figure> |
| |
| <figure sseunpack> | <figure sseunpack> |
| {{ :en:multiasm:cs:sseunpack.png?650 |Illustration of SSE unpacking single-precision floating-point values}} | {{ :en:multiasm:cs:sseunpack.png?550 |Illustration of SSE unpacking single-precision floating-point values}} |
| <caption>The illustration of SSE unpacking single-precision floating-point values</caption> | <caption>The illustration of SSE unpacking single-precision floating-point values</caption> |
| </figure> | </figure> |
| |
| <figure sseshuffle> | <figure sseshuffle> |
| {{ :en:multiasm:cs:sseshuffle.png?600 |Illustration of SSE shuffle single-precision floating-point values}} | {{ :en:multiasm:cs:sseshuffle.png?550 |Illustration of SSE shuffle single-precision floating-point values}} |
| <caption>The illustration of SSE shuffle single-precision floating-point values</caption> | <caption>The illustration of SSE shuffle single-precision floating-point values</caption> |
| </figure> | </figure> |
| |
| <figure sse2conversions> | <figure sse2conversions> |
| {{ :en:multiasm:cs:sse2conversions.png?600 |Illustration of a variety of data type conversion instructions}} | {{ :en:multiasm:cs:sse2conversions.png?550 |Illustration of a variety of data type conversion instructions}} |
| <caption>The illustration of a variety of data type conversion instructions</caption> | <caption>The illustration of a variety of data type conversion instructions</caption> |
| </figure> | </figure> |
| All horizontal instructions operate in a similar manner. The lower (bottom) part of the resulting vector is the result of the operation on the bottom and top elements of the first (destination) operand; the higher (top) part of the resulting vector is the result of the operation on the bottom and top elements of the second (source) operand. The principle of horizontal operations is easiest to present with a picture. Because the order of the arguments matters in subtraction, the **hsubpd** instruction is shown in figure {{ref>sse3hsubpd}}. | All horizontal instructions operate in a similar manner. The lower (bottom) part of the resulting vector is the result of the operation on the bottom and top elements of the first (destination) operand; the higher (top) part of the resulting vector is the result of the operation on the bottom and top elements of the second (source) operand. The principle of horizontal operations is easiest to present with a picture. Because the order of the arguments matters in subtraction, the **hsubpd** instruction is shown in figure {{ref>sse3hsubpd}}. |
| <figure sse3hsubpd> | <figure sse3hsubpd> |
| {{ :en:multiasm:cs:sse3hsubpd.png?600 |Illustration of a horizontal subtraction instruction}} | {{ :en:multiasm:cs:sse3hsubpd.png?550 |Illustration of a horizontal subtraction instruction}} |
| <caption>The illustration of a horizontal subtraction instruction</caption> | <caption>The illustration of a horizontal subtraction instruction</caption> |
| </figure> | </figure> |
| When the source vectors contain more than two elements, as in the **hsubps** instruction, it is also important to know the order of the elements in the resulting vector. This order is shown in figure {{ref>sse3hsubps}}. | When the source vectors contain more than two elements, as in the **hsubps** instruction, it is also important to know the order of the elements in the resulting vector. This order is shown in figure {{ref>sse3hsubps}}. |
| <figure sse3hsubps> | <figure sse3hsubps> |
| {{ :en:multiasm:cs:sse3hsubps.png?600 |Illustration of a horizontal single precision subtraction instruction}} | {{ :en:multiasm:cs:sse3hsubps.png?550 |Illustration of a horizontal single precision subtraction instruction}} |
| <caption>The illustration of a horizontal single precision subtraction instruction</caption> | <caption>The illustration of a horizontal single precision subtraction instruction</caption> |
| </figure> | </figure> |
| |
| <figure sse3palignr> | <figure sse3palignr> |
| {{ :en:multiasm:cs:sse3palignr.png?650 |Illustration of an aligned byte combine instruction}} | {{ :en:multiasm:cs:sse3palignr.png?600 |Illustration of an aligned byte combine instruction}} |
| <caption>The illustration of an aligned byte combine instruction</caption> | <caption>The illustration of an aligned byte combine instruction</caption> |
| </figure> | </figure> |
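
The element ordering of the horizontal subtraction instructions described above can be sketched with a small model. This is only an illustration, not processor code: the function name `hsub` is hypothetical, vectors are modelled as Python lists with index 0 as the lowest (bottom) element, and ordinary Python floats are used, so single-precision rounding of **hsubps** is not reproduced.

```python
def hsub(dest, src):
    """Model the lane order of a horizontal subtraction (hsubpd / hsubps).

    dest and src are lists of equal, even length; index 0 is the lowest
    element. For each adjacent pair, the higher element is subtracted
    from the lower one. Results from dest fill the lower half of the
    output, results from src fill the higher half.
    """
    if len(dest) != len(src) or len(dest) % 2 != 0:
        raise ValueError("operands must have the same even length")
    lower_half = [dest[i] - dest[i + 1] for i in range(0, len(dest), 2)]
    higher_half = [src[i] - src[i + 1] for i in range(0, len(src), 2)]
    return lower_half + higher_half


# Two double-precision elements per operand, as in hsubpd.
result_pd = hsub([5.0, 2.0], [10.0, 4.0])

# Four single-precision elements per operand, as in hsubps.
result_ps = hsub([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0])

print(result_pd)
print(result_ps)
```

In this model the two-element case yields one difference per operand, while the four-element case yields two differences from the destination operand followed by two from the source operand, which is the ordering the **hsubps** figure illustrates.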