------------------------------------------------------------------------------
--                                                                          --
--                         GNAT COMPILER COMPONENTS                         --
--                                                                          --
--                         G N A T . A L T I V E C                          --
--                                                                          --
--                                 S p e c                                  --
--                                                                          --
--          Copyright (C) 2004-2018, Free Software Foundation, Inc.         --
--                                                                          --
-- GNAT is free software;  you can  redistribute it  and/or modify it under --
-- terms of the  GNU General Public License as published  by the Free Soft- --
-- ware  Foundation;  either version 3,  or (at your option) any later ver- --
-- sion.  GNAT is distributed in the hope that it will be useful, but WITH- --
-- OUT ANY WARRANTY;  without even the  implied warranty of MERCHANTABILITY --
-- or FITNESS FOR A PARTICULAR PURPOSE.                                     --
--                                                                          --
-- As a special exception under Section 7 of GPL version 3, you are granted --
-- additional permissions described in the GCC Runtime Library Exception,   --
-- version 3.1, as published by the Free Software Foundation.               --
--                                                                          --
-- You should have received a copy of the GNU General Public License and    --
-- a copy of the GCC Runtime Library Exception along with this program;     --
-- see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see    --
-- <http://www.gnu.org/licenses/>.                                          --
--                                                                          --
-- GNAT was originally developed  by the GNAT team at  New York University. --
-- Extensive contributions were provided by Ada Core Technologies Inc.      --
--                                                                          --
------------------------------------------------------------------------------

-------------------------
-- General description --
-------------------------

--  This is the root of a package hierarchy offering an Ada binding to the
--  PowerPC AltiVec extensions, a set of 128-bit vector types together with a
--  set of subprograms operating on them. Relevant documents are:

--    o AltiVec Technology, Programming Interface Manual (1999-06)
--      to which we will refer as [PIM], describes the data types, the
--      functional interface and the ABI conventions.

--    o AltiVec Technology, Programming Environments Manual (2002-02)
--      to which we will refer as [PEM], describes the hardware architecture
--      and instruction set.

--  These documents, as well as a number of others of general interest on the
--  AltiVec technology, are available from the Motorola/AltiVec Web site at:

--    http://www.freescale.com/altivec

--  The binding interface is structured to allow alternate implementations:
--  for real AltiVec capable targets, and for other targets. In the latter
--  case, everything is emulated in software. The two versions are referred
--  to as:

--    o The Hard binding for AltiVec capable targets (with the appropriate
--      hardware support and corresponding instruction set)

--    o The Soft binding for other targets (with the low level primitives
--      emulated in software).

--  In addition, interfaces that are not strictly part of the base AltiVec API
--  are provided, such as vector conversions to and from array
--  representations, which are of interest for client applications (e.g. for
--  vector initialization purposes).

--  Only the soft binding is available today.

-----------------------------------------
-- General package architecture survey --
-----------------------------------------

--  The various vector representations are all "containers" of elementary
--  values, the possible types of which are declared in this root package to
--  be generally accessible.

--  From the user standpoint, the binding materializes as a consistent
--  hierarchy of units:

--                            GNAT.Altivec
--                          (component types)
--                                  |
--          o----------------o----------------o-------------o
--          |                |                |             |
--    Vector_Types   Vector_Operations   Vector_Views  Conversions

--  Users can manipulate vectors through two families of types: Vector
--  types and View types.

--  Vector types are available through the Vector_Types and Vector_Operations
--  packages, which implement the core binding to the AltiVec API, as
--  described in [PIM-2.1 data types] and [PIM-4 AltiVec operations and
--  predicates].

--  The layout of Vector objects is dependent on the target machine
--  endianness, and View types were devised to offer a higher level user
--  interface. With Views, a vector of 4 uints (1, 2, 3, 4) is always declared
--  with a VUI_View := (Values => (1, 2, 3, 4)), element 1 first, using the
--  natural notation to denote the element values, and indexed notation is
--  available to access individual elements.

--  View types do not represent Altivec vectors per se, in the sense that the
--  operations of Vector_Operations are not available for them. They are
--  intended to allow Vector initializations as well as access to the Vector
--  component values.

--  The GNAT.Altivec.Conversions package is provided to convert a View to the
--  corresponding Vector and vice-versa.

---------------------------
-- Underlying principles --
---------------------------

--  Internally, the binding relies on an abstraction of the Altivec API, a
--  rich set of functions around a core of low level primitives mapping to
--  AltiVec instructions. See for instance "vec_add" in [PIM-4.4 Generic and
--  Specific AltiVec operations], with no fewer than six result/arguments
--  combinations of byte vector types that map to "vaddubm".

--  The "soft" version is a software emulation of the low level primitives.

--  The "hard" version would map to real AltiVec instructions via GCC builtins
--  and inlining.

--  See the "Design Notes" section below for additional details on the
--  internals.

-------------------
-- Example usage --
-------------------

--  Here is a sample program declaring and initializing two vectors, 'add'ing
--  them and displaying the result components:

--  with GNAT.Altivec.Vector_Types;      use GNAT.Altivec.Vector_Types;
--  with GNAT.Altivec.Vector_Operations; use GNAT.Altivec.Vector_Operations;
--  with GNAT.Altivec.Vector_Views;      use GNAT.Altivec.Vector_Views;
--  with GNAT.Altivec.Conversions;       use GNAT.Altivec.Conversions;

--  use GNAT.Altivec;

--  with Ada.Text_IO; use Ada.Text_IO;

--  procedure Sample is
--     Va : Vector_Unsigned_Int := To_Vector ((Values => (1, 2, 3, 4)));
--     Vb : Vector_Unsigned_Int := To_Vector ((Values => (1, 2, 3, 4)));

--     Vs      : Vector_Unsigned_Int;
--     Vs_View : VUI_View;
--  begin
--     Vs      := Vec_Add (Va, Vb);
--     Vs_View := To_View (Vs);

--     for I in Vs_View.Values'Range loop
--        Put_Line (Unsigned_Int'Image (Vs_View.Values (I)));
--     end loop;
--  end Sample;

--  $ gnatmake sample.adb
--  [...]
--  $ ./sample
--   2
--   4
--   6
--   8

------------------------------------------------------------------------------

with System;

package GNAT.Altivec is

   --  Definitions of constants and vector/array component types common to all
   --  the versions of the binding.

   --  All the vector types are 128 bits wide

   VECTOR_BIT : constant := 128;

   -------------------------------------------
   -- [PIM-2.3.1 Alignment of vector types] --
   -------------------------------------------

   --  "A defined data item of any vector data type in memory is always
   --  aligned on a 16-byte boundary. A pointer to any vector data type always
   --  points to a 16-byte boundary. The compiler is responsible for aligning
   --  vector data types on 16-byte boundaries."

   VECTOR_ALIGNMENT : constant := Natural'Min (16, Standard'Maximum_Alignment);
   --  This value is used to set the alignment of vector datatypes in both the
   --  hard and the soft binding implementations.
   --
   --  We want this value to never be greater than 16, because none of the
   --  binding implementations requires larger alignments and such a value
   --  would cause useless space to be allocated/wasted for vector objects.
   --  Furthermore, the alignment of 16 matches the hard binding, leading to
   --  a more faithful emulation.
   --
   --  It needs to be exactly 16 for the hard binding, and the initializing
   --  expression is just right for this purpose since Maximum_Alignment is
   --  expected to be 16 for the real Altivec ABI.
   --
   --  The soft binding doesn't rely on strict 16-byte alignment, and we want
   --  the value to be no greater than Standard'Maximum_Alignment in this case
   --  to ensure it is supported on every possible target.

   -------------------------------------------------------
   -- [PIM-2.1] Data Types - Interpretation of contents --
   -------------------------------------------------------

   ---------------------
   -- char components --
   ---------------------

   CHAR_BIT  : constant := 8;
   SCHAR_MIN : constant := -2 ** (CHAR_BIT - 1);
   SCHAR_MAX : constant := 2 ** (CHAR_BIT - 1) - 1;
   UCHAR_MAX : constant := 2 ** CHAR_BIT - 1;

   type unsigned_char is mod UCHAR_MAX + 1;
   for unsigned_char'Size use CHAR_BIT;

   type signed_char is range SCHAR_MIN .. SCHAR_MAX;
   for signed_char'Size use CHAR_BIT;

   subtype bool_char is unsigned_char;
   --  ??? There is a difference here between what the Altivec Technology
   --  Programming Interface Manual says and what GCC says. In the manual,
   --  vector_bool_char is a vector_unsigned_char, while in altivec.h it
   --  is a vector_signed_char.

   bool_char_True  : constant bool_char := bool_char'Last;
   bool_char_False : constant bool_char := 0;

   ----------------------
   -- short components --
   ----------------------

   SHORT_BIT  : constant := 16;
   SSHORT_MIN : constant := -2 ** (SHORT_BIT - 1);
   SSHORT_MAX : constant := 2 ** (SHORT_BIT - 1) - 1;
   USHORT_MAX : constant := 2 ** SHORT_BIT - 1;

   type unsigned_short is mod USHORT_MAX + 1;
   for unsigned_short'Size use SHORT_BIT;

   subtype unsigned_short_int is unsigned_short;

   type signed_short is range SSHORT_MIN .. SSHORT_MAX;
   for signed_short'Size use SHORT_BIT;

   subtype signed_short_int is signed_short;

   subtype bool_short is unsigned_short;
   --  ??? See bool_char

   bool_short_True  : constant bool_short := bool_short'Last;
   bool_short_False : constant bool_short := 0;

   subtype bool_short_int is bool_short;

   --------------------
   -- int components --
   --------------------

   INT_BIT  : constant := 32;
   SINT_MIN : constant := -2 ** (INT_BIT - 1);
   SINT_MAX : constant := 2 ** (INT_BIT - 1) - 1;
   UINT_MAX : constant := 2 ** INT_BIT - 1;

   type unsigned_int is mod UINT_MAX + 1;
   for unsigned_int'Size use INT_BIT;

   type signed_int is range SINT_MIN .. SINT_MAX;
   for signed_int'Size use INT_BIT;

   subtype bool_int is unsigned_int;
   --  ??? See bool_char

   bool_int_True  : constant bool_int := bool_int'Last;
   bool_int_False : constant bool_int := 0;

   ----------------------
   -- float components --
   ----------------------

   FLOAT_BIT   : constant := 32;
   FLOAT_DIGIT : constant := 6;
   FLOAT_MIN   : constant := -16#0.FFFF_FF#E+32;
   FLOAT_MAX   : constant := 16#0.FFFF_FF#E+32;

   type C_float is digits FLOAT_DIGIT range FLOAT_MIN .. FLOAT_MAX;
   for C_float'Size use FLOAT_BIT;
   --  Altivec operations always use the standard native floating-point
   --  support of the target.
   --  Note that this means that there may be minor differences in results
   --  between targets when the floating-point implementations are slightly
   --  different, as would happen with normal non-Altivec floating-point
   --  operations. In particular the Altivec simulations may yield slightly
   --  different results from those obtained on a true hardware Altivec
   --  target if the floating-point implementation is not 100% compatible.

   ----------------------
   -- pixel components --
   ----------------------

   subtype pixel is unsigned_short;

   -----------------------------------------------------------
   -- Subtypes for variants found in the GCC implementation --
   -----------------------------------------------------------

   subtype c_int is signed_int;
   subtype c_short is c_int;

   LONG_BIT : constant := 32;
   --  Some of the GCC builtins are built with "long" arguments and
   --  expect SImode to come in.

   SLONG_MIN : constant := -2 ** (LONG_BIT - 1);
   SLONG_MAX : constant := 2 ** (LONG_BIT - 1) - 1;
   ULONG_MAX : constant := 2 ** LONG_BIT - 1;

   type signed_long is range SLONG_MIN .. SLONG_MAX;
   type unsigned_long is mod ULONG_MAX + 1;

   subtype c_long is signed_long;

   subtype c_ptr is System.Address;

   ---------------------------------------------------------
   -- Access types, for the sake of some argument passing --
   ---------------------------------------------------------

   type signed_char_ptr    is access all signed_char;
   type unsigned_char_ptr  is access all unsigned_char;

   type short_ptr          is access all c_short;
   type signed_short_ptr   is access all signed_short;
   type unsigned_short_ptr is access all unsigned_short;

   type int_ptr            is access all c_int;
   type signed_int_ptr     is access all signed_int;
   type unsigned_int_ptr   is access all unsigned_int;

   type long_ptr           is access all c_long;
   type signed_long_ptr    is access all signed_long;
   type unsigned_long_ptr  is access all unsigned_long;

   type float_ptr          is access all Float;

   --

   type const_signed_char_ptr    is access constant signed_char;
   type const_unsigned_char_ptr  is access constant unsigned_char;

   type const_short_ptr          is access constant c_short;
   type const_signed_short_ptr   is access constant signed_short;
   type const_unsigned_short_ptr is access constant unsigned_short;

   type const_int_ptr            is access constant c_int;
   type const_signed_int_ptr     is access constant signed_int;
   type const_unsigned_int_ptr   is access constant unsigned_int;

   type const_long_ptr           is access constant c_long;
   type const_signed_long_ptr    is access constant signed_long;
   type const_unsigned_long_ptr  is access constant unsigned_long;

   type const_float_ptr          is access constant Float;

   --  Accesses to const volatile arguments need specialized types

   type volatile_float is new Float;
   pragma Volatile (volatile_float);

   type volatile_signed_char is new signed_char;
   pragma Volatile (volatile_signed_char);

   type volatile_unsigned_char is new unsigned_char;
   pragma Volatile (volatile_unsigned_char);

   type volatile_signed_short is new signed_short;
   pragma Volatile (volatile_signed_short);

   type volatile_unsigned_short is new unsigned_short;
   pragma Volatile (volatile_unsigned_short);

   type volatile_signed_int is new signed_int;
   pragma Volatile (volatile_signed_int);

   type volatile_unsigned_int is new unsigned_int;
   pragma Volatile (volatile_unsigned_int);

   type volatile_signed_long is new signed_long;
   pragma Volatile (volatile_signed_long);

   type volatile_unsigned_long is new unsigned_long;
   pragma Volatile (volatile_unsigned_long);

   type constv_char_ptr           is access constant volatile_signed_char;
   type constv_signed_char_ptr    is access constant volatile_signed_char;
   type constv_unsigned_char_ptr  is access constant volatile_unsigned_char;

   type constv_short_ptr          is access constant volatile_signed_short;
   type constv_signed_short_ptr   is access constant volatile_signed_short;
   type constv_unsigned_short_ptr is access constant volatile_unsigned_short;

   type constv_int_ptr            is access constant volatile_signed_int;
   type constv_signed_int_ptr     is access constant volatile_signed_int;
   type constv_unsigned_int_ptr   is access constant volatile_unsigned_int;

   type constv_long_ptr           is access constant volatile_signed_long;
   type constv_signed_long_ptr    is access constant volatile_signed_long;
   type constv_unsigned_long_ptr  is access constant volatile_unsigned_long;

   type constv_float_ptr          is access constant volatile_float;

private

   -----------------------
   -- Various constants --
   -----------------------

   CR6_EQ     : constant := 0;
   CR6_EQ_REV : constant := 1;
   CR6_LT     : constant := 2;
   CR6_LT_REV : constant := 3;

end GNAT.Altivec;

------------------
-- Design Notes --
------------------

------------------------
-- General principles --
------------------------

--  The internal organization has been devised from a number of driving ideas:

--  o From the client's standpoint, the two versions of the binding should be
--    as easily exchangeable as possible.

--  o From the maintenance standpoint, we want to avoid as much code
--    duplication as possible.

--  o From both standpoints above, we want to maintain a clear interface
--    separation between the base bindings to the Motorola API and the
--    additional facilities.

--  The identification of the low level interface is directly inspired by
--  the base API organization, basically consisting of a rich set of functions
--  around a core of low level primitives mapping to AltiVec instructions.

--  See for instance "vec_add" in [PIM-4.4 Generic and Specific AltiVec
--  operations]: no fewer than six result/arguments combinations of byte
--  vector types map to "vaddubm".

--  The "hard" version of the low level primitives maps to real AltiVec
--  instructions via the corresponding GCC builtins. The "soft" version is
--  a software emulation of those.

---------------------------------------
-- The Low_Level_Vectors abstraction --
---------------------------------------

--  The AltiVec C interface spirit is to map a large set of C functions down
--  to a much smaller set of AltiVec instructions, most of them operating on a
--  set of vector data types in a transparent manner. See for instance the
--  case of vec_add, which maps six combinations of result/argument types to
--  vaddubm for signed/unsigned/bool variants of 'char' components.

--  The GCC implementation of this idiom for C/C++ is to set up builtins
--  corresponding to the instructions and to expose the C user functions as
--  wrappers around those builtins with no-op type conversions as required.
--  Typically, for the vec_add case mentioned above, we have (altivec.h):
--
--    inline __vector signed char
--    vec_add (__vector signed char a1, __vector signed char a2)
--    {
--      return (__vector signed char)
--        __builtin_altivec_vaddubm ((__vector signed char) a1,
--                                   (__vector signed char) a2);
--    }

--    inline __vector unsigned char
--    vec_add (__vector __bool char a1, __vector unsigned char a2)
--    {
--      return (__vector unsigned char)
--        __builtin_altivec_vaddubm ((__vector signed char) a1,
--                                   (__vector signed char) a2);
--    }

--  The central idea for the Ada bindings is to leverage the existing GCC
--  architecture, with the introduction of a Low_Level_Vectors abstraction.
--  This abstraction acts as a representative of the vector-types and builtins
--  compiler interface for either the Hard or the Soft case.

--  For the Hard binding, Low_Level_Vectors exposes data types with a GCC
--  internal translation identical to the "vector ..." C types, and a set of
--  subprograms mapping straight to the internal GCC builtins.

--  For the Soft binding, Low_Level_Vectors exposes the same set of types
--  and subprograms, with bodies simulating the instructions' behavior.
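
--  As an illustration of what a body simulating an instruction can look
--  like, here is a minimal, self-contained sketch of a software emulation of
--  "vaddubm" (Vector Add Unsigned Byte Modulo). The names U8, U8_Array,
--  Vaddubm and Soft_Vaddubm_Sketch are hypothetical and are not part of the
--  binding; the actual Soft implementation may be organized differently.

--    with Ada.Text_IO; use Ada.Text_IO;

--    procedure Soft_Vaddubm_Sketch is

--       type U8 is mod 2 ** 8;
--       type U8_Array is array (0 .. 15) of U8;
--       --  16 byte components, i.e. 128 bits overall

--       function Vaddubm (A, B : U8_Array) return U8_Array;
--       --  Component-wise addition, modulo 256

--       function Vaddubm (A, B : U8_Array) return U8_Array is
--          R : U8_Array;
--       begin
--          for I in R'Range loop
--             R (I) := A (I) + B (I);
--             --  U8 is a modular type, so the sum wraps around
--          end loop;
--          return R;
--       end Vaddubm;

--       A : constant U8_Array := (others => 200);
--       B : constant U8_Array := (others => 100);
--       R : constant U8_Array := Vaddubm (A, B);

--    begin
--       Put_Line (U8'Image (R (0)));
--       --  200 + 100 = 300, which wraps to 44 modulo 256
--    end Soft_Vaddubm_Sketch;

--  Using a modular component type makes the modulo semantics of the
--  instruction come for free from the Ada arithmetic rules.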

--  Vector_Types/Operations "simply" bind the user types and operations to
--  some Low_Level_Vectors implementation, selected in accordance with the
--  target.

--  To achieve a complete Hard/Soft independence in the Vector_Types and
--  Vector_Operations implementations, both versions of the low level support
--  are expected to expose a number of facilities:

--  o Private data type declarations for base vector representations embedded
--    in the user visible vector types, that is:

--      LL_VBC, LL_VUC and LL_VSC
--        for vector_bool_char, vector_unsigned_char and vector_signed_char

--      LL_VBS, LL_VUS and LL_VSS
--        for vector_bool_short, vector_unsigned_short and vector_signed_short

--      LL_VBI, LL_VUI and LL_VSI
--        for vector_bool_int, vector_unsigned_int and vector_signed_int

--    as well as:

--      LL_VP for vector_pixel and LL_VF for vector_float

--  o Primitive operations corresponding to the AltiVec hardware instruction
--    names, like "vaddubm". The whole set is not described here. The actual
--    sets are inspired by the GCC builtins which are invoked from GCC's
--    "altivec.h".

--  o An LL_Altivec convention identifier, specifying the calling convention
--    to be used to access the aforementioned primitive operations.

--  Besides:

--  o Unchecked_Conversion instances are expected to be allowed between any
--    pair of exposed data types, and are expected to have no effect on the
--    value bit patterns.

-------------------------
-- Vector views layout --
-------------------------

--  Vector Views combine intuitive user level ordering for both elements
--  within a vector and bytes within each element. They basically map to an
--  array representation where array(i) always represents element (i), in the
--  natural target representation.
--  This way, a user vector (1, 2, 3, 4) is represented as:

--                                                       Increasing Addresses
--  ------------------------------------------------------------------------>

--  | 0x0 0x0 0x0 0x1 | 0x0 0x0 0x0 0x2 | 0x0 0x0 0x0 0x3 | 0x0 0x0 0x0 0x4 |
--  |    V (0), BE    |    V (1), BE    |    V (2), BE    |    V (3), BE    |

--  on a big endian target, and as:

--  | 0x1 0x0 0x0 0x0 | 0x2 0x0 0x0 0x0 | 0x3 0x0 0x0 0x0 | 0x4 0x0 0x0 0x0 |
--  |    V (0), LE    |    V (1), LE    |    V (2), LE    |    V (3), LE    |

--  on a little endian target.

-------------------------
-- Vector types layout --
-------------------------

--  In the case of the hard binding, the layout of the vector type in
--  memory is documented by the Altivec documentation. In the case of the
--  soft binding, the simplest solution is to represent a vector as an
--  array of components. This representation can depend on the endianness.
--  We can consider three possibilities:

--  * First component at the lowest address, components in big endian format.
--    It is the natural way to represent an array in big endian, and it would
--    also be the natural way to represent a quad-word integer in big endian.

--  Example:

--  Let V be a vector of unsigned int whose value is (1, 2, 3, 4). It is
--  represented as:

--                                                          Addresses growing
--  ------------------------------------------------------------------------>
--  | 0x0 0x0 0x0 0x1 | 0x0 0x0 0x0 0x2 | 0x0 0x0 0x0 0x3 | 0x0 0x0 0x0 0x4 |
--  |    V (0), BE    |    V (1), BE    |    V (2), BE    |    V (3), BE    |

--  * First component at the lowest address, components in little endian
--    format. It is the natural way to represent an array in little endian.

--  Example:

--  Let V be a vector of unsigned int whose value is (1, 2, 3, 4).
--  It is represented as:

--                                                          Addresses growing
--  ------------------------------------------------------------------------>
--  | 0x1 0x0 0x0 0x0 | 0x2 0x0 0x0 0x0 | 0x3 0x0 0x0 0x0 | 0x4 0x0 0x0 0x0 |
--  |    V (0), LE    |    V (1), LE    |    V (2), LE    |    V (3), LE    |

--  * Last component at the lowest address, components in little endian format.
--    It is the natural way to represent a quad-word integer in little endian.

--  Example:

--  Let V be a vector of unsigned int whose value is (1, 2, 3, 4). It is
--  represented as:

--                                                          Addresses growing
--  ------------------------------------------------------------------------>
--  | 0x4 0x0 0x0 0x0 | 0x3 0x0 0x0 0x0 | 0x2 0x0 0x0 0x0 | 0x1 0x0 0x0 0x0 |
--  |    V (3), LE    |    V (2), LE    |    V (1), LE    |    V (0), LE    |

--  There is actually a fourth case (components in big endian, last
--  component at the lowest address), but it does not have any interesting
--  properties: it is neither the natural way to represent a quad-word on any
--  machine, nor the natural way to represent an array on any machine.

--  Example:

--  Let V be a vector of unsigned int whose value is (1, 2, 3, 4). It is
--  represented as:

--                                                          Addresses growing
--  ------------------------------------------------------------------------>
--  | 0x0 0x0 0x0 0x4 | 0x0 0x0 0x0 0x3 | 0x0 0x0 0x0 0x2 | 0x0 0x0 0x0 0x1 |
--  |    V (3), BE    |    V (2), BE    |    V (1), BE    |    V (0), BE    |

--  In what follows, these layouts are referred to as formats 1 to 4, in the
--  order of their presentation above.

--  Most of the Altivec operations are specific to a component size, and
--  can be implemented with any of these formats. But some operations
--  are defined by the same Altivec primitive operation for different type
--  sizes:

--  * operations doing arithmetic on a complete vector, seen as a quad-word;
--  * operations dealing with memory.

--  Operations on a complete vector:
--  --------------------------------

--  Examples:

--  vec_sll/vsl : shift left on the entire vector.
--  vec_slo/vslo: shift left on the entire vector, by octet.

--  Those operations work on vectors seen as quad-words.
--  Let us suppose that we have a conversion operation named To_Quad_Word
--  for converting vector types to a quad-word.

--  Let A be an Altivec vector of 16 components:
--  A = (A(0), A(1), A(2), A(3), ... , A(14), A(15))
--  Let B be an Altivec vector of 8 components satisfying:
--  B = (A(0) |8| A(1), A(2) |8| A(3), ... , A(14) |8| A(15))
--  Let C be an Altivec vector of 4 components satisfying:
--  C = (A(0) |8| A(1) |8| A(2) |8| A(3), ... ,
--       A(12) |8| A(13) |8| A(14) |8| A(15))

--  (definition: |8| is the concatenation operation between two bytes;
--  i.e. 0x1 |8| 0x2 = 0x0102)

--  According to [PIM - 4.2 byte ordering], we have the following property:
--  To_Quad_Word (A) = To_Quad_Word (B) = To_Quad_Word (C)

--  Let To_Type_Of_A be a conversion operation from the type of B to the
--  type of A. The quad-word operations are only implemented by one
--  Altivec primitive operation. That means that, if QW_Operation is a
--  quad-word operation, we should have:
--  QW_Operation (To_Type_Of_A (B)) = QW_Operation (A)

--  That is true iff:
--  To_Quad_Word (To_Type_Of_A (B)) = To_Quad_Word (A)

--  As To_Quad_Word is a bijection, we have:
--  To_Type_Of_A (B) = A

--  resp. any combination of A, B, C:
--  To_Type_Of_A (C) = A
--  To_Type_Of_B (A) = B
--  To_Type_Of_C (B) = C
--  ...
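
--  As a concrete illustration of the property, take A (I) = I for I in
--  0 .. 15. The definitions above then give:

--    B = (16#0001#, 16#0203#, ... , 16#0E0F#)
--    C = (16#0001_0203#, 16#0405_0607#, 16#0809_0A0B#, 16#0C0D_0E0F#)

--  The following sketch reinterprets the 16 bytes of A as 4 words through
--  an unchecked conversion and checks whether the first word matches C (0).
--  All the names are hypothetical and only standard library units are used;
--  this is not code from the binding.

--    with Ada.Text_IO;              use Ada.Text_IO;
--    with Ada.Unchecked_Conversion;
--    with Interfaces;               use Interfaces;

--    procedure Quad_Word_Sketch is

--       type Bytes is array (0 .. 15) of Unsigned_8;
--       type Words is array (0 .. 3) of Unsigned_32;

--       function To_Words is new Ada.Unchecked_Conversion (Bytes, Words);

--       A : Bytes;
--       W : Words;

--    begin
--       for I in A'Range loop
--          A (I) := Unsigned_8 (I);
--       end loop;

--       W := To_Words (A);

--       if W (0) = 16#0001_0203# then
--          Put_Line ("W (0) matches C (0)");
--       else
--          Put_Line ("W (0) differs from C (0): reorganisation needed");
--       end if;
--    end Quad_Word_Sketch;

--  On a big endian target, where arrays are stored first component at the
--  lowest address, the first message is expected. On a little endian target
--  with the same array layout, W (0) is 16#0302_0100# and the second message
--  is expected, which is why a plain unchecked conversion is not sufficient
--  for every layout.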

--  Making sure that the properties described above are verified by the
--  conversion operations between vector types has different implications
--  depending on the layout of the vector types:
--  * with format 1 and 3: only an unchecked conversion is needed;
--  * with format 2 and 4: some reorganisation is needed for conversions
--    between vector types with different component sizes; that has a cost on
--    the efficiency, plus the complexity of having different memory patterns
--    for the same quad-word value, depending on the type.

--  Operations dealing with memory:
--  -------------------------------

--  These operations are either load operations (vec_ld and the
--  corresponding primitive operation: lvx) or store operations (vec_st
--  and the corresponding primitive operation: stvx).

--  According to [PIM 4.4 - vec_ld], those operations take as input
--  either an access to a vector (e.g. a const_vector_unsigned_int_ptr)
--  or an access to a flow of components (e.g. a const_unsigned_int_ptr),
--  relying on the same Altivec primitive operations. That means that both
--  should have the same representation in memory.

--  For the stream, it is easier to adopt the format of the target. That
--  means that, in memory, the components of the vector should also have the
--  format of the target, meaning that we will prefer:
--  * On a big endian target: format 1 or 4
--  * On a little endian target: format 2 or 3

--  Conclusion:
--  -----------

--  To take into consideration the constraint brought about by the routines
--  operating on quad-words and the routines operating on memory, the best
--  choice seems to be:

--  * On a big endian target: format 1;
--  * On a little endian target: format 3.

--  Those layout choices are enforced by GNAT.Altivec.Low_Level_Conversions,
--  which is the endianness-dependent unit providing conversions between
--  vector views and vector types.

---------------------
-- Layouts summary --
---------------------

--  For a user abstract vector of 4 uints (1, 2, 3, 4), increasing
--  addresses from left to right:

--  =========================================================================
--              BIG ENDIAN TARGET MEMORY LAYOUT for (1, 2, 3, 4)
--  =========================================================================

--                                    View
--  -------------------------------------------------------------------------
--  | 0x0 0x0 0x0 0x1 | 0x0 0x0 0x0 0x2 | 0x0 0x0 0x0 0x3 | 0x0 0x0 0x0 0x4 |
--  |    V (0), BE    |    V (1), BE    |    V (2), BE    |    V (3), BE    |
--  -------------------------------------------------------------------------

--                                   Vector
--  -------------------------------------------------------------------------
--  | 0x0 0x0 0x0 0x1 | 0x0 0x0 0x0 0x2 | 0x0 0x0 0x0 0x3 | 0x0 0x0 0x0 0x4 |
--  |    V (0), BE    |    V (1), BE    |    V (2), BE    |    V (3), BE    |
--  -------------------------------------------------------------------------

--  =========================================================================
--             LITTLE ENDIAN TARGET MEMORY LAYOUT for (1, 2, 3, 4)
--  =========================================================================

--                                    View
--  -------------------------------------------------------------------------
--  | 0x1 0x0 0x0 0x0 | 0x2 0x0 0x0 0x0 | 0x3 0x0 0x0 0x0 | 0x4 0x0 0x0 0x0 |
--  |    V (0), LE    |    V (1), LE    |    V (2), LE    |    V (3), LE    |
--  -------------------------------------------------------------------------

--                                   Vector
--  -------------------------------------------------------------------------
--  | 0x4 0x0 0x0 0x0 | 0x3 0x0 0x0 0x0 | 0x2 0x0 0x0 0x0 | 0x1 0x0 0x0 0x0 |
--  |    V (3), LE    |    V (2), LE    |    V (1), LE    |    V (0), LE    |
--  -------------------------------------------------------------------------

--  These layouts are common to both the soft and hard implementations on
--  Altivec capable targets.
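
--  The relation between a View and the corresponding Vector in the layouts
--  summary can be sketched as follows. This is only an illustration: all the
--  names are hypothetical, the program does not reproduce the actual
--  Low_Level_Conversions code, and it assumes that System.Default_Bit_Order
--  reflects the byte order of the target.

--    with Ada.Text_IO; use Ada.Text_IO;
--    with Interfaces;  use Interfaces;
--    with System;      use System;

--    procedure Layout_Sketch is

--       type UI_Array is array (0 .. 3) of Unsigned_32;

--       function To_Vector_Layout (View : UI_Array) return UI_Array;
--       --  Reorder the components of a view so that they are stored in
--       --  the order selected for vectors on the target.

--       function To_Vector_Layout (View : UI_Array) return UI_Array is
--          Result : UI_Array := View;
--       begin
--          if Default_Bit_Order = Low_Order_First then
--             --  Assumed little endian: last component at lowest address
--             for I in View'Range loop
--                Result (I) := View (View'Last - I);
--             end loop;
--          end if;

--          return Result;
--       end To_Vector_Layout;

--       V : constant UI_Array := To_Vector_Layout ((1, 2, 3, 4));

--    begin
--       for I in V'Range loop
--          Put_Line (Unsigned_32'Image (V (I)));
--       end loop;
--    end Layout_Sketch;

--  Under the stated assumption, the components are printed in the order
--  1, 2, 3, 4 on a big endian target and 4, 3, 2, 1 on a little endian
--  target, matching the Vector rows of the summary.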