======================
Clang Offload Packager
======================

.. contents::
   :local:

.. _clang-offload-packager:

Introduction
============

This tool bundles device files into a single image containing necessary
metadata. We use a custom binary format for bundling all the device images
together. The image format is a small header wrapping around a string map. This
tool creates bundled binaries so that they can be embedded into the host to
create a fat-binary.

Binary Format
=============

The binary format is marked by the ``0x10FF10AD`` magic bytes, followed by a
version. Each created binary contains its own magic bytes. This allows us to
locate all the embedded offloading sections even after they may have been merged
by the linker, such as when using relocatable linking. Conceptually, this binary
format is a serialization of a string map and an image buffer. The binary header
is described in the following :ref:`table<table-binary_header>`.

.. table:: Offloading Binary Header
  :name: table-binary_header

  +----------+--------------+----------------------------------------------------+
  | Type     | Identifier   | Description                                        |
  +==========+==============+====================================================+
  | uint8_t  | magic        | The magic bytes for the binary format (0x10FF10AD) |
  +----------+--------------+----------------------------------------------------+
  | uint32_t | version      | Version of this format (currently version 1)       |
  +----------+--------------+----------------------------------------------------+
  | uint64_t | size         | Size of this binary in bytes                       |
  +----------+--------------+----------------------------------------------------+
  | uint64_t | entry offset | Absolute offset of the offload entries in bytes    |
  +----------+--------------+----------------------------------------------------+
  | uint64_t | entry size   | Size of the offload entries in bytes               |
  +----------+--------------+----------------------------------------------------+

Once identified through the magic bytes, we use the size field to take a slice
of the binary blob containing the information for a single offloading image. We
can then use the offset field to find the actual offloading entries containing
the image and metadata. The offload entry contains information about the device
image. It contains the fields shown in the following
:ref:`table<table-binary_entry>`.

.. table:: Offloading Entry Table
  :name: table-binary_entry

  +----------+---------------+----------------------------------------------------+
  | Type     | Identifier    | Description                                        |
  +==========+===============+====================================================+
  | uint16_t | image kind    | The kind of the device image (e.g. bc, cubin)      |
  +----------+---------------+----------------------------------------------------+
  | uint16_t | offload kind  | The producer of the image (e.g. openmp, cuda)      |
  +----------+---------------+----------------------------------------------------+
  | uint32_t | flags         | Generic flags for the image                        |
  +----------+---------------+----------------------------------------------------+
  | uint64_t | string offset | Absolute offset of the string metadata table       |
  +----------+---------------+----------------------------------------------------+
  | uint64_t | num strings   | Number of string entries in the table              |
  +----------+---------------+----------------------------------------------------+
  | uint64_t | image offset  | Absolute offset of the device image in bytes       |
  +----------+---------------+----------------------------------------------------+
  | uint64_t | image size    | Size of the device image in bytes                  |
  +----------+---------------+----------------------------------------------------+

This table contains the offsets of the string table and the device image itself
along with some other integer information. The image kind lets us easily
identify the type of image stored here without needing to inspect the binary.
The offloading kind is used to determine which registration code or linking
semantics are necessary for this image. These are stored as enumerations with
the following values for the :ref:`offload kind<table-offload_kind>` and the
:ref:`image kind<table-image_kind>`.

.. table:: Image Kind
  :name: table-image_kind

  +---------------+-------+---------------------------------------+
  | Name          | Value | Description                           |
  +===============+=======+=======================================+
  | IMG_None      | 0x00  | No image information provided         |
  +---------------+-------+---------------------------------------+
  | IMG_Object    | 0x01  | The image is a generic object file    |
  +---------------+-------+---------------------------------------+
  | IMG_Bitcode   | 0x02  | The image is an LLVM-IR bitcode file  |
  +---------------+-------+---------------------------------------+
  | IMG_Cubin     | 0x03  | The image is a CUDA object file       |
  +---------------+-------+---------------------------------------+
  | IMG_Fatbinary | 0x04  | The image is a CUDA fatbinary file    |
  +---------------+-------+---------------------------------------+
  | IMG_PTX       | 0x05  | The image is a CUDA PTX file          |
  +---------------+-------+---------------------------------------+

.. table:: Offload Kind
  :name: table-offload_kind

  +------------+-------+---------------------------------------+
  | Name       | Value | Description                           |
  +============+=======+=======================================+
  | OFK_None   | 0x00  | No offloading information provided    |
  +------------+-------+---------------------------------------+
  | OFK_OpenMP | 0x01  | The producer was OpenMP offloading    |
  +------------+-------+---------------------------------------+
  | OFK_CUDA   | 0x02  | The producer was CUDA                 |
  +------------+-------+---------------------------------------+
  | OFK_HIP    | 0x03  | The producer was HIP                  |
  +------------+-------+---------------------------------------+

The flags are used to signify certain conditions, such as the presence of
debugging information or whether or not LTO was used. The string entry table is
used to generically contain any arbitrary key-value pair. This is stored as an
array of the :ref:`string entry<table-binary_string>` format.

.. table:: Offloading String Entry
  :name: table-binary_string

  +----------+--------------+-------------------------------------------------------+
  | Type     | Identifier   | Description                                           |
  +==========+==============+=======================================================+
  | uint64_t | key offset   | Absolute byte offset of the key in the string table   |
  +----------+--------------+-------------------------------------------------------+
  | uint64_t | value offset | Absolute byte offset of the value in the string table |
  +----------+--------------+-------------------------------------------------------+

The string entries simply provide offsets to a key and value pair in the
binary image's string table. The string table is simply a collection of
null-terminated strings with defined offsets in the image. The string entry
allows us to create a key-value pair from this string table. This is used for
passing arbitrary arguments to the image, such as the triple and architecture.

All of these structures are combined to form a single binary blob. Their order
does not matter because all offsets are absolute, which also makes the format
easier to extend in the future. As mentioned previously, multiple offloading
images are bundled together by simply concatenating them in this format. Because
we have the magic bytes and size of each image, we can extract them as needed.

Usage
=====

This tool can be used with the following arguments. Generally, information is
passed as a key-value pair to the ``image=`` argument. The ``file`` and
``triple`` arguments are considered mandatory to make a valid image. The
``arch`` argument is suggested.

.. code-block:: console

  OVERVIEW: A utility for bundling several object files into a single binary.
            The output binary can then be embedded into the host section table
            to create a fatbinary containing offloading code.

  USAGE: clang-offload-packager [options]

  OPTIONS:

  Generic Options:

    --help                      - Display available options (--help-hidden for more)
    --help-list                 - Display list of available options (--help-list-hidden for more)
    --version                   - Display the version of this program

  clang-offload-packager options:

    --image=<<key>=<value>,...> - List of key and value arguments. Required
                                  keywords are 'file' and 'triple'.
    -o <file>                   - Write output to <file>.

Example
=======

This tool simply takes many input files from the ``image`` option and creates a
single output file with all the images combined.

.. code-block:: console

  clang-offload-packager -o out.bin --image=file=input.o,triple=nvptx64,arch=sm_70

The inverse operation can be performed instead by passing the packaged binary as
input. In this mode, the matching images are placed in the output specified by
the ``file`` option. If no ``file`` argument is provided, a name is generated
for each matching image.

.. code-block:: console

  clang-offload-packager in.bin --image=file=output.o,triple=nvptx64,arch=sm_70
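
As a rough illustration of the layout described under Binary Format, the
following Python sketch packs and then parses concatenated images. It is not
the tool's implementation. It assumes little-endian fields with no padding
beyond the listed types, a four-byte magic stored as the bytes
``0x10 0xFF 0x10 0xAD``, and that "absolute" offsets are measured from the start
of each individual binary. The helper names ``pack_one``, ``parse_one``, and
``parse_all`` are hypothetical.

```python
import struct

# Assumed encodings of the tables above (little-endian, tightly packed).
MAGIC = bytes([0x10, 0xFF, 0x10, 0xAD])
HEADER = struct.Struct("<4sIQQQ")  # magic, version, size, entry offset, entry size
ENTRY = struct.Struct("<HHIQQQQ")  # image kind, offload kind, flags, string offset,
                                   # num strings, image offset, image size
STRING = struct.Struct("<QQ")      # key offset, value offset


def read_cstr(buf, offset):
    """Read a null-terminated string starting at `offset`."""
    end = buf.index(b"\0", offset)
    return buf[offset:end].decode()


def pack_one(image, meta, image_kind=1, offload_kind=1, flags=0):
    """Build one binary: header, entry, string entries, string data, image."""
    str_off = HEADER.size + ENTRY.size
    data_off = str_off + STRING.size * len(meta)  # where string data begins
    strings, entries = b"", b""
    for key, value in meta.items():
        key_off = data_off + len(strings)
        strings += key.encode() + b"\0"
        val_off = data_off + len(strings)
        strings += value.encode() + b"\0"
        entries += STRING.pack(key_off, val_off)
    img_off = data_off + len(strings)
    size = img_off + len(image)
    header = HEADER.pack(MAGIC, 1, size, HEADER.size, ENTRY.size)
    entry = ENTRY.pack(image_kind, offload_kind, flags, str_off, len(meta),
                       img_off, len(image))
    return header + entry + entries + strings + image


def parse_one(buf):
    """Parse the single offloading binary that starts at buf[0]."""
    magic, version, size, entry_off, _entry_size = HEADER.unpack_from(buf, 0)
    if magic != MAGIC:
        raise ValueError("bad magic bytes")
    if size < HEADER.size or size > len(buf):
        raise ValueError("implausible size field")
    blob = buf[:size]  # slice covering just this image
    (image_kind, offload_kind, flags, str_off, num_strs,
     img_off, img_size) = ENTRY.unpack_from(blob, entry_off)
    meta = {}
    for i in range(num_strs):
        key_off, val_off = STRING.unpack_from(blob, str_off + i * STRING.size)
        meta[read_cstr(blob, key_off)] = read_cstr(blob, val_off)
    return {
        "version": version,
        "size": size,
        "image_kind": image_kind,
        "offload_kind": offload_kind,
        "flags": flags,
        "meta": meta,
        "image": blob[img_off:img_off + img_size],
    }


def parse_all(buf):
    """Walk concatenated binaries using each header's size field."""
    images, pos = [], 0
    while pos < len(buf):
        image = parse_one(buf[pos:])
        images.append(image)
        pos += image["size"]
    return images


# Two identical images concatenated, then recovered one by one.
blob = pack_one(b"\x7fELF", {"triple": "nvptx64", "arch": "sm_70"}) * 2
for img in parse_all(blob):
    print(img["meta"], len(img["image"]))
```

Because every offset is stored explicitly, ``parse_one`` never depends on the
order in which ``pack_one`` happened to lay out the pieces. Only the header must
come first.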