:mod:`uctypes` -- access binary data in a structured way
========================================================

.. module:: uctypes
   :synopsis: access binary data in a structured way

This module implements a "foreign data interface" for MicroPython. The idea
behind it is similar to CPython's ``ctypes`` module, but the actual API is
different, streamlined and optimized for small size. The basic idea of the
module is to define a data structure layout with about the same power as the
C language allows, and then access it using the familiar dot-syntax to
reference sub-fields.

.. warning::

    The ``uctypes`` module allows access to arbitrary memory addresses of the
    machine (including I/O and control registers). Careless usage of it
    may lead to crashes, data loss, and even hardware malfunction.

.. seealso::

    Module :mod:`struct`
        Standard Python way to access binary data structures (doesn't scale
        well to large and complex structures).

Usage examples::

    import uctypes

    # Example 1: Subset of ELF file header
    # https://wikipedia.org/wiki/Executable_and_Linkable_Format#File_header
    ELF_HEADER = {
        "EI_MAG": (0x0 | uctypes.ARRAY, 4 | uctypes.UINT8),
        "EI_DATA": 0x5 | uctypes.UINT8,
        "e_machine": 0x12 | uctypes.UINT16,
    }

    # "f" is an ELF file opened in binary mode
    buf = f.read(uctypes.sizeof(ELF_HEADER, uctypes.LITTLE_ENDIAN))
    header = uctypes.struct(uctypes.addressof(buf), ELF_HEADER, uctypes.LITTLE_ENDIAN)
    assert header.EI_MAG == b"\x7fELF"
    assert header.EI_DATA == 1, "Oops, wrong endianness. Could retry with uctypes.BIG_ENDIAN."
    print("machine:", hex(header.e_machine))


    # Example 2: In-memory data structure, with pointers
    COORD = {
        "x": 0 | uctypes.FLOAT32,
        "y": 4 | uctypes.FLOAT32,
    }

    STRUCT1 = {
        "data1": 0 | uctypes.UINT8,
        "data2": 4 | uctypes.UINT32,
        "ptr": (8 | uctypes.PTR, COORD),
    }

    # Suppose you have address of a structure of type STRUCT1 in "addr"
    # uctypes.NATIVE is optional (used by default)
    struct1 = uctypes.struct(addr, STRUCT1, uctypes.NATIVE)
    print("x:", struct1.ptr[0].x)


    # Example 3: Access to CPU registers. Subset of STM32F4xx WWDG block
    WWDG_LAYOUT = {
        "WWDG_CR": (0, {
            # BFUINT32 here means size of the WWDG_CR register
            "WDGA": 7 << uctypes.BF_POS | 1 << uctypes.BF_LEN | uctypes.BFUINT32,
            "T": 0 << uctypes.BF_POS | 7 << uctypes.BF_LEN | uctypes.BFUINT32,
        }),
        "WWDG_CFR": (4, {
            "EWI": 9 << uctypes.BF_POS | 1 << uctypes.BF_LEN | uctypes.BFUINT32,
            "WDGTB": 7 << uctypes.BF_POS | 2 << uctypes.BF_LEN | uctypes.BFUINT32,
            "W": 0 << uctypes.BF_POS | 7 << uctypes.BF_LEN | uctypes.BFUINT32,
        }),
    }

    WWDG = uctypes.struct(0x40002c00, WWDG_LAYOUT)

    WWDG.WWDG_CFR.WDGTB = 0b10
    WWDG.WWDG_CR.WDGA = 1
    print("Current counter:", WWDG.WWDG_CR.T)

Defining structure layout
-------------------------

Structure layout is defined by a "descriptor" - a Python dictionary which
encodes field names as keys and other properties required to access them as
associated values::

    {
        "field1": <properties>,
        "field2": <properties>,
        ...
    }

Currently, ``uctypes`` requires explicit specification of offsets for each
field. Offsets are given in bytes from the structure start.

Following are encoding examples for various field types:

* Scalar types::

    "field_name": offset | uctypes.UINT32

  in other words, the value is a scalar type identifier ORed with a field offset
  (in bytes) from the start of the structure.


* Recursive structures::

    "sub": (offset, {
        "b0": 0 | uctypes.UINT8,
        "b1": 1 | uctypes.UINT8,
    })

  i.e. the value is a 2-tuple, the first element of which is an offset, and
  the second is a structure descriptor dictionary (note: offsets in recursive
  descriptors are relative to the structure it defines). Of course, recursive
  structures can be specified not just by a literal dictionary, but also by
  referring to a structure descriptor dictionary (defined earlier) by name.

* Arrays of primitive types::

      "arr": (offset | uctypes.ARRAY, size | uctypes.UINT8),

  i.e. the value is a 2-tuple, the first element of which is the ARRAY flag
  ORed with the offset, and the second is the scalar element type ORed with
  the number of elements in the array.

* Arrays of aggregate types::

    "arr2": (offset | uctypes.ARRAY, size, {"b": 0 | uctypes.UINT8}),

  i.e. the value is a 3-tuple, the first element of which is the ARRAY flag
  ORed with the offset, the second is the number of elements in the array,
  and the third is a descriptor of the element type.

* Pointer to a primitive type::

    "ptr": (offset | uctypes.PTR, uctypes.UINT8),

  i.e. the value is a 2-tuple, the first element of which is the PTR flag
  ORed with the offset, and the second is a scalar element type.

* Pointer to an aggregate type::

    "ptr2": (offset | uctypes.PTR, {"b": 0 | uctypes.UINT8}),

  i.e. the value is a 2-tuple, the first element of which is the PTR flag
  ORed with the offset, and the second is a descriptor of the type pointed to.

* Bitfields::

    "bitf0": offset | uctypes.BFUINT16 | lsbit << uctypes.BF_POS | bitsize << uctypes.BF_LEN,

  i.e. the value is the type of the scalar value containing the given
  bitfield (type names are similar to scalar types, but prefixed with ``BF``),
  ORed with the offset of the scalar value containing the bitfield, and
  further ORed with the values for the bit position and bit length of the
  bitfield within the scalar value, shifted by ``BF_POS`` and ``BF_LEN`` bits,
  respectively. A bitfield position is counted from the least significant bit
  of the scalar (which has position 0), and is the number of the rightmost
  bit of the field (in other words, it's the number of bits the scalar needs
  to be shifted right to extract the bitfield).

  In the example above, first a UINT16 value will be extracted at the given
  *offset* (this detail may be important when accessing hardware registers,
  where a particular access size and alignment are required), and then the
  bitfield whose rightmost bit is bit *lsbit* of this UINT16, and whose length
  is *bitsize* bits, will be extracted. For example, if *lsbit* is 0 and
  *bitsize* is 8, then effectively it will access the least-significant byte
  of the UINT16.

  Note that bitfield operations are independent of target byte endianness;
  in particular, the example above will access the least-significant byte of
  the UINT16 in both little- and big-endian structures. But it depends on the
  least significant bit being numbered 0. Some targets may use different
  numbering in their native ABI, but ``uctypes`` always uses the normalized
  numbering described above.

Module contents
---------------

.. class:: struct(addr, descriptor, layout_type=NATIVE, /)

   Instantiate a "foreign data structure" object based on the structure
   address in memory, a descriptor (encoded as a dictionary), and a layout
   type (see below).

.. data:: LITTLE_ENDIAN

   Layout type for a little-endian packed structure. (Packed means that every
   field occupies exactly as many bytes as defined in the descriptor, i.e.
   the alignment is 1).

.. data:: BIG_ENDIAN

   Layout type for a big-endian packed structure.

.. data:: NATIVE

   Layout type for a native structure - with data endianness and alignment
   conforming to the ABI of the system on which MicroPython runs.

.. function:: sizeof(struct, layout_type=NATIVE, /)

   Return the size of a data structure in bytes. The *struct* argument can be
   either a structure class or a specific instantiated structure object
   (or its aggregate field).

.. function:: addressof(obj)

   Return the address of an object. The argument should be bytes, bytearray
   or another object supporting the buffer protocol (and the address of this
   buffer is what is actually returned).

.. function:: bytes_at(addr, size)

   Capture memory at the given address and size as a bytes object. As a bytes
   object is immutable, memory is actually duplicated and copied into the
   bytes object, so if the memory contents change later, the created object
   retains the original value.

.. function:: bytearray_at(addr, size)

   Capture memory at the given address and size as a bytearray object.
   Unlike the ``bytes_at()`` function above, memory is captured by reference,
   so it can also be written to, and you will access the current value
   at the given memory address.

.. data:: UINT8
          INT8
          UINT16
          INT16
          UINT32
          INT32
          UINT64
          INT64

   Integer types for structure descriptors. Constants for 8, 16, 32,
   and 64 bit types are provided, both signed and unsigned.

.. data:: FLOAT32
          FLOAT64

   Floating-point types for structure descriptors.

.. data:: VOID

   ``VOID`` is an alias for ``UINT8``, and is provided to conveniently define
   C's void pointers: ``(uctypes.PTR, uctypes.VOID)``.

.. data:: PTR
          ARRAY

   Type constants for pointers and arrays. Note that there is no explicit
   constant for structures, it's implicit: an aggregate type without ``PTR``
   or ``ARRAY`` flags is a structure.
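
As a rough illustration of what the ``LITTLE_ENDIAN`` and ``BIG_ENDIAN``
layout types mean for a multi-byte field, the sketch below decodes the same
two bytes both ways. It deliberately uses the standard ``struct`` module
instead of ``uctypes`` so that it runs on any Python. The buffer contents are
made up for the example, and the sketch only models byte order, not the full
behaviour of ``uctypes.struct``.

```python
import struct

# Hypothetical 4-byte buffer; the values are made up for illustration.
buf = bytes([0x3E, 0x00, 0x12, 0x34])

# A 16-bit unsigned field at offset 0, decoded as little-endian
# (the byte order a LITTLE_ENDIAN layout implies for this field).
le_value = struct.unpack_from("<H", buf, 0)[0]

# The same two bytes decoded as big-endian
# (the byte order a BIG_ENDIAN layout implies).
be_value = struct.unpack_from(">H", buf, 0)[0]

print(hex(le_value))  # 0x3e
print(hex(be_value))  # 0x3e00
```

With a ``uctypes`` descriptor the field would be declared once, as
``0 | uctypes.UINT16``, and the byte order would be selected by the layout
type passed to ``uctypes.struct()``.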

Structure descriptors and instantiating structure objects
---------------------------------------------------------

Given a structure descriptor dictionary and its layout type, you can
instantiate a specific structure instance at a given memory address
using the :class:`uctypes.struct()` constructor. The memory address usually
comes from one of the following sources:

* Predefined address, when accessing hardware registers on a baremetal
  system. Look up these addresses in the datasheet for a particular MCU/SoC.
* As a return value from a call to some FFI (Foreign Function Interface)
  function.
* From `uctypes.addressof()`, when you want to pass arguments to an FFI
  function, or alternatively, to access some data for I/O (for example,
  data read from a file or network socket).

Structure objects
-----------------

Structure objects allow accessing individual fields using standard dot
notation: ``my_struct.substruct1.field1``. If a field is of scalar type,
getting it will produce a primitive value (Python integer or float)
corresponding to the value contained in the field. A scalar field can also
be assigned to.

If a field is an array, its individual elements can be accessed with
the standard subscript operator ``[]`` - both read and assigned to.

If a field is a pointer, it can be dereferenced using ``[0]`` syntax
(corresponding to the C ``*`` operator, though ``[0]`` works in C too).
Subscripting a pointer with integer values other than 0 is also supported,
with the same semantics as in C.

Summing up, accessing structure fields generally follows the C syntax,
except for pointer dereference, where you need to use the ``[0]`` operator
instead of ``*``.

Limitations
-----------

1. Accessing non-scalar fields leads to allocation of intermediate objects
to represent them. This means that special care should be taken to
lay out a structure which needs to be accessed when memory allocation
is disabled (e.g. from an interrupt). The recommendations are:

* Avoid accessing nested structures. For example, instead of
  ``mcu_registers.peripheral_a.register1``, define separate layout
  descriptors for each peripheral, to be accessed as
  ``peripheral_a.register1``. Or just cache a particular peripheral:
  ``peripheral_a = mcu_registers.peripheral_a``. If a register
  consists of multiple bitfields, you would need to cache a reference
  to that particular register: ``reg_a = mcu_registers.peripheral_a.reg_a``.
* Avoid other non-scalar data, like arrays. For example, instead of
  ``peripheral_a.register[0]`` use ``peripheral_a.register0``. Again,
  an alternative is to cache intermediate values, e.g.
  ``register0 = peripheral_a.register[0]``.

2. The range of offsets supported by the ``uctypes`` module is limited.
The exact range supported is considered an implementation detail,
and the general suggestion is to split structure definitions so that each
covers from a few kilobytes to a few dozen kilobytes at most.
In most cases, this is a natural situation anyway, e.g. it doesn't make
sense to define all registers of an MCU (spread over a 32-bit address
space) in one structure, but rather peripheral block by peripheral
block. In some extreme cases, you may need to split a structure into
several parts artificially (e.g. if accessing a native data structure
with a multi-megabyte array in the middle, though that would be a very
synthetic case).