1=====================
2BPF Type Format (BTF)
3=====================
4
51. Introduction
6***************
7
BTF (BPF Type Format) is the metadata format that encodes the debug info
related to BPF programs and maps. The name BTF was initially used to describe
data types only. BTF was later extended to include function info for defined
subroutines and line info for source/line information.
12
The debug info is used for map pretty printing, function signatures, etc. The
function signature enables better kernel symbols for bpf programs and
functions. The line info helps generate source-annotated translated byte
code, jited code and verifier logs.
17
18The BTF specification contains two parts,
19  * BTF kernel API
20  * BTF ELF file format
21
22The kernel API is the contract between user space and kernel. The kernel
23verifies the BTF info before using it. The ELF file format is a user space
24contract between ELF file and libbpf loader.
25
The type and string sections are part of the BTF kernel API, describing the
debug info (mostly types related) referenced by the bpf program. These two
sections are discussed in detail in :ref:`BTF_Type_String`.
29
30.. _BTF_Type_String:
31
322. BTF Type and String Encoding
33*******************************
34
The file ``include/uapi/linux/btf.h`` provides a high-level definition of how
types/strings are encoded.
37
38The beginning of data blob must be::
39
40    struct btf_header {
41        __u16   magic;
42        __u8    version;
43        __u8    flags;
44        __u32   hdr_len;
45
46        /* All offsets are in bytes relative to the end of this header */
47        __u32   type_off;       /* offset of type section       */
48        __u32   type_len;       /* length of type section       */
49        __u32   str_off;        /* offset of string section     */
50        __u32   str_len;        /* length of string section     */
51    };
52
The magic is ``0xeB9F``, which is encoded differently on big-endian and
little-endian systems, so it can be used to test whether the BTF was generated
for a big- or little-endian target. The ``btf_header`` is designed to be
extensible, with ``hdr_len`` equal to ``sizeof(struct btf_header)`` when a
data blob is generated.
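
As a minimal sketch of the endianness test described above, the helper below
inspects the first two bytes of a blob. The name ``btf_blob_endian`` and its
return values are illustrative assumptions, not part of any kernel or libbpf
API::

    #include <stddef.h>
    #include <stdint.h>

    #define BTF_MAGIC 0xeB9F

    /* Hypothetical helper: returns 'l' for a little-endian blob, 'b' for a
     * big-endian blob, and 0 if the blob is too short or has no BTF magic.
     */
    static int btf_blob_endian(const uint8_t *blob, size_t len)
    {
        if (len < 2)
            return 0;
        /* Little endian stores the low byte (0x9f) first. */
        if (blob[0] == (BTF_MAGIC & 0xff) && blob[1] == (BTF_MAGIC >> 8))
            return 'l';
        /* Big endian stores the high byte (0xeb) first. */
        if (blob[0] == (BTF_MAGIC >> 8) && blob[1] == (BTF_MAGIC & 0xff))
            return 'b';
        return 0;
    }

Checking raw bytes rather than loading a ``__u16`` keeps the result
independent of the byte order of the machine running the check.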
58
592.1 String Encoding
60===================
61
The first string in the string section must be a null string. The rest of
the string table is a concatenation of other null-terminated strings.
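
A string is therefore identified by its byte offset into the section. The
following sketch resolves such an offset with bounds checking; the helper
name ``btf_str_at`` is hypothetical and chosen only for illustration::

    #include <stddef.h>
    #include <string.h>

    /* Hypothetical helper: return the string at byte offset 'off' in a
     * string section of 'len' bytes, or NULL if the offset is out of range
     * or the string is not null-terminated inside the section.
     */
    static const char *btf_str_at(const char *strs, size_t len, size_t off)
    {
        if (off >= len)
            return NULL;
        if (memchr(strs + off, '\0', len - off) == NULL)
            return NULL;
        return strs + off;
    }

Offset ``0`` always resolves to the empty string, which is why a
``name_off`` of ``0`` can stand for "no name".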
64
652.2 Type Encoding
66=================
67
68The type id ``0`` is reserved for ``void`` type. The type section is parsed
69sequentially and type id is assigned to each recognized type starting from id
70``1``. Currently, the following types are supported::
71
72    #define BTF_KIND_INT            1       /* Integer      */
73    #define BTF_KIND_PTR            2       /* Pointer      */
74    #define BTF_KIND_ARRAY          3       /* Array        */
75    #define BTF_KIND_STRUCT         4       /* Struct       */
76    #define BTF_KIND_UNION          5       /* Union        */
77    #define BTF_KIND_ENUM           6       /* Enumeration  */
78    #define BTF_KIND_FWD            7       /* Forward      */
79    #define BTF_KIND_TYPEDEF        8       /* Typedef      */
80    #define BTF_KIND_VOLATILE       9       /* Volatile     */
81    #define BTF_KIND_CONST          10      /* Const        */
82    #define BTF_KIND_RESTRICT       11      /* Restrict     */
83    #define BTF_KIND_FUNC           12      /* Function     */
84    #define BTF_KIND_FUNC_PROTO     13      /* Function Proto       */
85    #define BTF_KIND_VAR            14      /* Variable     */
86    #define BTF_KIND_DATASEC        15      /* Section      */
87    #define BTF_KIND_FLOAT          16      /* Floating point       */
88
89Note that the type section encodes debug info, not just pure types.
90``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram.
91
92Each type contains the following common data::
93
94    struct btf_type {
95        __u32 name_off;
96        /* "info" bits arrangement
97         * bits  0-15: vlen (e.g. # of struct's members)
98         * bits 16-23: unused
99         * bits 24-28: kind (e.g. int, ptr, array...etc)
100         * bits 29-30: unused
101         * bit     31: kind_flag, currently used by
102         *             struct, union and fwd
103         */
104        __u32 info;
105        /* "size" is used by INT, ENUM, STRUCT and UNION.
106         * "size" tells the size of the type it is describing.
107         *
108         * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
109         * FUNC and FUNC_PROTO.
110         * "type" is a type_id referring to another type.
111         */
112        union {
113                __u32 size;
114                __u32 type;
115        };
116    };
117
118For certain kinds, the common data are followed by kind-specific data. The
119``name_off`` in ``struct btf_type`` specifies the offset in the string table.
120The following sections detail encoding of each kind.
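
The ``info`` bit layout shown in the comment above can be unpacked as in the
sketch below. The function names are hypothetical; ``include/uapi/linux/btf.h``
is understood to provide macros for the same purpose, which should be
preferred in real code::

    #include <stdint.h>

    /* Hypothetical accessors following the "info" bits arrangement. */
    static uint32_t btf_info_vlen(uint32_t info)
    {
        return info & 0xffff;           /* bits 0-15 */
    }

    static uint32_t btf_info_kind(uint32_t info)
    {
        return (info >> 24) & 0x1f;     /* bits 24-28 */
    }

    static uint32_t btf_info_kind_flag(uint32_t info)
    {
        return info >> 31;              /* bit 31 */
    }

For example, an ``info`` value of ``0x84000003`` describes a
``BTF_KIND_STRUCT`` (kind 4) with ``kind_flag`` set and three members.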
121
1222.2.1 BTF_KIND_INT
123~~~~~~~~~~~~~~~~~~
124
125``struct btf_type`` encoding requirement:
126 * ``name_off``: any valid offset
127 * ``info.kind_flag``: 0
128 * ``info.kind``: BTF_KIND_INT
129 * ``info.vlen``: 0
130 * ``size``: the size of the int type in bytes.
131
132``btf_type`` is followed by a ``u32`` with the following bits arrangement::
133
134  #define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
135  #define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
136  #define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)
137
138The ``BTF_INT_ENCODING`` has the following attributes::
139
140  #define BTF_INT_SIGNED  (1 << 0)
141  #define BTF_INT_CHAR    (1 << 1)
142  #define BTF_INT_BOOL    (1 << 2)
143
144The ``BTF_INT_ENCODING()`` provides extra information: signedness, char, or
145bool, for the int type. The char and bool encoding are mostly useful for
146pretty print. At most one encoding can be specified for the int type.
147
The ``BTF_INT_BITS()`` specifies the number of actual bits held by this int
type. For example, a 4-bit bitfield encodes ``BTF_INT_BITS()`` equal to 4.
The ``btf_type.size * 8`` must be equal to or greater than ``BTF_INT_BITS()``
for the type. The maximum value of ``BTF_INT_BITS()`` is 128.
152
153The ``BTF_INT_OFFSET()`` specifies the starting bit offset to calculate values
154for this int. For example, a bitfield struct member has:
155
156 * btf member bit offset 100 from the start of the structure,
157 * btf member pointing to an int type,
158 * the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``
159
160Then in the struct memory layout, this member will occupy ``4`` bits starting
161from bits ``100 + 2 = 102``.
162
163Alternatively, the bitfield struct member can be the following to access the
164same bits as the above:
165
166 * btf member bit offset 102,
167 * btf member pointing to an int type,
168 * the int type has ``BTF_INT_OFFSET() = 0`` and ``BTF_INT_BITS() = 4``
169
170The original intention of ``BTF_INT_OFFSET()`` is to provide flexibility of
171bitfield encoding. Currently, both llvm and pahole generate
172``BTF_INT_OFFSET() = 0`` for all int types.
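
The constraints stated in this section can be collected into a small check.
This is only a sketch: the macros are copied from the text above, while
``btf_int_is_valid`` is a hypothetical helper that covers just the rules
listed here (at most one encoding, bits fitting in ``size``, at most 128
bits)::

    #include <stdint.h>

    #define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
    #define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
    #define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)

    /* Hypothetical check: 'size_bytes' is btf_type.size, 'int_data' is the
     * u32 that follows btf_type. Returns 1 if the rules above hold.
     */
    static int btf_int_is_valid(uint32_t size_bytes, uint32_t int_data)
    {
        uint32_t enc = BTF_INT_ENCODING(int_data);
        uint32_t bits = BTF_INT_BITS(int_data);

        /* At most one encoding bit may be set. */
        if (enc & (enc - 1))
            return 0;
        /* The bits must fit in the declared size and not exceed 128. */
        if (bits > 128 || bits > size_bytes * 8)
            return 0;
        return 1;
    }

A signed 32-bit ``int`` is encoded as ``size = 4`` followed by
``0x01000020``: encoding ``BTF_INT_SIGNED``, offset 0 and 32 bits.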
173
1742.2.2 BTF_KIND_PTR
175~~~~~~~~~~~~~~~~~~
176
177``struct btf_type`` encoding requirement:
178  * ``name_off``: 0
179  * ``info.kind_flag``: 0
180  * ``info.kind``: BTF_KIND_PTR
181  * ``info.vlen``: 0
182  * ``type``: the pointee type of the pointer
183
184No additional type data follow ``btf_type``.
185
1862.2.3 BTF_KIND_ARRAY
187~~~~~~~~~~~~~~~~~~~~
188
189``struct btf_type`` encoding requirement:
190  * ``name_off``: 0
191  * ``info.kind_flag``: 0
192  * ``info.kind``: BTF_KIND_ARRAY
193  * ``info.vlen``: 0
194  * ``size/type``: 0, not used
195
196``btf_type`` is followed by one ``struct btf_array``::
197
198    struct btf_array {
199        __u32   type;
200        __u32   index_type;
201        __u32   nelems;
202    };
203
204The ``struct btf_array`` encoding:
205  * ``type``: the element type
206  * ``index_type``: the index type
207  * ``nelems``: the number of elements for this array (``0`` is also allowed).
208
209The ``index_type`` can be any regular int type (``u8``, ``u16``, ``u32``,
210``u64``, ``unsigned __int128``). The original design of including
211``index_type`` follows DWARF, which has an ``index_type`` for its array type.
212Currently in BTF, beyond type verification, the ``index_type`` is not used.
213
214The ``struct btf_array`` allows chaining through element type to represent
215multidimensional arrays. For example, for ``int a[5][6]``, the following type
216information illustrates the chaining:
217
218  * [1]: int
219  * [2]: array, ``btf_array.type = [1]``, ``btf_array.nelems = 6``
220  * [3]: array, ``btf_array.type = [2]``, ``btf_array.nelems = 5``
221
222Currently, both pahole and llvm collapse multidimensional array into
223one-dimensional array, e.g., for ``a[5][6]``, the ``btf_array.nelems`` is
224equal to ``30``. This is because the original use case is map pretty print
225where the whole array is dumped out so one-dimensional array is enough. As
226more BTF usage is explored, pahole and llvm can be changed to generate proper
227chained representation for multidimensional arrays.
228
2292.2.4 BTF_KIND_STRUCT
230~~~~~~~~~~~~~~~~~~~~~
2312.2.5 BTF_KIND_UNION
232~~~~~~~~~~~~~~~~~~~~
233
234``struct btf_type`` encoding requirement:
235  * ``name_off``: 0 or offset to a valid C identifier
236  * ``info.kind_flag``: 0 or 1
237  * ``info.kind``: BTF_KIND_STRUCT or BTF_KIND_UNION
238  * ``info.vlen``: the number of struct/union members
  * ``size``: the size of the struct/union in bytes
240
``btf_type`` is followed by ``info.vlen`` number of ``struct btf_member``::
242
243    struct btf_member {
244        __u32   name_off;
245        __u32   type;
246        __u32   offset;
247    };
248
249``struct btf_member`` encoding:
250  * ``name_off``: offset to a valid C identifier
251  * ``type``: the member type
252  * ``offset``: <see below>
253
254If the type info ``kind_flag`` is not set, the offset contains only bit offset
255of the member. Note that the base type of the bitfield can only be int or enum
256type. If the bitfield size is 32, the base type can be either int or enum
257type. If the bitfield size is not 32, the base type must be int, and int type
258``BTF_INT_BITS()`` encodes the bitfield size.
259
260If the ``kind_flag`` is set, the ``btf_member.offset`` contains both member
261bitfield size and bit offset. The bitfield size and bit offset are calculated
262as below.::
263
264  #define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
265  #define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)
266
267In this case, if the base type is an int type, it must be a regular int type:
268
269  * ``BTF_INT_OFFSET()`` must be 0.
270  * ``BTF_INT_BITS()`` must be equal to ``{1,2,4,8,16} * 8``.
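
The two interpretations of ``btf_member.offset`` can be sketched as follows.
The macros are the ones given above; ``btf_member_decode`` is a hypothetical
helper introduced only for this example::

    #include <stdint.h>

    #define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
    #define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)

    /* Hypothetical helper: split btf_member.offset according to the
     * kind_flag of the enclosing struct/union.
     */
    static void btf_member_decode(uint32_t offset, int kind_flag,
                                  uint32_t *bitfield_size,
                                  uint32_t *bit_offset)
    {
        if (kind_flag) {
            *bitfield_size = BTF_MEMBER_BITFIELD_SIZE(offset);
            *bit_offset = BTF_MEMBER_BIT_OFFSET(offset);
        } else {
            /* The whole field is the bit offset; a bitfield size, if
             * any, comes from BTF_INT_BITS() of the member's int type.
             */
            *bitfield_size = 0;
            *bit_offset = offset;
        }
    }

With ``kind_flag`` set, an ``offset`` of ``0x03000002`` describes a 3-bit
bitfield starting at bit 2 of the structure.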
271
272The following kernel patch introduced ``kind_flag`` and explained why both
273modes exist:
274
275  https://github.com/torvalds/linux/commit/9d5f9f701b1891466fb3dbb1806ad97716f95cc3#diff-fa650a64fdd3968396883d2fe8215ff3
276
2772.2.6 BTF_KIND_ENUM
278~~~~~~~~~~~~~~~~~~~
279
280``struct btf_type`` encoding requirement:
281  * ``name_off``: 0 or offset to a valid C identifier
282  * ``info.kind_flag``: 0
283  * ``info.kind``: BTF_KIND_ENUM
284  * ``info.vlen``: number of enum values
285  * ``size``: 4
286
``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum``::
288
289    struct btf_enum {
290        __u32   name_off;
291        __s32   val;
292    };
293
294The ``btf_enum`` encoding:
295  * ``name_off``: offset to a valid C identifier
296  * ``val``: any value
297
2982.2.7 BTF_KIND_FWD
299~~~~~~~~~~~~~~~~~~
300
301``struct btf_type`` encoding requirement:
302  * ``name_off``: offset to a valid C identifier
303  * ``info.kind_flag``: 0 for struct, 1 for union
304  * ``info.kind``: BTF_KIND_FWD
305  * ``info.vlen``: 0
306  * ``type``: 0
307
308No additional type data follow ``btf_type``.
309
3102.2.8 BTF_KIND_TYPEDEF
311~~~~~~~~~~~~~~~~~~~~~~
312
313``struct btf_type`` encoding requirement:
314  * ``name_off``: offset to a valid C identifier
315  * ``info.kind_flag``: 0
316  * ``info.kind``: BTF_KIND_TYPEDEF
317  * ``info.vlen``: 0
318  * ``type``: the type which can be referred by name at ``name_off``
319
320No additional type data follow ``btf_type``.
321
3222.2.9 BTF_KIND_VOLATILE
323~~~~~~~~~~~~~~~~~~~~~~~
324
325``struct btf_type`` encoding requirement:
326  * ``name_off``: 0
327  * ``info.kind_flag``: 0
328  * ``info.kind``: BTF_KIND_VOLATILE
329  * ``info.vlen``: 0
330  * ``type``: the type with ``volatile`` qualifier
331
332No additional type data follow ``btf_type``.
333
3342.2.10 BTF_KIND_CONST
335~~~~~~~~~~~~~~~~~~~~~
336
337``struct btf_type`` encoding requirement:
338  * ``name_off``: 0
339  * ``info.kind_flag``: 0
340  * ``info.kind``: BTF_KIND_CONST
341  * ``info.vlen``: 0
342  * ``type``: the type with ``const`` qualifier
343
344No additional type data follow ``btf_type``.
345
3462.2.11 BTF_KIND_RESTRICT
347~~~~~~~~~~~~~~~~~~~~~~~~
348
349``struct btf_type`` encoding requirement:
350  * ``name_off``: 0
351  * ``info.kind_flag``: 0
352  * ``info.kind``: BTF_KIND_RESTRICT
353  * ``info.vlen``: 0
354  * ``type``: the type with ``restrict`` qualifier
355
356No additional type data follow ``btf_type``.
357
3582.2.12 BTF_KIND_FUNC
359~~~~~~~~~~~~~~~~~~~~
360
361``struct btf_type`` encoding requirement:
362  * ``name_off``: offset to a valid C identifier
363  * ``info.kind_flag``: 0
364  * ``info.kind``: BTF_KIND_FUNC
365  * ``info.vlen``: 0
366  * ``type``: a BTF_KIND_FUNC_PROTO type
367
368No additional type data follow ``btf_type``.
369
370A BTF_KIND_FUNC defines not a type, but a subprogram (function) whose
371signature is defined by ``type``. The subprogram is thus an instance of that
372type. The BTF_KIND_FUNC may in turn be referenced by a func_info in the
373:ref:`BTF_Ext_Section` (ELF) or in the arguments to :ref:`BPF_Prog_Load`
374(ABI).
375
3762.2.13 BTF_KIND_FUNC_PROTO
377~~~~~~~~~~~~~~~~~~~~~~~~~~
378
379``struct btf_type`` encoding requirement:
380  * ``name_off``: 0
381  * ``info.kind_flag``: 0
382  * ``info.kind``: BTF_KIND_FUNC_PROTO
383  * ``info.vlen``: # of parameters
384  * ``type``: the return type
385
``btf_type`` is followed by ``info.vlen`` number of ``struct btf_param``::
387
388    struct btf_param {
389        __u32   name_off;
390        __u32   type;
391    };
392
If a BTF_KIND_FUNC_PROTO type is referred to by a BTF_KIND_FUNC type, then
``btf_param.name_off`` must point to a valid C identifier, except for a
possible last argument representing the variable argument. The
``btf_param.type`` refers to the parameter type.
397
398If the function has variable arguments, the last parameter is encoded with
399``name_off = 0`` and ``type = 0``.
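
A consumer can detect a variadic prototype by looking at the last parameter
only. The sketch below uses a local copy of ``struct btf_param`` with
``stdint.h`` types; the helper name is hypothetical::

    #include <stdint.h>

    struct btf_param {
        uint32_t name_off;
        uint32_t type;
    };

    /* Hypothetical helper: a prototype is variadic when its last
     * parameter has both name_off and type equal to 0.
     */
    static int btf_func_proto_is_variadic(const struct btf_param *params,
                                          uint32_t vlen)
    {
        if (vlen == 0)
            return 0;
        return params[vlen - 1].name_off == 0 &&
               params[vlen - 1].type == 0;
    }

For a prototype such as ``int (*f2)(char q1, __int32 q2, ...)``, ``info.vlen``
is 3 and the third ``btf_param`` is all zeroes.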
400
4012.2.14 BTF_KIND_VAR
402~~~~~~~~~~~~~~~~~~~
403
404``struct btf_type`` encoding requirement:
405  * ``name_off``: offset to a valid C identifier
406  * ``info.kind_flag``: 0
407  * ``info.kind``: BTF_KIND_VAR
408  * ``info.vlen``: 0
409  * ``type``: the type of the variable
410
``btf_type`` is followed by a single ``struct btf_var`` with the
following data::
413
414    struct btf_var {
415        __u32   linkage;
416    };
417
418``struct btf_var`` encoding:
  * ``linkage``: currently either 0 (static variable) or 1 (globally
                 allocated variable in ELF sections)
421
Not all types of global variables are supported by LLVM at this point.
The following are currently available:
424
425  * static variables with or without section attributes
426  * global variables with section attributes
427
428The latter is for future extraction of map key/value type id's from a
429map definition.
430
4312.2.15 BTF_KIND_DATASEC
432~~~~~~~~~~~~~~~~~~~~~~~
433
434``struct btf_type`` encoding requirement:
435  * ``name_off``: offset to a valid name associated with a variable or
436                  one of .data/.bss/.rodata
437  * ``info.kind_flag``: 0
438  * ``info.kind``: BTF_KIND_DATASEC
439  * ``info.vlen``: # of variables
440  * ``size``: total section size in bytes (0 at compilation time, patched
441              to actual size by BPF loaders such as libbpf)
442
``btf_type`` is followed by ``info.vlen`` number of ``struct btf_var_secinfo``::
444
445    struct btf_var_secinfo {
446        __u32   type;
447        __u32   offset;
448        __u32   size;
449    };
450
451``struct btf_var_secinfo`` encoding:
452  * ``type``: the type of the BTF_KIND_VAR variable
453  * ``offset``: the in-section offset of the variable
454  * ``size``: the size of the variable in bytes
455
4562.2.16 BTF_KIND_FLOAT
457~~~~~~~~~~~~~~~~~~~~~
458
459``struct btf_type`` encoding requirement:
460 * ``name_off``: any valid offset
461 * ``info.kind_flag``: 0
462 * ``info.kind``: BTF_KIND_FLOAT
463 * ``info.vlen``: 0
464 * ``size``: the size of the float type in bytes: 2, 4, 8, 12 or 16.
465
466No additional type data follow ``btf_type``.
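
Because the amount of kind-specific data differs per kind, the type section
has to be walked sequentially to assign type ids. The sketch below counts the
types in a section using the record sizes implied by the structures in
sections 2.2.1 to 2.2.16. It assumes the section is in host byte order; the
structure and function names are hypothetical::

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define BTF_KIND_INT            1
    #define BTF_KIND_ARRAY          3
    #define BTF_KIND_STRUCT         4
    #define BTF_KIND_UNION          5
    #define BTF_KIND_ENUM           6
    #define BTF_KIND_FUNC_PROTO     13
    #define BTF_KIND_VAR            14
    #define BTF_KIND_DATASEC        15
    #define BTF_KIND_FLOAT          16

    struct btf_type_hdr {           /* local copy of the common data */
        uint32_t name_off;
        uint32_t info;
        uint32_t size_or_type;
    };

    /* Bytes of kind-specific data that follow the common data. */
    static size_t btf_kind_extra_size(uint32_t kind, uint32_t vlen)
    {
        switch (kind) {
        case BTF_KIND_INT:        return 4;                 /* one u32 */
        case BTF_KIND_ARRAY:      return 12;                /* btf_array */
        case BTF_KIND_STRUCT:
        case BTF_KIND_UNION:      return (size_t)vlen * 12; /* btf_member */
        case BTF_KIND_ENUM:       return (size_t)vlen * 8;  /* btf_enum */
        case BTF_KIND_FUNC_PROTO: return (size_t)vlen * 8;  /* btf_param */
        case BTF_KIND_VAR:        return 4;                 /* btf_var */
        case BTF_KIND_DATASEC:    return (size_t)vlen * 12; /* var_secinfo */
        default:                  return 0;                 /* no extra data */
        }
    }

    /* Returns the number of types in the section (also the highest type
     * id), or -1 if the section is truncated or a kind is not one of the
     * kinds listed in this document.
     */
    static int btf_count_types(const uint8_t *sec, size_t len)
    {
        size_t off = 0;
        int n = 0;

        while (off < len) {
            struct btf_type_hdr t;
            uint32_t kind;
            size_t rec;

            if (len - off < sizeof(t))
                return -1;
            memcpy(&t, sec + off, sizeof(t));
            kind = (t.info >> 24) & 0x1f;
            if (kind == 0 || kind > BTF_KIND_FLOAT)
                return -1;
            rec = sizeof(t) + btf_kind_extra_size(kind, t.info & 0xffff);
            if (len - off < rec)
                return -1;
            off += rec;
            n++;
        }
        return n;
    }

The n-th record visited by the loop is the type with id ``n``, since id ``0``
is reserved for ``void`` and never appears in the section.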
467
4683. BTF Kernel API
469*****************
470
The following bpf syscall commands involve BTF:
472   * BPF_BTF_LOAD: load a blob of BTF data into kernel
473   * BPF_MAP_CREATE: map creation with btf key and value type info.
474   * BPF_PROG_LOAD: prog load with btf function and line info.
475   * BPF_BTF_GET_FD_BY_ID: get a btf fd
476   * BPF_OBJ_GET_INFO_BY_FD: btf, func_info, line_info
477     and other btf related info are returned.
478
479The workflow typically looks like:
480::
481
482  Application:
483      BPF_BTF_LOAD
484          |
485          v
486      BPF_MAP_CREATE and BPF_PROG_LOAD
487          |
488          V
489      ......
490
491  Introspection tool:
492      ......
493      BPF_{PROG,MAP}_GET_NEXT_ID (get prog/map id's)
494          |
495          V
496      BPF_{PROG,MAP}_GET_FD_BY_ID (get a prog/map fd)
497          |
498          V
499      BPF_OBJ_GET_INFO_BY_FD (get bpf_prog_info/bpf_map_info with btf_id)
500          |                                     |
501          V                                     |
502      BPF_BTF_GET_FD_BY_ID (get btf_fd)         |
503          |                                     |
504          V                                     |
505      BPF_OBJ_GET_INFO_BY_FD (get btf)          |
506          |                                     |
507          V                                     V
508      pretty print types, dump func signatures and line info, etc.
509
510
5113.1 BPF_BTF_LOAD
512================
513
Load a blob of BTF data into the kernel. A blob of data, described in
:ref:`BTF_Type_String`, can be directly loaded into the kernel. A ``btf_fd``
is returned to user space.
517
5183.2 BPF_MAP_CREATE
519==================
520
521A map can be created with ``btf_fd`` and specified key/value type id.::
522
523    __u32   btf_fd;         /* fd pointing to a BTF type data */
524    __u32   btf_key_type_id;        /* BTF type_id of the key */
525    __u32   btf_value_type_id;      /* BTF type_id of the value */
526
527In libbpf, the map can be defined with extra annotation like below:
528::
529
530    struct bpf_map_def SEC("maps") btf_map = {
531        .type = BPF_MAP_TYPE_ARRAY,
532        .key_size = sizeof(int),
533        .value_size = sizeof(struct ipv_counts),
534        .max_entries = 4,
535    };
536    BPF_ANNOTATE_KV_PAIR(btf_map, int, struct ipv_counts);
537
538Here, the parameters for macro BPF_ANNOTATE_KV_PAIR are map name, key and
539value types for the map. During ELF parsing, libbpf is able to extract
540key/value type_id's and assign them to BPF_MAP_CREATE attributes
541automatically.
542
543.. _BPF_Prog_Load:
544
5453.3 BPF_PROG_LOAD
546=================
547
548During prog_load, func_info and line_info can be passed to kernel with proper
549values for the following attributes:
550::
551
552    __u32           insn_cnt;
553    __aligned_u64   insns;
554    ......
555    __u32           prog_btf_fd;    /* fd pointing to BTF type data */
556    __u32           func_info_rec_size;     /* userspace bpf_func_info size */
557    __aligned_u64   func_info;      /* func info */
558    __u32           func_info_cnt;  /* number of bpf_func_info records */
559    __u32           line_info_rec_size;     /* userspace bpf_line_info size */
560    __aligned_u64   line_info;      /* line info */
561    __u32           line_info_cnt;  /* number of bpf_line_info records */
562
The func_info and line_info are arrays of the following records, respectively::
564
565    struct bpf_func_info {
566        __u32   insn_off; /* [0, insn_cnt - 1] */
567        __u32   type_id;  /* pointing to a BTF_KIND_FUNC type */
568    };
569    struct bpf_line_info {
570        __u32   insn_off; /* [0, insn_cnt - 1] */
571        __u32   file_name_off; /* offset to string table for the filename */
572        __u32   line_off; /* offset to string table for the source line */
573        __u32   line_col; /* line number and column number */
574    };
575
func_info_rec_size is the size of each func_info record, and
line_info_rec_size is the size of each line_info record. Passing the record
size to the kernel makes it possible to extend the record itself in the future.
579
580Below are requirements for func_info:
581  * func_info[0].insn_off must be 0.
582  * the func_info insn_off is in strictly increasing order and matches
583    bpf func boundaries.
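
The ordering part of these requirements can be checked before calling
BPF_PROG_LOAD, as in the sketch below. It uses a local copy of
``struct bpf_func_info`` and a hypothetical helper name, and it does not
verify that each ``insn_off`` matches a bpf func boundary, which requires
the program instructions::

    #include <stddef.h>
    #include <stdint.h>

    struct bpf_func_info {
        uint32_t insn_off;
        uint32_t type_id;
    };

    /* Hypothetical helper: returns 1 if the first record starts at
     * instruction 0 and insn_off is strictly increasing. An empty array
     * has nothing to check and is accepted.
     */
    static int func_info_order_ok(const struct bpf_func_info *fi, size_t cnt)
    {
        size_t i;

        if (cnt == 0)
            return 1;
        if (fi[0].insn_off != 0)
            return 0;
        for (i = 1; i < cnt; i++) {
            if (fi[i].insn_off <= fi[i - 1].insn_off)
                return 0;
        }
        return 1;
    }
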
584
585Below are requirements for line_info:
586  * the first insn in each func must have a line_info record pointing to it.
587  * the line_info insn_off is in strictly increasing order.
588
589For line_info, the line number and column number are defined as below:
590::
591
592    #define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
593    #define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)
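
In other words, the line number occupies the upper 22 bits of ``line_col`` and
the column number the lower 10 bits. The following sketch shows the inverse
operation; the macros are the ones above and ``bpf_line_col_pack`` is a
hypothetical helper::

    #include <stdint.h>

    #define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
    #define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)

    /* Hypothetical inverse of the two macros above. Columns that do not
     * fit in 10 bits are truncated.
     */
    static uint32_t bpf_line_col_pack(uint32_t line, uint32_t col)
    {
        return (line << 10) | (col & 0x3ff);
    }

For example, line 7 and column 14 pack to ``7182``, the value that appears in
the ``.BTF.ext`` assembly listing in section 6.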
594
5953.4 BPF_{PROG,MAP}_GET_NEXT_ID
596==============================
597
598In kernel, every loaded program, map or btf has a unique id. The id won't
599change during the lifetime of a program, map, or btf.
600
Each invocation of the bpf syscall command BPF_{PROG,MAP}_GET_NEXT_ID returns
one id to user space, for bpf programs or maps respectively, so an inspection
tool can iterate over all programs and maps.
604
6053.5 BPF_{PROG,MAP}_GET_FD_BY_ID
606===============================
607
An introspection tool cannot use an id directly to get details about a program
or map. A file descriptor needs to be obtained first for reference-counting
purposes.
610
6113.6 BPF_OBJ_GET_INFO_BY_FD
612==========================
613
614Once a program/map fd is acquired, an introspection tool can get the detailed
615information from kernel about this fd, some of which are BTF-related. For
616example, ``bpf_map_info`` returns ``btf_id`` and key/value type ids.
617``bpf_prog_info`` returns ``btf_id``, func_info, and line info for translated
618bpf byte codes, and jited_line_info.
619
6203.7 BPF_BTF_GET_FD_BY_ID
621========================
622
623With ``btf_id`` obtained in ``bpf_map_info`` and ``bpf_prog_info``, bpf
624syscall command BPF_BTF_GET_FD_BY_ID can retrieve a btf fd. Then, with
625command BPF_OBJ_GET_INFO_BY_FD, the btf blob, originally loaded into the
626kernel with BPF_BTF_LOAD, can be retrieved.
627
628With the btf blob, ``bpf_map_info``, and ``bpf_prog_info``, an introspection
629tool has full btf knowledge and is able to pretty print map key/values, dump
630func signatures and line info, along with byte/jit codes.
631
6324. ELF File Format Interface
633****************************
634
6354.1 .BTF section
636================
637
The .BTF section contains type and string data. The format of this section is
the same as the one described in :ref:`BTF_Type_String`.
640
641.. _BTF_Ext_Section:
642
6434.2 .BTF.ext section
644====================
645
The .BTF.ext section encodes func_info and line_info, which need loader
manipulation before loading into the kernel.
648
649The specification for .BTF.ext section is defined at ``tools/lib/bpf/btf.h``
650and ``tools/lib/bpf/btf.c``.
651
652The current header of .BTF.ext section::
653
654    struct btf_ext_header {
655        __u16   magic;
656        __u8    version;
657        __u8    flags;
658        __u32   hdr_len;
659
660        /* All offsets are in bytes relative to the end of this header */
661        __u32   func_info_off;
662        __u32   func_info_len;
663        __u32   line_info_off;
664        __u32   line_info_len;
665    };
666
It is very similar to the .BTF section header. Instead of type/string
sections, it contains func_info and line_info sections. See
:ref:`BPF_Prog_Load` for details about the func_info and line_info record
formats.
670
671The func_info is organized as below.::
672
673     func_info_rec_size
674     btf_ext_info_sec for section #1 /* func_info for section #1 */
675     btf_ext_info_sec for section #2 /* func_info for section #2 */
676     ...
677
678``func_info_rec_size`` specifies the size of ``bpf_func_info`` structure when
679.BTF.ext is generated. ``btf_ext_info_sec``, defined below, is a collection of
680func_info for each specific ELF section.::
681
682     struct btf_ext_info_sec {
683        __u32   sec_name_off; /* offset to section name */
684        __u32   num_info;
685        /* Followed by num_info * record_size number of bytes */
686        __u8    data[0];
687     };
688
689Here, num_info must be greater than 0.
690
691The line_info is organized as below.::
692
693     line_info_rec_size
694     btf_ext_info_sec for section #1 /* line_info for section #1 */
695     btf_ext_info_sec for section #2 /* line_info for section #2 */
696     ...
697
698``line_info_rec_size`` specifies the size of ``bpf_line_info`` structure when
699.BTF.ext is generated.
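
Both areas share the same layout: a record size followed by one
``btf_ext_info_sec`` per ELF section. The sketch below walks such an area and
counts its records. It assumes host byte order, and the function name is
hypothetical::

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Hypothetical walker: 'data' points at the record-size word that
     * starts the func_info (or line_info) area and 'len' is func_info_len
     * (or line_info_len). Returns the total number of records, or -1 if
     * the area is malformed.
     */
    static int btf_ext_count_records(const uint8_t *data, size_t len)
    {
        uint32_t rec_size, num_info;
        size_t off;
        int total = 0;

        if (len < 4)
            return -1;
        memcpy(&rec_size, data, 4);
        if (rec_size == 0)
            return -1;
        off = 4;

        while (off < len) {
            /* sec_name_off (4 bytes) and num_info (4 bytes). */
            if (len - off < 8)
                return -1;
            memcpy(&num_info, data + off + 4, 4);
            off += 8;
            /* num_info must be greater than 0. */
            if (num_info == 0)
                return -1;
            if ((len - off) / rec_size < num_info)
                return -1;
            off += (size_t)num_info * rec_size;
            total += (int)num_info;
        }
        return total;
    }

Using the record size read from the area, rather than
``sizeof(struct bpf_func_info)``, lets the walker skip over records that were
generated with a larger structure.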
700
701The interpretation of ``bpf_func_info->insn_off`` and
702``bpf_line_info->insn_off`` is different between kernel API and ELF API. For
703kernel API, the ``insn_off`` is the instruction offset in the unit of ``struct
704bpf_insn``. For ELF API, the ``insn_off`` is the byte offset from the
705beginning of section (``btf_ext_info_sec->sec_name_off``).
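
The conversion between the two units is a division by the instruction size.
The sketch below relies on ``sizeof(struct bpf_insn)`` being 8 bytes; the
constant and function names are hypothetical::

    #include <stdint.h>

    #define BPF_INSN_SIZE 8         /* sizeof(struct bpf_insn) */

    /* Hypothetical helper: convert an ELF-side insn_off (byte offset in
     * the section) to an instruction index within that section. Returns
     * -1 if the byte offset is not on an instruction boundary.
     */
    static int64_t elf_insn_off_to_index(uint32_t byte_off)
    {
        if (byte_off % BPF_INSN_SIZE)
            return -1;
        return byte_off / BPF_INSN_SIZE;
    }
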
706
4.3 .BTF_ids section
====================
709
710The .BTF_ids section encodes BTF ID values that are used within the kernel.
711
712This section is created during the kernel compilation with the help of
713macros defined in ``include/linux/btf_ids.h`` header file. Kernel code can
714use them to create lists and sets (sorted lists) of BTF ID values.
715
The ``BTF_ID_LIST`` and ``BTF_ID`` macros define an unsorted list of BTF ID
values, with the following syntax::
718
719  BTF_ID_LIST(list)
720  BTF_ID(type1, name1)
721  BTF_ID(type2, name2)
722
resulting in the following layout in the .BTF_ids section::
724
725  __BTF_ID__type1__name1__1:
726  .zero 4
727  __BTF_ID__type2__name2__2:
728  .zero 4
729
730The ``u32 list[];`` variable is defined to access the list.
731
The ``BTF_ID_UNUSED`` macro defines 4 zero bytes. It's used when we
want to define an unused entry in BTF_ID_LIST, like::
734
735      BTF_ID_LIST(bpf_skb_output_btf_ids)
736      BTF_ID(struct, sk_buff)
737      BTF_ID_UNUSED
738      BTF_ID(struct, task_struct)
739
The ``BTF_SET_START/END`` macro pair defines a sorted list of BTF ID values
and their count, with the following syntax::
742
743  BTF_SET_START(set)
744  BTF_ID(type1, name1)
745  BTF_ID(type2, name2)
746  BTF_SET_END(set)
747
resulting in the following layout in the .BTF_ids section::
749
750  __BTF_ID__set__set:
751  .zero 4
752  __BTF_ID__type1__name1__3:
753  .zero 4
754  __BTF_ID__type2__name2__4:
755  .zero 4
756
757The ``struct btf_id_set set;`` variable is defined to access the list.
758
The ``typeX`` name can be one of the following::
760
761   struct, union, typedef, func
762
763and is used as a filter when resolving the BTF ID value.
764
All the BTF ID lists and sets are compiled in the .BTF_ids section and
resolved during the linking phase of the kernel build by the
``resolve_btfids`` tool.
767
7685. Using BTF
769************
770
7715.1 bpftool map pretty print
772============================
773
With BTF, the map key/value can be printed based on fields rather than simply
raw bytes. This is especially valuable for large structures or if your data
structure has bitfields. For example, for the following map::
777
778      enum A { A1, A2, A3, A4, A5 };
779      typedef enum A ___A;
780      struct tmp_t {
781           char a1:4;
782           int  a2:4;
783           int  :4;
784           __u32 a3:4;
785           int b;
786           ___A b1:4;
787           enum A b2:4;
788      };
789      struct bpf_map_def SEC("maps") tmpmap = {
790           .type = BPF_MAP_TYPE_ARRAY,
791           .key_size = sizeof(__u32),
792           .value_size = sizeof(struct tmp_t),
793           .max_entries = 1,
794      };
795      BPF_ANNOTATE_KV_PAIR(tmpmap, int, struct tmp_t);
796
797bpftool is able to pretty print like below:
798::
799
800      [{
801            "key": 0,
802            "value": {
803                "a1": 0x2,
804                "a2": 0x4,
805                "a3": 0x6,
806                "b": 7,
807                "b1": 0x8,
808                "b2": 0xa
809            }
810        }
811      ]
812
8135.2 bpftool prog dump
814=====================
815
816The following is an example showing how func_info and line_info can help prog
817dump with better kernel symbol names, function prototypes and line
818information.::
819
820    $ bpftool prog dump jited pinned /sys/fs/bpf/test_btf_haskv
821    [...]
822    int test_long_fname_2(struct dummy_tracepoint_args * arg):
823    bpf_prog_44a040bf25481309_test_long_fname_2:
824    ; static int test_long_fname_2(struct dummy_tracepoint_args *arg)
825       0:   push   %rbp
826       1:   mov    %rsp,%rbp
827       4:   sub    $0x30,%rsp
828       b:   sub    $0x28,%rbp
829       f:   mov    %rbx,0x0(%rbp)
830      13:   mov    %r13,0x8(%rbp)
831      17:   mov    %r14,0x10(%rbp)
832      1b:   mov    %r15,0x18(%rbp)
833      1f:   xor    %eax,%eax
834      21:   mov    %rax,0x20(%rbp)
835      25:   xor    %esi,%esi
836    ; int key = 0;
837      27:   mov    %esi,-0x4(%rbp)
838    ; if (!arg->sock)
839      2a:   mov    0x8(%rdi),%rdi
840    ; if (!arg->sock)
841      2e:   cmp    $0x0,%rdi
842      32:   je     0x0000000000000070
843      34:   mov    %rbp,%rsi
844    ; counts = bpf_map_lookup_elem(&btf_map, &key);
845    [...]
846
8475.3 Verifier Log
848================
849
The following is an example of how line_info can help debug verification
failures::
852
853       /* The code at tools/testing/selftests/bpf/test_xdp_noinline.c
854        * is modified as below.
855        */
856       data = (void *)(long)xdp->data;
857       data_end = (void *)(long)xdp->data_end;
858       /*
859       if (data + 4 > data_end)
860               return XDP_DROP;
861       */
862       *(u32 *)data = dst->dst;
863
864    $ bpftool prog load ./test_xdp_noinline.o /sys/fs/bpf/test_xdp_noinline type xdp
865        ; data = (void *)(long)xdp->data;
866        224: (79) r2 = *(u64 *)(r10 -112)
867        225: (61) r2 = *(u32 *)(r2 +0)
868        ; *(u32 *)data = dst->dst;
869        226: (63) *(u32 *)(r2 +0) = r1
870        invalid access to packet, off=0 size=4, R2(id=0,off=0,r=0)
871        R2 offset is outside of the packet
872
8736. BTF Generation
874*****************
875
You need the latest pahole
877
878  https://git.kernel.org/pub/scm/devel/pahole/pahole.git/
879
or llvm (8.0 or later). The pahole tool acts as a dwarf2btf converter. It
doesn't support .BTF.ext and the BTF_KIND_FUNC type yet. For example::
882
883      -bash-4.4$ cat t.c
884      struct t {
885        int a:2;
886        int b:3;
887        int c:2;
888      } g;
889      -bash-4.4$ gcc -c -O2 -g t.c
890      -bash-4.4$ pahole -JV t.o
891      File t.o:
892      [1] STRUCT t kind_flag=1 size=4 vlen=3
893              a type_id=2 bitfield_size=2 bits_offset=0
894              b type_id=2 bitfield_size=3 bits_offset=2
895              c type_id=2 bitfield_size=2 bits_offset=5
896      [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
897
llvm is able to generate .BTF and .BTF.ext directly with -g, for the bpf
target only. The assembly code (-S) shows the BTF encoding in assembly
format::
901
902    -bash-4.4$ cat t2.c
903    typedef int __int32;
904    struct t2 {
905      int a2;
906      int (*f2)(char q1, __int32 q2, ...);
907      int (*f3)();
908    } g2;
909    int main() { return 0; }
910    int test() { return 0; }
911    -bash-4.4$ clang -c -g -O2 -target bpf t2.c
912    -bash-4.4$ readelf -S t2.o
913      ......
914      [ 8] .BTF              PROGBITS         0000000000000000  00000247
915           000000000000016e  0000000000000000           0     0     1
916      [ 9] .BTF.ext          PROGBITS         0000000000000000  000003b5
917           0000000000000060  0000000000000000           0     0     1
918      [10] .rel.BTF.ext      REL              0000000000000000  000007e0
919           0000000000000040  0000000000000010          16     9     8
920      ......
921    -bash-4.4$ clang -S -g -O2 -target bpf t2.c
922    -bash-4.4$ cat t2.s
923      ......
924            .section        .BTF,"",@progbits
925            .short  60319                   # 0xeb9f
926            .byte   1
927            .byte   0
928            .long   24
929            .long   0
930            .long   220
931            .long   220
932            .long   122
933            .long   0                       # BTF_KIND_FUNC_PROTO(id = 1)
934            .long   218103808               # 0xd000000
935            .long   2
936            .long   83                      # BTF_KIND_INT(id = 2)
937            .long   16777216                # 0x1000000
938            .long   4
939            .long   16777248                # 0x1000020
940      ......
941            .byte   0                       # string offset=0
942            .ascii  ".text"                 # string offset=1
943            .byte   0
944            .ascii  "/home/yhs/tmp-pahole/t2.c" # string offset=7
945            .byte   0
946            .ascii  "int main() { return 0; }" # string offset=33
947            .byte   0
948            .ascii  "int test() { return 0; }" # string offset=58
949            .byte   0
950            .ascii  "int"                   # string offset=83
951      ......
952            .section        .BTF.ext,"",@progbits
953            .short  60319                   # 0xeb9f
954            .byte   1
955            .byte   0
956            .long   24
957            .long   0
958            .long   28
959            .long   28
960            .long   44
961            .long   8                       # FuncInfo
962            .long   1                       # FuncInfo section string offset=1
963            .long   2
964            .long   .Lfunc_begin0
965            .long   3
966            .long   .Lfunc_begin1
967            .long   5
968            .long   16                      # LineInfo
969            .long   1                       # LineInfo section string offset=1
970            .long   2
971            .long   .Ltmp0
972            .long   7
973            .long   33
974            .long   7182                    # Line 7 Col 14
975            .long   .Ltmp3
976            .long   7
977            .long   58
978            .long   8206                    # Line 8 Col 14
979
9807. Testing
981**********
982
The kernel bpf selftest ``test_btf.c`` provides an extensive set of
BTF-related tests.
984