1349cc55cSDimitry AndricHierarchical Trace Representation (HTR) 2349cc55cSDimitry Andric====================================== 3349cc55cSDimitry AndricThe humongous amount of data processor traces like the ones obtained with Intel PT contain is not digestible to humans in its raw form. Given this, it is useful to summarize these massive traces by extracting useful information. Hierarchical Trace Representation (HTR) is the way lldb represents a summarized trace internally. HTR efficiently stores trace data and allows the trace data to be transformed in a way akin to compiler passes. 4349cc55cSDimitry Andric 5349cc55cSDimitry AndricConcepts 6349cc55cSDimitry Andric-------- 7349cc55cSDimitry Andric**Block:** One or more contiguous units of the trace. At minimum, the unit of a trace is the load address of an instruction. 8349cc55cSDimitry Andric 9349cc55cSDimitry Andric**Block Metadata:** Metadata associated with each *block*. For processor traces, some metadata examples are the number of instructions in the block or information on what functions are called in the block. 10349cc55cSDimitry Andric 11349cc55cSDimitry Andric**Layer:** The representation of trace data between passes. For Intel PT there are two types of layers: 12349cc55cSDimitry Andric 13349cc55cSDimitry Andric **Instruction Layer:** Composed of the load addresses of the instructions in the trace. In an effort to save space, 14349cc55cSDimitry Andric metadata is only stored for instructions that are of interest, not every instruction in the trace. HTR contains a 15349cc55cSDimitry Andric single instruction layer. 16349cc55cSDimitry Andric 17349cc55cSDimitry Andric **Block Layer:** Composed of blocks - a block in *layer n* refers to a sequence of blocks in *layer n - 1*. A block in 18349cc55cSDimitry Andric *layer 1* refers to a sequence of instructions in *layer 0* (the instruction layer). Metadata is stored for each block in 19349cc55cSDimitry Andric a block layer. HTR contains one or more block layers. 20349cc55cSDimitry Andric 21349cc55cSDimitry Andric**Pass:** A transformation applied to a *layer* that generates a new *layer* that is a more summarized, consolidated representation of the trace data. 22349cc55cSDimitry AndricA pass merges instructions/blocks based on its specific purpose - for example, a pass designed to summarize a processor trace by function calls would merge all the blocks of a function into a single block representing the entire function. 23349cc55cSDimitry Andric 24*5f757f3fSDimitry AndricThe image below illustrates the transformation of a trace's representation (HTR) 25349cc55cSDimitry Andric 26349cc55cSDimitry Andric.. image:: media/htr-example.png 27349cc55cSDimitry Andric 28349cc55cSDimitry Andric 29349cc55cSDimitry AndricPasses 30349cc55cSDimitry Andric------ 31349cc55cSDimitry AndricA *pass* is applied to a *layer* to extract useful information (summarization) and compress the trace representation into a new *layer*. The idea is to have a series of passes where each pass specializes in extracting certain information about the trace. Some examples of potential passes include: identifying functions, identifying loops, or a more general purpose such as identifying long sequences of instructions that are repeated (i.e. Basic Super Block). Below you will find a description of each pass currently implemented in lldb. 32349cc55cSDimitry Andric 33349cc55cSDimitry Andric**Basic Super Block Reduction** 34349cc55cSDimitry Andric 35349cc55cSDimitry AndricA “basic super block” is the longest sequence of blocks that always occur in the same order. (The concept is akin to “Basic Block'' in compiler theory, but refers to dynamic occurrences rather than CFG nodes). 36349cc55cSDimitry Andric 37349cc55cSDimitry AndricThe image below shows the "basic super blocks" of the sequence. Each unique "basic super block" is marked with a different color 38349cc55cSDimitry Andric 39349cc55cSDimitry Andric.. image:: media/basic_super_block_pass.png 40349cc55cSDimitry Andric 41349cc55cSDimitry Andric*Procedure to find all super blocks:* 42349cc55cSDimitry Andric 43349cc55cSDimitry Andric- For each block, compute the number of distinct predecessor and successor blocks. 44349cc55cSDimitry Andric 45349cc55cSDimitry Andric - **Predecessor** - the block that occurs directly before (to the left of) the current block 46349cc55cSDimitry Andric - **Successor** - the block that occurs directly after (to the right of) the current block 47349cc55cSDimitry Andric 48349cc55cSDimitry Andric- A block with more than one distinct successor is always the start of a super block, the super block will continue until the next block with more than one distinct predecessor or successor. 49