@c \input texinfo
@c %**start of header
@c @setfilename agentexpr.info
@c @settitle GDB Agent Expressions
@c @setchapternewpage off
@c %**end of header

@c Revision: $Id: agentexpr.texi,v 1.3 2004/12/27 14:00:51 kettenis Exp $

@node Agent Expressions
@appendix The GDB Agent Expression Mechanism

In some applications, it is not feasible for the debugger to interrupt
the program's execution long enough for the developer to learn anything
helpful about its behavior. If the program's correctness depends on its
real-time behavior, delays introduced by a debugger might cause the
program to fail, even when the code itself is correct. It is useful to
be able to observe the program's behavior without interrupting it.

Using GDB's @code{trace} and @code{collect} commands, the user can
specify locations in the program, and arbitrary expressions to evaluate
when those locations are reached. Later, using the @code{tfind}
command, she can examine the values those expressions had when the
program hit the trace points. The expressions may also denote objects
in memory --- structures or arrays, for example --- whose values GDB
should record; while visiting a particular tracepoint, the user may
inspect those objects as if they were in memory at that moment.
However, because GDB records these values without interacting with the
user, it can do so quickly and unobtrusively, hopefully not disturbing
the program's behavior.

When GDB is debugging a remote target, the GDB @dfn{agent} code running
on the target computes the values of the expressions itself. To avoid
having a full symbolic expression evaluator on the agent, GDB translates
expressions in the source language into a simpler bytecode language, and
then sends the bytecode to the agent; the agent then executes the
bytecode, and records the values for GDB to retrieve later.
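
To make the agent's side of this arrangement concrete, here is a minimal
sketch in C of an evaluation loop for a handful of opcodes. It is an
illustration only, not GDB's implementation: the function name
@code{agent_eval}, the limit @code{AGENT_STACK_MAX}, and the error
convention (returning -1) are assumptions made for this example, and the
stack holds plain 64-bit unsigned integers. Only the opcode values are
taken from the bytecode descriptions later in this appendix.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values taken from the bytecode descriptions in this appendix.  */
enum { OP_ADD = 0x02, OP_MUL = 0x04, OP_CONST8 = 0x22, OP_END = 0x27 };

/* Hypothetical stack depth, chosen only for this sketch.  */
#define AGENT_STACK_MAX 16

/* Evaluate LEN bytes of bytecode at CODE.  On success, store the top of
   the stack in *RESULT and return 0.  Return -1 if evaluation
   terminates with an error.  */
int
agent_eval (const unsigned char *code, size_t len, uint64_t *result)
{
  uint64_t stack[AGENT_STACK_MAX];
  size_t sp = 0;   /* number of values currently on the stack */
  size_t pc = 0;   /* offset of the next bytecode to execute */

  while (pc < len)
    {
      unsigned char op = code[pc++];

      switch (op)
        {
        case OP_CONST8:
          /* The one-byte operand follows the opcode; push it without
             sign extension.  */
          if (pc >= len || sp >= AGENT_STACK_MAX)
            return -1;
          stack[sp++] = code[pc++];
          break;

        case OP_ADD:
        case OP_MUL:
          /* Pop b (top) and a (next-to-top); push the result.  */
          if (sp < 2)
            return -1;
          sp--;
          if (op == OP_ADD)
            stack[sp - 1] += stack[sp];
          else
            stack[sp - 1] *= stack[sp];
          break;

        case OP_END:
          if (sp < 1)
            return -1;
          *result = stack[sp - 1];
          return 0;

        default:
          return -1;   /* opcode not handled by this sketch */
        }
    }
  return -1;   /* ran off the end without reaching the end opcode */
}
```

For instance, the byte sequence 0x22 2, 0x22 3, 0x22 4, 0x04, 0x02, 0x27
(push 2, 3 and 4, multiply, add, end) leaves 14 as the result. A real
agent would also handle memory references, registers, branches and the
trace opcodes, and would report errors to its caller in a more helpful
way.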

The bytecode language is simple; there are forty-odd opcodes, the bulk
of which are the usual vocabulary of C operators (addition, subtraction,
shifts, and so on) and various sizes of literals and memory reference
operations. The bytecode interpreter operates strictly on machine-level
values --- various sizes of integers and floating point numbers --- and
requires no information about types or symbols; thus, the interpreter's
internal data structures are simple, and each bytecode requires only a
few native machine instructions to implement it. The interpreter is
small, and strict limits on the memory and time required to evaluate an
expression are easy to determine, making it suitable for use by the
debugging agent in real-time applications.

@menu
* General Bytecode Design::     Overview of the interpreter.
* Bytecode Descriptions::       What each one does.
* Using Agent Expressions::     How agent expressions fit into the big picture.
* Varying Target Capabilities:: How to discover what the target can do.
* Tracing on Symmetrix::        Special info for implementation on EMC's
                                boxes.
* Rationale::                   Why we did it this way.
@end menu


@c @node Rationale
@c @section Rationale


@node General Bytecode Design
@section General Bytecode Design

The agent represents bytecode expressions as an array of bytes. Each
instruction is one byte long (thus the term @dfn{bytecode}). Some
instructions are followed by operand bytes; for example, the @code{goto}
instruction is followed by a destination for the jump.

The bytecode interpreter is a stack-based machine; most instructions pop
their operands off the stack, perform some operation, and push the
result back on the stack for the next instruction to consume. Each
element of the stack may contain either an integer or a floating point
value; these values are as many bits wide as the largest integer that
can be directly manipulated in the source language.
Stack elements
carry no record of their type; bytecode could push a value as an
integer, then pop it as a floating point value. However, GDB will not
generate code which does this. In C, one might define the type of a
stack element as follows:
@example
union agent_val @{
  LONGEST l;
  DOUBLEST d;
@};
@end example
@noindent
where @code{LONGEST} and @code{DOUBLEST} are @code{typedef} names for
the largest integer and floating point types on the machine.

By the time the bytecode interpreter reaches the end of the expression,
the value of the expression should be the only value left on the stack.
For tracing applications, @code{trace} bytecodes in the expression will
have recorded the necessary data, and the value on the stack may be
discarded. For other applications, like conditional breakpoints, the
value may be useful.

Separate from the stack, the interpreter has two registers:
@table @code
@item pc
The address of the next bytecode to execute.

@item start
The address of the start of the bytecode expression, necessary for
interpreting the @code{goto} and @code{if_goto} instructions.

@end table
@noindent
Neither of these registers is directly visible to the bytecode language
itself, but they are useful for defining the meanings of the bytecode
operations.

There are no instructions to perform side effects on the running
program, or call the program's functions; we assume that these
expressions are only used for unobtrusive debugging, not for patching
the running code.

Most bytecode instructions do not distinguish between the various sizes
of values, and operate on full-width values; the upper bits of the
values are simply ignored, since they do not usually make a difference
to the value computed.
The exceptions to this rule are:
@table @asis

@item memory reference instructions (@code{ref}@var{n})
There are distinct instructions to fetch different word sizes from
memory. Once on the stack, however, the values are treated as full-size
integers. They may need to be sign-extended; the @code{ext} instruction
exists for this purpose.

@item the sign-extension instruction (@code{ext} @var{n})
This instruction clearly needs to know which portion of its operand is
to be extended to occupy the full length of the word.

@end table

If the interpreter is unable to evaluate an expression completely for
some reason (a memory location is inaccessible, or a divisor is zero,
for example), we say that interpretation ``terminates with an error''.
This means that the problem is reported back to the interpreter's caller
in some helpful way. In general, code using agent expressions should
assume that they may attempt to divide by zero, fetch arbitrary memory
locations, and misbehave in other ways.

Even complicated C expressions compile to a few bytecode instructions;
for example, the expression @code{x + y * z} would typically produce
code like the following, assuming that @code{x} and @code{y} live in
registers, and @code{z} is a global variable holding a 32-bit
@code{int}:
@example
reg 1
reg 2
const32 @i{address of z}
ref32
ext 32
mul
add
end
@end example

In detail, these mean:
@table @code

@item reg 1
Push the value of register 1 (presumably holding @code{x}) onto the
stack.

@item reg 2
Push the value of register 2 (holding @code{y}).

@item const32 @i{address of z}
Push the address of @code{z} onto the stack.

@item ref32
Fetch a 32-bit word from the address at the top of the stack; replace
the address on the stack with the value. Thus, we replace the address
of @code{z} with @code{z}'s value.

@item ext 32
Sign-extend the value on the top of the stack from 32 bits to full
length. This is necessary because @code{z} is a signed integer.

@item mul
Pop the top two numbers on the stack, multiply them, and push their
product. Now the top of the stack contains the value of the expression
@code{y * z}.

@item add
Pop the top two numbers, add them, and push the sum. Now the top of the
stack contains the value of @code{x + y * z}.

@item end
Stop executing; the value left on the stack top is the value to be
recorded.

@end table


@node Bytecode Descriptions
@section Bytecode Descriptions

Each bytecode description has the following form:

@table @asis

@item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b}

Pop the top two stack items, @var{a} and @var{b}, as integers; push
their sum, as an integer.

@end table

In this example, @code{add} is the name of the bytecode, and
@code{(0x02)} is the one-byte value used to encode the bytecode, in
hexadecimal. The phrase ``@var{a} @var{b} @result{} @var{a+b}'' shows
the stack before and after the bytecode executes. Beforehand, the stack
must contain at least two values, @var{a} and @var{b}; since the top of
the stack is to the right, @var{b} is on the top of the stack, and
@var{a} is underneath it. After execution, the bytecode will have
popped @var{a} and @var{b} from the stack, and replaced them with a
single value, @var{a+b}. There may be other values on the stack below
those shown, but the bytecode affects only those shown.

Here is another example:

@table @asis

@item @code{const8} (0x22) @var{n}: @result{} @var{n}
Push the 8-bit integer constant @var{n} on the stack, without sign
extension.

@end table

In this example, the bytecode @code{const8} takes an operand @var{n}
directly from the bytecode stream; the operand follows the @code{const8}
bytecode itself. We write any such operands immediately after the name
of the bytecode, before the colon, and describe the exact encoding of
the operand in the bytecode stream in the body of the bytecode
description.

For the @code{const8} bytecode, there are no stack items given before
the @result{}; this simply means that the bytecode consumes no values
from the stack. If a bytecode consumes no values, or produces no
values, the list on either side of the @result{} may be empty.

If a value is written as @var{a}, @var{b}, or @var{n}, then the bytecode
treats it as an integer. If a value is written as @var{addr}, then the
bytecode treats it as an address.

We do not fully describe the floating point operations here; although
this design can be extended in a clean way to handle floating point
values, they are not of immediate interest to the customer, so we avoid
describing them, to save time.


@table @asis

@item @code{float} (0x01): @result{}

Prefix for floating-point bytecodes. Not implemented yet.

@item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b}
Pop two integers from the stack, and push their sum, as an integer.

@item @code{sub} (0x03): @var{a} @var{b} @result{} @var{a-b}
Pop two integers from the stack, subtract the top value from the
next-to-top value, and push the difference.

@item @code{mul} (0x04): @var{a} @var{b} @result{} @var{a*b}
Pop two integers from the stack, multiply them, and push the product on
the stack. Note that, when one multiplies two @var{n}-bit numbers
yielding another @var{n}-bit number, it is irrelevant whether the
numbers are signed or not; the results are the same.

@item @code{div_signed} (0x05): @var{a} @var{b} @result{} @var{a/b}
Pop two signed integers from the stack; divide the next-to-top value by
the top value, and push the quotient. If the divisor is zero, terminate
with an error.

@item @code{div_unsigned} (0x06): @var{a} @var{b} @result{} @var{a/b}
Pop two unsigned integers from the stack; divide the next-to-top value
by the top value, and push the quotient. If the divisor is zero,
terminate with an error.

@item @code{rem_signed} (0x07): @var{a} @var{b} @result{} @var{a modulo b}
Pop two signed integers from the stack; divide the next-to-top value by
the top value, and push the remainder. If the divisor is zero,
terminate with an error.

@item @code{rem_unsigned} (0x08): @var{a} @var{b} @result{} @var{a modulo b}
Pop two unsigned integers from the stack; divide the next-to-top value
by the top value, and push the remainder. If the divisor is zero,
terminate with an error.

@item @code{lsh} (0x09): @var{a} @var{b} @result{} @var{a<<b}
Pop two integers from the stack; let @var{a} be the next-to-top value,
and @var{b} be the top value. Shift @var{a} left by @var{b} bits, and
push the result.

@item @code{rsh_signed} (0x0a): @var{a} @var{b} @result{} @code{(signed)}@var{a>>b}
Pop two integers from the stack; let @var{a} be the next-to-top value,
and @var{b} be the top value. Shift @var{a} right by @var{b} bits,
inserting copies of the top bit at the high end, and push the result.

@item @code{rsh_unsigned} (0x0b): @var{a} @var{b} @result{} @var{a>>b}
Pop two integers from the stack; let @var{a} be the next-to-top value,
and @var{b} be the top value. Shift @var{a} right by @var{b} bits,
inserting zero bits at the high end, and push the result.

@item @code{log_not} (0x0e): @var{a} @result{} @var{!a}
Pop an integer from the stack; if it is zero, push the value one;
otherwise, push the value zero.

@item @code{bit_and} (0x0f): @var{a} @var{b} @result{} @var{a&b}
Pop two integers from the stack, and push their bitwise @code{and}.

@item @code{bit_or} (0x10): @var{a} @var{b} @result{} @var{a|b}
Pop two integers from the stack, and push their bitwise @code{or}.

@item @code{bit_xor} (0x11): @var{a} @var{b} @result{} @var{a^b}
Pop two integers from the stack, and push their bitwise
exclusive-@code{or}.

@item @code{bit_not} (0x12): @var{a} @result{} @var{~a}
Pop an integer from the stack, and push its bitwise complement.

@item @code{equal} (0x13): @var{a} @var{b} @result{} @var{a=b}
Pop two integers from the stack; if they are equal, push the value one;
otherwise, push the value zero.

@item @code{less_signed} (0x14): @var{a} @var{b} @result{} @var{a<b}
Pop two signed integers from the stack; if the next-to-top value is less
than the top value, push the value one; otherwise, push the value zero.

@item @code{less_unsigned} (0x15): @var{a} @var{b} @result{} @var{a<b}
Pop two unsigned integers from the stack; if the next-to-top value is less
than the top value, push the value one; otherwise, push the value zero.

@item @code{ext} (0x16) @var{n}: @var{a} @result{} @var{a}, sign-extended from @var{n} bits
Pop an unsigned value from the stack; treating it as an @var{n}-bit
twos-complement value, extend it to full length. This means that all
bits to the left of bit @var{n-1} (where the least significant bit is bit
0) are set to the value of bit @var{n-1}. Note that @var{n} may be
larger than or equal to the width of the stack elements of the bytecode
engine; in this case, the bytecode should have no effect.

The number of source bits to preserve, @var{n}, is encoded as a single
byte unsigned integer following the @code{ext} bytecode.

@item @code{zero_ext} (0x2a) @var{n}: @var{a} @result{} @var{a}, zero-extended from @var{n} bits
Pop an unsigned value from the stack; zero all but the bottom @var{n}
bits. This means that all bits to the left of bit @var{n-1} (where the
least significant bit is bit 0) are set to zero.

The number of source bits to preserve, @var{n}, is encoded as a single
byte unsigned integer following the @code{zero_ext} bytecode.

@item @code{ref8} (0x17): @var{addr} @result{} @var{a}
@itemx @code{ref16} (0x18): @var{addr} @result{} @var{a}
@itemx @code{ref32} (0x19): @var{addr} @result{} @var{a}
@itemx @code{ref64} (0x1a): @var{addr} @result{} @var{a}
Pop an address @var{addr} from the stack. For bytecode
@code{ref}@var{n}, fetch an @var{n}-bit value from @var{addr}, using the
natural target endianness. Push the fetched value as an unsigned
integer.

Note that @var{addr} may not be aligned in any particular way; the
@code{ref@var{n}} bytecodes should operate correctly for any address.

If attempting to access memory at @var{addr} would cause a processor
exception of some sort, terminate with an error.

@item @code{ref_float} (0x1b): @var{addr} @result{} @var{d}
@itemx @code{ref_double} (0x1c): @var{addr} @result{} @var{d}
@itemx @code{ref_long_double} (0x1d): @var{addr} @result{} @var{d}
@itemx @code{l_to_d} (0x1e): @var{a} @result{} @var{d}
@itemx @code{d_to_l} (0x1f): @var{d} @result{} @var{a}
Not implemented yet.

@item @code{dup} (0x28): @var{a} @result{} @var{a} @var{a}
Push another copy of the stack's top element.

@item @code{swap} (0x2b): @var{a} @var{b} @result{} @var{b} @var{a}
Exchange the top two items on the stack.

@item @code{pop} (0x29): @var{a} @result{}
Discard the top value on the stack.

@item @code{if_goto} (0x20) @var{offset}: @var{a} @result{}
Pop an integer off the stack; if it is non-zero, branch to the given
offset in the bytecode string. Otherwise, continue to the next
instruction in the bytecode stream. In other words, if @var{a} is
non-zero, set the @code{pc} register to @code{start} + @var{offset}.
Thus, an offset of zero denotes the beginning of the expression.

The @var{offset} is stored as a sixteen-bit unsigned value, stored
immediately following the @code{if_goto} bytecode. It is always stored
most significant byte first, regardless of the target's normal
endianness. The offset is not guaranteed to fall at any particular
alignment within the bytecode stream; thus, on machines where fetching a
16-bit value at an unaligned address raises an exception, you should
fetch the offset one byte at a time.

@item @code{goto} (0x21) @var{offset}: @result{}
Branch unconditionally to @var{offset}; in other words, set the
@code{pc} register to @code{start} + @var{offset}.

The offset is stored in the same way as for the @code{if_goto} bytecode.

@item @code{const8} (0x22) @var{n}: @result{} @var{n}
@itemx @code{const16} (0x23) @var{n}: @result{} @var{n}
@itemx @code{const32} (0x24) @var{n}: @result{} @var{n}
@itemx @code{const64} (0x25) @var{n}: @result{} @var{n}
Push the integer constant @var{n} on the stack, without sign extension.
To produce a small negative value, push a small twos-complement value,
and then sign-extend it using the @code{ext} bytecode.

The constant @var{n} is stored in the appropriate number of bytes
following the @code{const}@var{b} bytecode. The constant @var{n} is
always stored most significant byte first, regardless of the target's
normal endianness.
The constant is not guaranteed to fall at any
particular alignment within the bytecode stream; thus, on machines where
fetching a multi-byte value at an unaligned address raises an exception,
you should fetch @var{n} one byte at a time.

@item @code{reg} (0x26) @var{n}: @result{} @var{a}
Push the value of register number @var{n}, without sign extension. The
registers are numbered following GDB's conventions.

The register number @var{n} is encoded as a 16-bit unsigned integer
immediately following the @code{reg} bytecode. It is always stored most
significant byte first, regardless of the target's normal endianness.
The register number is not guaranteed to fall at any particular
alignment within the bytecode stream; thus, on machines where fetching a
16-bit value at an unaligned address raises an exception, you should
fetch the register number one byte at a time.

@item @code{trace} (0x0c): @var{addr} @var{size} @result{}
Record the contents of the @var{size} bytes at @var{addr} in a trace
buffer, for later retrieval by GDB.

@item @code{trace_quick} (0x0d) @var{size}: @var{addr} @result{} @var{addr}
Record the contents of the @var{size} bytes at @var{addr} in a trace
buffer, for later retrieval by GDB. @var{size} is a single byte
unsigned integer following the @code{trace_quick} opcode.

This bytecode is equivalent to the sequence @code{dup const8 @var{size}
trace}, but we provide it anyway to save space in bytecode strings.

@item @code{trace16} (0x30) @var{size}: @var{addr} @result{} @var{addr}
Identical to @code{trace_quick}, except that @var{size} is a 16-bit
big-endian unsigned integer, not a single byte. This should probably
have been named @code{trace_quick16}, for consistency.

@item @code{end} (0x27): @result{}
Stop executing bytecode; the result should be the top element of the
stack.
If the purpose of the expression was to compute an lvalue or a
range of memory, then the next-to-top of the stack is the lvalue's
address, and the top of the stack is the lvalue's size, in bytes.

@end table


@node Using Agent Expressions
@section Using Agent Expressions

Here is a sketch of a full non-stop debugging cycle, showing how agent
expressions fit into the process.

@itemize @bullet

@item
The user selects trace points in the program's code at which GDB should
collect data.

@item
The user specifies expressions to evaluate at each trace point. These
expressions may denote objects in memory, in which case those objects'
contents are recorded as the program runs, or computed values, in which
case the values themselves are recorded.

@item
GDB transmits the tracepoints and their associated expressions to the
GDB agent, running on the debugging target.

@item
The agent arranges to be notified when a trace point is hit. Note that,
on some systems, the target operating system is completely responsible
for collecting the data; see @ref{Tracing on Symmetrix}.

@item
When execution on the target reaches a trace point, the agent evaluates
the expressions associated with that trace point, and records the
resulting values and memory ranges.

@item
Later, when the user selects a given trace event and inspects the
objects and expression values recorded, GDB talks to the agent to
retrieve recorded data as necessary to meet the user's requests. If the
user asks to see an object whose contents have not been recorded, GDB
reports an error.

@end itemize


@node Varying Target Capabilities
@section Varying Target Capabilities

Some targets don't support floating-point, and some would rather not
have to deal with @code{long long} operations.
Also, different targets
will have different stack sizes, and different bytecode buffer lengths.

Thus, GDB needs a way to ask the target about itself. We haven't worked
out the details yet, but in general, GDB should be able to send the
target a packet asking it to describe itself. The reply should be a
packet whose length is explicit, so we can add new information to the
packet in future revisions of the agent, without confusing old versions
of GDB, and it should contain a version number. It should contain at
least the following information:

@itemize @bullet

@item
whether floating point is supported

@item
whether @code{long long} is supported

@item
maximum acceptable size of bytecode stack

@item
maximum acceptable length of bytecode expressions

@item
which registers are actually available for collection

@item
whether the target supports disabled tracepoints

@end itemize



@node Tracing on Symmetrix
@section Tracing on Symmetrix

This section documents the API used by the GDB agent to collect data on
Symmetrix systems.

Cygnus originally implemented these tracing features to help EMC
Corporation debug their Symmetrix high-availability disk drives. The
Symmetrix application code already includes substantial tracing
facilities; the GDB agent for the Symmetrix system uses those facilities
for its own data collection, via the API described here.

@deftypefn Function DTC_RESPONSE adbg_find_memory_in_frame (FRAME_DEF *@var{frame}, char *@var{address}, char **@var{buffer}, unsigned int *@var{size})
Search the trace frame @var{frame} for memory saved from @var{address}.
If the memory is available, provide the address of the buffer holding
it; otherwise, provide the address of the next saved area.

@itemize @bullet

@item
If the memory at @var{address} was saved in @var{frame}, set
@code{*@var{buffer}} to point to the buffer in which that memory was
saved, set @code{*@var{size}} to the number of bytes from @var{address}
that are saved at @code{*@var{buffer}}, and return
@code{OK_TARGET_RESPONSE}. (Clearly, in this case, the function will
always set @code{*@var{size}} to a value greater than zero.)

@item
If @var{frame} does not record any memory at @var{address}, set
@code{*@var{size}} to the distance from @var{address} to the start of
the saved region with the lowest address higher than @var{address}. If
there is no memory saved from any higher address, set @code{*@var{size}}
to zero. Return @code{NOT_FOUND_TARGET_RESPONSE}.
@end itemize

These two possibilities allow the caller to either retrieve the data, or
walk the address space to the next saved area.
@end deftypefn

This function allows the GDB agent to map the regions of memory saved in
a particular frame, and retrieve their contents efficiently.

This function also provides a clean interface between the GDB agent and
the Symmetrix tracing structures, making it easier to adapt the GDB
agent to future versions of the Symmetrix system, and vice versa. This
function searches all data saved in @var{frame}, whether the data is
there at the request of a bytecode expression, or because it falls in
one of the format's memory ranges, or because it was saved from the top
of the stack. EMC can arbitrarily change and enhance the tracing
mechanism, but as long as this function works properly, all collected
memory is visible to GDB.

The function itself is straightforward to implement.
A single pass over
the trace frame's stack area, memory ranges, and expression blocks can
yield the address of the buffer (if the requested address was saved),
and also note the address of the next higher range of memory, to be
returned when the search fails.

As an example, suppose the trace frame @code{f} has saved sixteen bytes
from address @code{0x8000} in a buffer at @code{0x1000}, and thirty-two
bytes from address @code{0xc000} in a buffer at @code{0x1010}. Here are
some sample calls, and the effect each would have:

@table @code

@item adbg_find_memory_in_frame (f, (char *) 0x8000, &buffer, &size)
This would set @code{buffer} to @code{0x1000}, set @code{size} to
sixteen, and return @code{OK_TARGET_RESPONSE}, since @code{f} saves
sixteen bytes from @code{0x8000} at @code{0x1000}.

@item adbg_find_memory_in_frame (f, (char *) 0x8004, &buffer, &size)
This would set @code{buffer} to @code{0x1004}, set @code{size} to
twelve, and return @code{OK_TARGET_RESPONSE}, since @code{f} saves the
twelve bytes from @code{0x8004} starting four bytes into the buffer at
@code{0x1000}. This shows that request addresses may fall in the middle
of saved areas; the function should return the address and size of the
remainder of the buffer.

@item adbg_find_memory_in_frame (f, (char *) 0x8100, &buffer, &size)
This would set @code{size} to @code{0x3f00} and return
@code{NOT_FOUND_TARGET_RESPONSE}, since there is no memory saved in
@code{f} from the address @code{0x8100}, and the next memory available
is at @code{0x8100 + 0x3f00}, or @code{0xc000}. This shows that request
addresses may fall outside of all saved memory ranges; the function
should indicate the next saved area, if any.

@item adbg_find_memory_in_frame (f, (char *) 0x7000, &buffer, &size)
This would set @code{size} to @code{0x1000} and return
@code{NOT_FOUND_TARGET_RESPONSE}, since the next saved memory is at
@code{0x7000 + 0x1000}, or @code{0x8000}.

@item adbg_find_memory_in_frame (f, (char *) 0xf000, &buffer, &size)
This would set @code{size} to zero, and return
@code{NOT_FOUND_TARGET_RESPONSE}. This shows how the function tells the
caller that no further memory ranges have been saved.

@end table

As another example, here is a function which will print out the
addresses of all memory saved in the trace frame @code{frame} on the
Symmetrix INLINES console:
@example
void
print_frame_addresses (FRAME_DEF *frame)
@{
  char *addr;
  char *buffer;
  unsigned int size;   /* matches the unsigned int * parameter */

  addr = 0;
  for (;;)
    @{
      /* Either find out how much memory we have here, or discover
         where the next saved region is. */
      if (adbg_find_memory_in_frame (frame, addr, &buffer, &size)
          == OK_TARGET_RESPONSE)
        printp ("saved %x to %x\n", addr, addr + size);
      if (size == 0)
        break;
      addr += size;
    @}
@}
@end example

Note that there is not necessarily any connection between the order in
which the data is saved in the trace frame, and the order in which
@code{adbg_find_memory_in_frame} will return those memory ranges. The
code above will always print the saved memory regions in order of
increasing address, while the underlying frame structure might store the
data in a random order.

[[This section should cover the rest of the Symmetrix functions the stub
relies upon, too.]]

@node Rationale
@section Rationale

Some of the design decisions apparent above are arguable.

@table @b

@item What about stack overflow/underflow?
GDB should be able to query the target to discover its stack size.
Given that information, GDB can determine at translation time whether a
given expression will overflow the stack. But this spec isn't about
what kinds of error-checking GDB ought to do.

@item Why are you doing everything in LONGEST?

Speed isn't important, but agent code size is; using LONGEST brings in a
bunch of support code to do things like division, etc. So this is a
serious concern.

First, note that you don't need different bytecodes for different
operand sizes. You can generate code without @emph{knowing} how big the
stack elements actually are on the target. If the target only supports
32-bit ints, and you don't send any 64-bit bytecodes, everything just
works. The observation here is that the MIPS and the Alpha have only
fixed-size registers, and you can still get C's semantics even though
most instructions only operate on full-sized words. You just need to
make sure everything is properly sign-extended at the right times. So
there is no need for 32- and 64-bit variants of the bytecodes. Just
implement everything using the largest size you support.

GDB should certainly check to see what sizes the target supports, so the
user can get an error earlier, rather than later. But this information
is not necessary for correctness.


@item Why don't you have @code{>} or @code{<=} operators?
I want to keep the interpreter small, and we don't need them. We can
combine the @code{less_} opcodes with @code{log_not}, and swap the order
of the operands, yielding all four asymmetrical comparison operators.
For example, @code{(x <= y)} is @code{! (x > y)}, which is @code{! (y <
x)}.

@item Why do you have @code{log_not}?
@itemx Why do you have @code{ext}?
@itemx Why do you have @code{zero_ext}?
These are all easily synthesized from other instructions, but I expect
them to be used frequently, and they're simple, so I include them to
keep bytecode strings short.

@code{log_not} is equivalent to @code{const8 0 equal}; it's used in half
the relational operators.

@code{ext @var{n}} is equivalent to @code{const8 @var{s-n} lsh const8
@var{s-n} rsh_signed}, where @var{s} is the size of the stack elements;
it follows @code{ref@var{m}} and @code{reg} bytecodes when the value
should be signed. See the next item.

@code{zero_ext @var{n}} is equivalent to @code{const@var{m} @var{mask}
log_and}; it's used whenever we push the value of a register, because we
can't assume the upper bits of the register aren't garbage.

@item Why not have sign-extending variants of the @code{ref} operators?
Because that would double the number of @code{ref} operators, and we
need the @code{ext} bytecode anyway for accessing bitfields.

@item Why not have constant-address variants of the @code{ref} operators?
Because that would double the number of @code{ref} operators again, and
@code{const32 @var{address} ref32} is only one byte longer.

@item Why do the @code{ref@var{n}} operators have to support unaligned fetches?
GDB will generate bytecode that fetches multi-byte values at unaligned
addresses whenever the executable's debugging information tells it to.
Furthermore, GDB does not know the value the pointer will have when GDB
generates the bytecode, so it cannot determine whether a particular
fetch will be aligned or not.

In particular, structure bitfields may be several bytes long, but follow
no alignment rules; members of packed structures are not necessarily
aligned either.

In general, there are many cases where unaligned references occur in
correct C code, either at the programmer's explicit request, or at the
compiler's discretion.
Thus, it is simpler to make the GDB agent
bytecodes work correctly in all circumstances than to make GDB guess in
each case whether the compiler did the usual thing.

@item Why are there no side-effecting operators?
Because our current client doesn't want them? That's a cheap answer. I
think the real answer is that I'm afraid of implementing function
calls. We should revisit this issue after the present contract is
delivered.

@item Why aren't the @code{goto} ops PC-relative?
The interpreter has the base address around anyway for PC bounds
checking, and it seemed simpler.

@item Why is there only one offset size for the @code{goto} ops?
Offsets are currently sixteen bits. I'm not happy with this situation
either:

Suppose we have multiple branch ops with different offset sizes. As I
generate code left-to-right, all my jumps are forward jumps (there are
no loops in expressions), so I never know the target when I emit the
jump opcode. Thus, I have to either always assume the largest offset
size, or do jump relaxation on the code after I generate it, which seems
like a big waste of time.

I can imagine a reasonable expression being longer than 256 bytes. I
can't imagine one being longer than 64k. Thus, we need 16-bit offsets.
This kind of reasoning is so bogus, but relaxation is pathetic.

The other approach would be to generate code right-to-left. Then I'd
always know my offset size. That might be fun.

@item Where is the function call bytecode?

When we add side-effects, we should add this.

@item Why does the @code{reg} bytecode take a 16-bit register number?

Intel's IA-64 architecture has 128 general-purpose registers,
and 128 floating-point registers, and I'm sure it has some random
control registers. That is more than an 8-bit register number can name.

@item Why do we need @code{trace} and @code{trace_quick}?
Because GDB needs to record all the memory contents and registers an
expression touches. If the user wants to evaluate an expression
@code{x->y->z}, the agent must record the values of @code{x} and
@code{x->y} as well as the value of @code{x->y->z}.

@item Don't the @code{trace} bytecodes make the interpreter less general?
They do mean that the interpreter contains special-purpose code, but
that doesn't mean the interpreter can only be used for that purpose. If
an expression doesn't use the @code{trace} bytecodes, they don't get in
its way.

@item Why doesn't @code{trace_quick} consume its arguments the way everything else does?
In general, you do want your operators to consume their arguments; it's
consistent, and generally reduces the amount of stack rearrangement
necessary. However, @code{trace_quick} is a kludge to save space; it
only exists so we needn't write @code{dup const8 @var{SIZE} trace}
before every memory reference. Therefore, it's okay for it not to
consume its arguments; it's meant for a specific context in which we
know exactly what it should do with the stack. If we're going to have a
kludge, it should be an effective kludge.

@item Why does @code{trace16} exist?
That opcode was added by the customer who contracted Cygnus for the
data tracing work. I personally think it is unnecessary; objects that
large will be quite rare, so it is okay to use @code{dup const16
@var{size} trace} in those cases.

Whatever we decide to do with @code{trace16}, we should at least leave
opcode 0x30 reserved, to remain compatible with the customer who added
it.

@end table

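As a sketch of the operand-swapping rule given above for the comparison
operators, the four asymmetrical signed comparisons could be encoded as
follows. This is an illustration rather than output from GDB: it assumes
the signed opcode is named @code{less_signed} and that it pushes 1 when
the next-to-top stack element is less than the top element, and
@var{x} and @var{y} stand for whatever bytecode pushes each operand.

@example
x < y     @var{x} @var{y} less_signed
x > y     @var{y} @var{x} less_signed
x <= y    @var{y} @var{x} less_signed log_not
x >= y    @var{x} @var{y} less_signed log_not
@end example

Because expressions have no side effects, pushing @var{y} before
@var{x} should not change the value computed.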