1Python Scripting 2================ 3 4LLDB has been structured from the beginning to be scriptable in two 5ways -- a Unix Python session can initiate/run a debug session 6non-interactively using LLDB; and within the LLDB debugger tool, Python 7scripts can be used to help with many tasks, including inspecting 8program data, iterating over containers and determining if a breakpoint 9should stop execution or continue. This document will show how to do 10some of these things by going through an example, explaining how to use 11Python scripting to find a bug in a program that searches for text in a 12large binary tree. 13 14.. contents:: 15 :local: 16 17The Test Program and Input 18-------------------------- 19 20We have a simple C program (dictionary.c) that reads in a text file, 21and stores all the words from the file in a Binary Search Tree, sorted 22alphabetically. It then enters a loop prompting the user for a word, 23searching for the word in the tree (using Binary Search), and reporting 24to the user whether or not it found the word in the tree. 25 26The input text file we are using to test our program contains the text 27for William Shakespeare's famous tragedy "Romeo and Juliet". 28 29The Bug 30------- 31 32When we try running our program, we find there is a problem. While it 33successfully finds some of the words we would expect to find, such as 34"love" or "sun", it fails to find the word "Romeo", which MUST be in 35the input text file: 36 37:: 38 39 $ ./dictionary Romeo-and-Juliet.txt 40 Dictionary loaded. 41 Enter search word: love 42 Yes! 43 Enter search word: sun 44 Yes! 45 Enter search word: Romeo 46 No! 47 Enter search word: ^D 48 $ 49 50Using Depth First Search 51------------------------ 52 53Our first job is to determine if the word "Romeo" actually got inserted 54into the tree or not. Since "Romeo and Juliet" has thousands of words, 55trying to examine our binary search tree by hand is completely 56impractical. Therefore we will write a Python script to search the tree 57for us. We will write a recursive Depth First Search function that 58traverses the entire tree searching for a word, and maintaining 59information about the path from the root of the tree to the current 60node. If it finds the word in the tree, it returns the path from the 61root to the node containing the word. This is what our DFS function in 62Python would look like, with line numbers added for easy reference in 63later explanations: 64 65:: 66 67 1: def DFS (root, word, cur_path): 68 2: root_word_ptr = root.GetChildMemberWithName ("word") 69 3: left_child_ptr = root.GetChildMemberWithName ("left") 70 4: right_child_ptr = root.GetChildMemberWithName ("right") 71 5: root_word = root_word_ptr.GetSummary() 72 6: end = len (root_word) - 1 73 7: if root_word[0] == '"' and root_word[end] == '"': 74 8: root_word = root_word[1:end] 75 9: end = len (root_word) - 1 76 10: if root_word[0] == '\'' and root_word[end] == '\'': 77 11: root_word = root_word[1:end] 78 12: if root_word == word: 79 13: return cur_path 80 14: elif word < root_word: 81 15: if left_child_ptr.GetValue() == None: 82 16: return "" 83 17: else: 84 18: cur_path = cur_path + "L" 85 19: return DFS (left_child_ptr, word, cur_path) 86 20: else: 87 21: if right_child_ptr.GetValue() == None: 88 22: return "" 89 23: else: 90 24: cur_path = cur_path + "R" 91 25: return DFS (right_child_ptr, word, cur_path) 92 93 94Accessing & Manipulating Program Variables 95------------------------------------------ 96 97Before we can call any Python function on any of our program's 98variables, we need to get the variable into a form that Python can 99access. To show you how to do this we will look at the parameters for 100the DFS function. The first parameter is going to be a node in our 101binary search tree, put into a Python variable. The second parameter is 102the word we are searching for (a string), and the third parameter is a 103string representing the path from the root of the tree to our current 104node. 105 106The most interesting parameter is the first one, the Python variable 107that needs to contain a node in our search tree. How can we take a 108variable out of our program and put it into a Python variable? What 109kind of Python variable will it be? The answers are to use the LLDB API 110functions, provided as part of the LLDB Python module. Running Python 111from inside LLDB, LLDB will automatically give us our current frame 112object as a Python variable, "lldb.frame". This variable has the type 113`SBFrame` (see the LLDB API for more information about `SBFrame` 114objects). One of the things we can do with a frame object, is to ask it 115to find and return its local variable. We will call the API function 116`SBFrame.FindVariable` on the lldb.frame object to give us our dictionary 117variable as a Python variable: 118 119:: 120 121 root = lldb.frame.FindVariable ("dictionary") 122 123The line above, executed in the Python script interpreter in LLDB, asks the 124current frame to find the variable named "dictionary" and return it. We then 125store the returned value in the Python variable named "root". This answers the 126question of HOW to get the variable, but it still doesn't explain WHAT actually 127gets put into "root". If you examine the LLDB API, you will find that the 128`SBFrame` method "FindVariable" returns an object of type `SBValue`. `SBValue` 129objects are used, among other things, to wrap up program variables and values. 130There are many useful methods defined in the `SBValue` class to allow you to get 131information or children values out of SBValues. For complete information, see 132the header file SBValue.h. The `SBValue` methods that we use in our DFS function 133are ``GetChildMemberWithName()``, ``GetSummary()``, and ``GetValue()``. 134 135 136Explaining DFS Script in Detail 137------------------------------- 138 139Before diving into the details of this code, it would be best to give a 140high-level overview of what it does. The nodes in our binary search tree were 141defined to have type ``tree_node *``, which is defined as: 142 143:: 144 145 typedef struct tree_node 146 { 147 const char *word; 148 struct tree_node *left; 149 struct tree_node *right; 150 } tree_node; 151 152Lines 2-11 of DFS are getting data out of the current tree node and getting 153ready to do the actual search; lines 12-25 are the actual depth-first search. 154Lines 2-4 of our DFS function get the word, left and right fields out of the 155current node and store them in Python variables. Since root_word_ptr is a 156pointer to our word, and we want the actual word, line 5 calls GetSummary() to 157get a string containing the value out of the pointer. Since GetSummary() adds 158quotes around its result, lines 6-11 strip surrounding quotes off the word. 159 160Line 12 checks to see if the word in the current node is the one we are 161searching for. If so, we are done, and line 13 returns the current path. 162Otherwise, line 14 checks to see if we should go left (search word comes before 163the current word). If we decide to go left, line 15 checks to see if the left 164pointer child is NULL ("None" is the Python equivalent of NULL). If the left 165pointer is NULL, then the word is not in this tree and we return an empty path 166(line 16). Otherwise, we add an "L" to the end of our current path string, to 167indicate we are going left (line 18), and then recurse on the left child (line 16819). Lines 20-25 are the same as lines 14-19, except for going right rather 169than going left. 170 171One other note: Typing something as long as our DFS function directly into the 172interpreter can be difficult, as making a single typing mistake means having to 173start all over. Therefore we recommend doing as we have done: Writing your 174longer, more complicated script functions in a separate file (in this case 175tree_utils.py) and then importing it into your LLDB Python interpreter. 176 177 178The DFS Script in Action 179------------------------ 180 181At this point we are ready to use the DFS function to see if the word "Romeo" 182is in our tree or not. To actually use it in LLDB on our dictionary program, 183you would do something like this: 184 185:: 186 187 $ lldb 188 (lldb) process attach -n "dictionary" 189 Architecture set to: x86_64. 190 Process 521 stopped 191 * thread #1: tid = 0x2c03, 0x00007fff86c8bea0 libSystem.B.dylib`read$NOCANCEL + 8, stop reason = signal SIGSTOP 192 frame #0: 0x00007fff86c8bea0 libSystem.B.dylib`read$NOCANCEL + 8 193 (lldb) breakpoint set -n find_word 194 Breakpoint created: 1: name = 'find_word', locations = 1, resolved = 1 195 (lldb) continue 196 Process 521 resuming 197 Process 521 stopped 198 * thread #1: tid = 0x2c03, 0x0000000100001830 dictionary`find_word + 16 199 at dictionary.c:105, stop reason = breakpoint 1.1 200 frame #0: 0x0000000100001830 dictionary`find_word + 16 at dictionary.c:105 201 102 int 202 103 find_word (tree_node *dictionary, char *word) 203 104 { 204 -> 105 if (!word || !dictionary) 205 106 return 0; 206 107 207 108 int compare_value = strcmp (word, dictionary->word); 208 (lldb) script 209 Python Interactive Interpreter. To exit, type 'quit()', 'exit()' or Ctrl-D. 210 >>> import tree_utils 211 >>> root = lldb.frame.FindVariable ("dictionary") 212 >>> current_path = "" 213 >>> path = tree_utils.DFS (root, "Romeo", current_path) 214 >>> print path 215 LLRRL 216 >>> ^D 217 (lldb) 218 219The first bit of code above shows starting lldb, attaching to the dictionary 220program, and getting to the find_word function in LLDB. The interesting part 221(as far as this example is concerned) begins when we enter the script command 222and drop into the embedded interactive Python interpreter. We will go over this 223Python code line by line. The first line 224 225:: 226 227 import tree_utils 228 229 230imports the file where we wrote our DFS function, tree_utils.py, into Python. 231Notice that to import the file we leave off the ".py" extension. We can now 232call any function in that file, giving it the prefix "tree_utils.", so that 233Python knows where to look for the function. The line 234 235:: 236 237 root = lldb.frame.FindVariable ("dictionary") 238 239 240gets our program variable "dictionary" (which contains the binary search tree) 241and puts it into the Python variable "root". See Accessing & Manipulating 242Program Variables in Python above for more details about how this works. The 243next line is 244 245:: 246 247 current_path = "" 248 249This line initializes the current_path from the root of the tree to our current 250node. Since we are starting at the root of the tree, our current path starts as 251an empty string. As we go right and left through the tree, the DFS function 252will append an 'R' or an 'L' to the current path, as appropriate. The line 253 254:: 255 256 path = tree_utils.DFS (root, "Romeo", current_path) 257 258calls our DFS function (prefixing it with the module name so that Python can 259find it). We pass in our binary tree stored in the variable root, the word we 260are searching for, and our current path. We assign whatever path the DFS 261function returns to the Python variable path. 262 263Finally, we want to see if the word was found or not, and if so we want to see 264the path through the tree to the word. So we do 265 266:: 267 268 print path 269 270From this we can see that the word "Romeo" was indeed found in the tree, and 271the path from the root of the tree to the node containing "Romeo" is 272left-left-right-right-left. 273 274Using Breakpoint Command Scripts 275-------------------------------- 276 277We are halfway to figuring out what the problem is. We know the word we are 278looking for is in the binary tree, and we know exactly where it is in the 279binary tree. Now we need to figure out why our binary search algorithm is not 280finding the word. We will do this using breakpoint command scripts. 281 282The idea is as follows. The binary search algorithm has two main decision 283points: the decision to follow the right branch; and, the decision to follow 284the left branch. We will set a breakpoint at each of these decision points, and 285attach a Python breakpoint command script to each breakpoint. The breakpoint 286commands will use the global path Python variable that we got from our DFS 287function. Each time one of these decision breakpoints is hit, the script will 288compare the actual decision with the decision the front of the path variable 289says should be made (the first character of the path). If the actual decision 290and the path agree, then the front character is stripped off the path, and 291execution is resumed. In this case the user never even sees the breakpoint 292being hit. But if the decision differs from what the path says it should be, 293then the script prints out a message and does NOT resume execution, leaving the 294user sitting at the first point where a wrong decision is being made. 295 296Python Breakpoint Command Scripts Are Not What They Seem 297-------------------------------------------------------- 298 299What do we mean by that? When you enter a Python breakpoint command in LLDB, it 300appears that you are entering one or more plain lines of Python. BUT LLDB then 301takes what you entered and wraps it into a Python FUNCTION (just like using the 302"def" Python command). It automatically gives the function an obscure, unique, 303hard-to-stumble-across function name, and gives it two parameters: frame and 304bp_loc. When the breakpoint gets hit, LLDB wraps up the frame object where the 305breakpoint was hit, and the breakpoint location object for the breakpoint that 306was hit, and puts them into Python variables for you. It then calls the Python 307function that was created for the breakpoint command, and passes in the frame 308and breakpoint location objects. 309 310So, being practical, what does this mean for you when you write your Python 311breakpoint commands? It means that there are two things you need to keep in 312mind: 1. If you want to access any Python variables created outside your 313script, you must declare such variables to be global. If you do not declare 314them as global, then the Python function will treat them as local variables, 315and you will get unexpected behavior. 2. All Python breakpoint command scripts 316automatically have a frame and a bp_loc variable. The variables are pre-loaded 317by LLDB with the correct context for the breakpoint. You do not have to use 318these variables, but they are there if you want them. 319 320The Decision Point Breakpoint Commands 321-------------------------------------- 322 323This is what the Python breakpoint command script would look like for the 324decision to go right: 325 326:: 327 328 global path 329 if path[0] == 'R': 330 path = path[1:] 331 thread = frame.GetThread() 332 process = thread.GetProcess() 333 process.Continue() 334 else: 335 print "Here is the problem; going right, should go left!" 336 Just as a reminder, LLDB is going to take this script and wrap it up in a function, like this: 337 338 339 def some_unique_and_obscure_function_name (frame, bp_loc): 340 global path 341 if path[0] == 'R': 342 path = path[1:] 343 thread = frame.GetThread() 344 process = thread.GetProcess() 345 process.Continue() 346 else: 347 print "Here is the problem; going right, should go left!" 348 349LLDB will call the function, passing in the correct frame and breakpoint 350location whenever the breakpoint gets hit. There are several things to notice 351about this function. The first one is that we are accessing and updating a 352piece of state (the path variable), and actually conditioning our behavior 353based upon this variable. Since the variable was defined outside of our script 354(and therefore outside of the corresponding function) we need to tell Python 355that we are accessing a global variable. That is what the first line of the 356script does. Next we check where the path says we should go and compare it to 357our decision (recall that we are at the breakpoint for the decision to go 358right). If the path agrees with our decision, then we strip the first character 359off of the path. 360 361Since the decision matched the path, we want to resume execution. To do this we 362make use of the frame parameter that LLDB guarantees will be there for us. We 363use LLDB API functions to get the current thread from the current frame, and 364then to get the process from the thread. Once we have the process, we tell it 365to resume execution (using the Continue() API function). 366 367If the decision to go right does not agree with the path, then we do not resume 368execution. We allow the breakpoint to remain stopped (by doing nothing), and we 369print an informational message telling the user we have found the problem, and 370what the problem is. 371 372Actually Using The Breakpoint Commands 373-------------------------------------- 374 375Now we will look at what happens when we actually use these breakpoint commands 376on our program. Doing a source list -n find_word shows us the function 377containing our two decision points. Looking at the code below, we see that we 378want to set our breakpoints on lines 113 and 115: 379 380:: 381 382 (lldb) source list -n find_word 383 File: /Volumes/Data/HD2/carolinetice/Desktop/LLDB-Web-Examples/dictionary.c. 384 101 385 102 int 386 103 find_word (tree_node *dictionary, char *word) 387 104 { 388 105 if (!word || !dictionary) 389 106 return 0; 390 107 391 108 int compare_value = strcmp (word, dictionary->word); 392 109 393 110 if (compare_value == 0) 394 111 return 1; 395 112 else if (compare_value < 0) 396 113 return find_word (dictionary->left, word); 397 114 else 398 115 return find_word (dictionary->right, word); 399 116 } 400 117 401 402 403So, we set our breakpoints, enter our breakpoint command scripts, and see what happens: 404 405:: 406 407 (lldb) breakpoint set -l 113 408 Breakpoint created: 2: file ='dictionary.c', line = 113, locations = 1, resolved = 1 409 (lldb) breakpoint set -l 115 410 Breakpoint created: 3: file ='dictionary.c', line = 115, locations = 1, resolved = 1 411 (lldb) breakpoint command add -s python 2 412 Enter your Python command(s). Type 'DONE' to end. 413 > global path 414 > if (path[0] == 'L'): 415 > path = path[1:] 416 > thread = frame.GetThread() 417 > process = thread.GetProcess() 418 > process.Continue() 419 > else: 420 > print "Here is the problem. Going left, should go right!" 421 > DONE 422 (lldb) breakpoint command add -s python 3 423 Enter your Python command(s). Type 'DONE' to end. 424 > global path 425 > if (path[0] == 'R'): 426 > path = path[1:] 427 > thread = frame.GetThread() 428 > process = thread.GetProcess() 429 > process.Continue() 430 > else: 431 > print "Here is the problem. Going right, should go left!" 432 > DONE 433 (lldb) continue 434 Process 696 resuming 435 Here is the problem. Going right, should go left! 436 Process 696 stopped 437 * thread #1: tid = 0x2d03, 0x000000010000189f dictionary`find_word + 127 at dictionary.c:115, stop reason = breakpoint 3.1 438 frame #0: 0x000000010000189f dictionary`find_word + 127 at dictionary.c:115 439 112 else if (compare_value < 0) 440 113 return find_word (dictionary->left, word); 441 114 else 442 -> 115 return find_word (dictionary->right, word); 443 116 } 444 117 445 118 void 446 (lldb) 447 448 449After setting our breakpoints, adding our breakpoint commands and continuing, 450we run for a little bit and then hit one of our breakpoints, printing out the 451error message from the breakpoint command. Apparently at this point in the 452tree, our search algorithm decided to go right, but our path says the node we 453want is to the left. Examining the word at the node where we stopped, and our 454search word, we see: 455 456:: 457 458 (lldb) expr dictionary->word 459 (const char *) $1 = 0x0000000100100080 "dramatis" 460 (lldb) expr word 461 (char *) $2 = 0x00007fff5fbff108 "romeo" 462 463So the word at our current node is "dramatis", and the word we are searching 464for is "romeo". "romeo" comes after "dramatis" alphabetically, so it seems like 465going right would be the correct decision. Let's ask Python what it thinks the 466path from the current node to our word is: 467 468:: 469 470 (lldb) script print path 471 LLRRL 472 473According to Python we need to go left-left-right-right-left from our current 474node to find the word we are looking for. Let's double check our tree, and see 475what word it has at that node: 476 477:: 478 479 (lldb) expr dictionary->left->left->right->right->left->word 480 (const char *) $4 = 0x0000000100100880 "Romeo" 481 482So the word we are searching for is "romeo" and the word at our DFS location is 483"Romeo". Aha! One is uppercase and the other is lowercase: We seem to have a 484case conversion problem somewhere in our program (we do). 485 486This is the end of our example on how you might use Python scripting in LLDB to 487help you find bugs in your program. 488 489Source Files for The Example 490---------------------------- 491 492The complete code for the Dictionary program (with case-conversion bug), the 493DFS function and other Python script examples (tree_utils.py) used for this 494example are available below. 495 496tree_utils.py - Example Python functions using LLDB's API, including DFS 497 498:: 499 500 """ 501 # ===-- tree_utils.py ---------------------------------------*- Python -*-===// 502 # 503 # Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. 504 # See https://llvm.org/LICENSE.txt for license information. 505 # SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception 506 # 507 # ===----------------------------------------------------------------------===// 508 509 tree_utils.py - A set of functions for examining binary 510 search trees, based on the example search tree defined in 511 dictionary.c. These functions contain calls to LLDB API 512 functions, and assume that the LLDB Python module has been 513 imported. 514 515 For a thorough explanation of how the DFS function works, and 516 for more information about dictionary.c go to 517 http://lldb.llvm.org/scripting.html 518 """ 519 520 521 def DFS(root, word, cur_path): 522 """ 523 Recursively traverse a binary search tree containing 524 words sorted alphabetically, searching for a particular 525 word in the tree. Also maintains a string representing 526 the path from the root of the tree to the current node. 527 If the word is found in the tree, return the path string. 528 Otherwise return an empty string. 529 530 This function assumes the binary search tree is 531 the one defined in dictionary.c It uses LLDB API 532 functions to examine and traverse the tree nodes. 533 """ 534 535 # Get pointer field values out of node 'root' 536 537 root_word_ptr = root.GetChildMemberWithName("word") 538 left_child_ptr = root.GetChildMemberWithName("left") 539 right_child_ptr = root.GetChildMemberWithName("right") 540 541 # Get the word out of the word pointer and strip off 542 # surrounding quotes (added by call to GetSummary). 543 544 root_word = root_word_ptr.GetSummary() 545 end = len(root_word) - 1 546 if root_word[0] == '"' and root_word[end] == '"': 547 root_word = root_word[1:end] 548 end = len(root_word) - 1 549 if root_word[0] == '\'' and root_word[end] == '\'': 550 root_word = root_word[1:end] 551 552 # Main depth first search 553 554 if root_word == word: 555 return cur_path 556 elif word < root_word: 557 558 # Check to see if left child is NULL 559 560 if left_child_ptr.GetValue() is None: 561 return "" 562 else: 563 cur_path = cur_path + "L" 564 return DFS(left_child_ptr, word, cur_path) 565 else: 566 567 # Check to see if right child is NULL 568 569 if right_child_ptr.GetValue() is None: 570 return "" 571 else: 572 cur_path = cur_path + "R" 573 return DFS(right_child_ptr, word, cur_path) 574 575 576 def tree_size(root): 577 """ 578 Recursively traverse a binary search tree, counting 579 the nodes in the tree. Returns the final count. 580 581 This function assumes the binary search tree is 582 the one defined in dictionary.c It uses LLDB API 583 functions to examine and traverse the tree nodes. 584 """ 585 if (root.GetValue is None): 586 return 0 587 588 if (int(root.GetValue(), 16) == 0): 589 return 0 590 591 left_size = tree_size(root.GetChildAtIndex(1)) 592 right_size = tree_size(root.GetChildAtIndex(2)) 593 594 total_size = left_size + right_size + 1 595 return total_size 596 597 598 def print_tree(root): 599 """ 600 Recursively traverse a binary search tree, printing out 601 the words at the nodes in alphabetical order (the 602 search order for the binary tree). 603 604 This function assumes the binary search tree is 605 the one defined in dictionary.c It uses LLDB API 606 functions to examine and traverse the tree nodes. 607 """ 608 if (root.GetChildAtIndex(1).GetValue() is not None) and ( 609 int(root.GetChildAtIndex(1).GetValue(), 16) != 0): 610 print_tree(root.GetChildAtIndex(1)) 611 612 print root.GetChildAtIndex(0).GetSummary() 613 614 if (root.GetChildAtIndex(2).GetValue() is not None) and ( 615 int(root.GetChildAtIndex(2).GetValue(), 16) != 0): 616 print_tree(root.GetChildAtIndex(2)) 617 618 619dictionary.c - Sample dictionary program, with bug 620 621:: 622 623 //===-- dictionary.c ---------------------------------------------*- C -*-===// 624 // 625 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. 626 // See https://llvm.org/LICENSE.txt for license information. 627 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception 628 // 629 //===----------------------------------------------------------------------===// 630 #include <ctype.h> 631 #include <stdio.h> 632 #include <stdlib.h> 633 #include <string.h> 634 635 typedef struct tree_node { 636 const char *word; 637 struct tree_node *left; 638 struct tree_node *right; 639 } tree_node; 640 641 /* Given a char*, returns a substring that starts at the first 642 alphabet character and ends at the last alphabet character, i.e. it 643 strips off beginning or ending quotes, punctuation, etc. */ 644 645 char *strip(char **word) { 646 char *start = *word; 647 int len = strlen(start); 648 char *end = start + len - 1; 649 650 while ((start < end) && (!isalpha(start[0]))) 651 start++; 652 653 while ((end > start) && (!isalpha(end[0]))) 654 end--; 655 656 if (start > end) 657 return NULL; 658 659 end[1] = '\0'; 660 *word = start; 661 662 return start; 663 } 664 665 /* Given a binary search tree (sorted alphabetically by the word at 666 each node), and a new word, inserts the word at the appropriate 667 place in the tree. */ 668 669 void insert(tree_node *root, char *word) { 670 if (root == NULL) 671 return; 672 673 int compare_value = strcmp(word, root->word); 674 675 if (compare_value == 0) 676 return; 677 678 if (compare_value < 0) { 679 if (root->left != NULL) 680 insert(root->left, word); 681 else { 682 tree_node *new_node = (tree_node *)malloc(sizeof(tree_node)); 683 new_node->word = strdup(word); 684 new_node->left = NULL; 685 new_node->right = NULL; 686 root->left = new_node; 687 } 688 } else { 689 if (root->right != NULL) 690 insert(root->right, word); 691 else { 692 tree_node *new_node = (tree_node *)malloc(sizeof(tree_node)); 693 new_node->word = strdup(word); 694 new_node->left = NULL; 695 new_node->right = NULL; 696 root->right = new_node; 697 } 698 } 699 } 700 701 /* Read in a text file and storea all the words from the file in a 702 binary search tree. */ 703 704 void populate_dictionary(tree_node **dictionary, char *filename) { 705 FILE *in_file; 706 char word[1024]; 707 708 in_file = fopen(filename, "r"); 709 if (in_file) { 710 while (fscanf(in_file, "%s", word) == 1) { 711 char *new_word = (strdup(word)); 712 new_word = strip(&new_word); 713 if (*dictionary == NULL) { 714 tree_node *new_node = (tree_node *)malloc(sizeof(tree_node)); 715 new_node->word = new_word; 716 new_node->left = NULL; 717 new_node->right = NULL; 718 *dictionary = new_node; 719 } else 720 insert(*dictionary, new_word); 721 } 722 } 723 } 724 725 /* Given a binary search tree and a word, search for the word 726 in the binary search tree. */ 727 728 int find_word(tree_node *dictionary, char *word) { 729 if (!word || !dictionary) 730 return 0; 731 732 int compare_value = strcmp(word, dictionary->word); 733 734 if (compare_value == 0) 735 return 1; 736 else if (compare_value < 0) 737 return find_word(dictionary->left, word); 738 else 739 return find_word(dictionary->right, word); 740 } 741 742 /* Print out the words in the binary search tree, in sorted order. */ 743 744 void print_tree(tree_node *dictionary) { 745 if (!dictionary) 746 return; 747 748 if (dictionary->left) 749 print_tree(dictionary->left); 750 751 printf("%s\n", dictionary->word); 752 753 if (dictionary->right) 754 print_tree(dictionary->right); 755 } 756 757 int main(int argc, char **argv) { 758 tree_node *dictionary = NULL; 759 char buffer[1024]; 760 char *filename; 761 int done = 0; 762 763 if (argc == 2) 764 filename = argv[1]; 765 766 if (!filename) 767 return -1; 768 769 populate_dictionary(&dictionary, filename); 770 fprintf(stdout, "Dictionary loaded.\nEnter search word: "); 771 while (!done && fgets(buffer, sizeof(buffer), stdin)) { 772 char *word = buffer; 773 int len = strlen(word); 774 int i; 775 776 for (i = 0; i < len; ++i) 777 word[i] = tolower(word[i]); 778 779 if ((len > 0) && (word[len - 1] == '\n')) { 780 word[len - 1] = '\0'; 781 len = len - 1; 782 } 783 784 if (find_word(dictionary, word)) 785 fprintf(stdout, "Yes!\n"); 786 else 787 fprintf(stdout, "No!\n"); 788 789 fprintf(stdout, "Enter search word: "); 790 } 791 792 fprintf(stdout, "\n"); 793 return 0; 794 } 795 796 797The text for "Romeo and Juliet" can be obtained from the Gutenberg Project 798(http://www.gutenberg.org). 799 800