Debugging
=========

The link-grammar library has API calls to ease debugging and development.
The `link-parser` program has corresponding options for this API.

Only the `link-parser` options will be discussed here.
Options to `link-parser` at the command-line are preceded with a `-` sign.
You can use a unique prefix of an option name instead of its full name. At
the **linkparser>** prompt or in batch files, options are preceded with
a `!` character.

For info on common options, see the "Special ! options" of the `link-grammar`
manual. For a general help message use `link-parser -help`.


Debug options
-------------

### 1) -verbosity=N (-v=N)
Sets the verbosity level of the library to N (a small non-negative integer).

#### Verbosity levels
0: Certain informative messages are not printed by the
library. `link-parser` also doesn't print its usual **linkparser>**
prompt. This is the current default verbosity level for the Python
binding.

1: This is the library default. This is also the default for
`link-parser`.

2: Display the time taken by each parsing step. If the library issues an
error or warning, this may help find the step at which it happened.

3: Display some more Info messages:
- Freeing dictionaries.
- Number of insane-morphism linkages.
- A warning when all the linkages have a PP violation.

4: Display data file search and locale setup. It can be used to debug
problems with the locale setup or in finding the dictionary.

5-9: Show trace and debug messages regarding sentence handling. Higher
levels include the messages of the lower ones.

10-99: Show also trace and debug messages regarding reading the
dictionary. As with levels greater than 4, higher levels include the
messages of the lower ones.

* 10: Basic dictionary debug.

100-...: Show only messages exactly at the specified level.
* 101: Print all the connectors, along with their length limit.
  A length limit of 0 means the value of the `short_length` option is
  used.

* 102: Print all disjuncts before and after pruning.

* 103: Show unsubscripted dictionary words and subscripted ones which share
  the same base word.

* 104: Memory pool statistics.

### 2) -debug=LOCATIONS (-de=LOCATIONS)
Show only messages from these LOCATIONS. The LOCATIONS string is a
comma-separated list of source file names (without specifying their
directory) and function names (fully qualified for C++) from which to
show the messages.

For example, to only show messages from the `flatten_wordgraph()` function
or the print.c file:

`link-parser -v=6 -debug=flatten_wordgraph,print.c`

Note that since print.c is used to produce certain messages, it currently
must be added to the debug LOCATIONS list, unless you also explicitly
specify the relevant function in print.c (to further restrict the
messages).

### 3) -test=FEATURES (-te=FEATURES)
Enable certain features. These can be debug aids, or new features that
are not yet official or fully-developed.

For example, to automatically show all linkages of a sentence, the
following can be done:

`link-parser -test=auto-next-linkage`

`link-parser` warns when tests are enabled. This way it is possible to see in
the linkage output which tests were enabled. This is particularly important
when examining output files. However, when doing benchmarks (with and without
tests) this is not desired because these warnings skew the timing.
If needed, suppress these warnings with the added special test `@`, as in:
`-test=@,one-step-parse`.

Useful examples
---------------

### -debug=...

1) See the tokens after flattening into the word array used by the parser:

```
echo "Let's test it" | \
link-parser -v=6 -debug=flatten_wordgraph,print_sentence_word_alternatives
```

2) Trace the work of `sane_linkage_morphism()`:

`link-parser -v=8 -debug=sane_linkage_morphism`

3) Same as (2) above, but also see other messages from sane.c:

`link-parser -v=8 -debug=sane.c`

(`sane_linkage_morphism()` happens to be in `sane.c` so this includes its
messages.)

4) Debug the tokenizer:

`link-parser -v=7 -debug=tokenize.c`

Or, in order to display the word array:

`link-parser -v=7 -debug=tokenize.c,print_sentence_word_alternatives`

5) Debug post-processing:

`link-parser -v=9 -debug=post-process.c`

6) Debug expression pruning:

`link-parser -v=9 -debug=expression_prune`

7) Debug reading the affix and knowledge files:

`link-parser -v=11`

### -test=...

1) Automatically show all linkages:

`link-parser -test=auto-next-linkage`

Type some sentences at the **linkparser>** prompt to see its effect.

2) Print more than 1024 linkages in `link-parser` (this is the maximum
`link-parser` would print by default), e.g. 20000:

`link-parser -test=auto-next-linkage:20000`

3) To print detailed linkages of **data/en/corpus-basic.batch**:

```
sed '/^*/d;/^!const/d;/^!batch/d' data/en/corpus-basic.batch | \
link-parser -test=auto-next-linkage
```

(If you cut&paste it to a terminal, remember to escape each of the "**!**"
characters with a backslash.)

This, along with "diff", "grep" etc., can be used in order to validate
that a change didn't cause undesired effects. Special care should be taken
if sentences with more than 1024 linkages are to be verified too (use a
larger `-limit=N` and `-test=auto-next-linkage:M`, when N>>M).

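The comparison step described above can be sketched as a small shell helper.
This is only an illustration: the file names `before.txt` and `after.txt` and
the function name `compare_linkages` are hypothetical, and the two files are
assumed to hold the redirected output of the pipeline shown above, captured
once before and once after the change under test.

```shell
# Sketch only: compare two saved captures of detailed linkages.
# before.txt / after.txt are hypothetical names for files produced by
# redirecting the output of the sed | link-parser pipeline shown above,
# once before and once after the change under test.

# compare_linkages FILE1 FILE2 (hypothetical helper)
# Prints a unified diff of the two captures; the exit status is 0 only
# if the captures are identical.
compare_linkages() {
    diff -u "$1" "$2"
}
```

For example, `compare_linkages before.txt after.txt && echo "no change"`
prints "no change" only when both runs produced the same output.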

Note that this technique is not very effective if the order of the
linkages has changed (or if SAT-parser linkages need to be compared to the
classic-parser linkages). In that case the detailed linkage results need
to be filtered through a script that sorts them according to some
"canonical order" and also removes duplicates.

4) Display the wordgraph using `-wordgraph=N`, optionally using additional
wordgraph-display flags with `-test=wg:FLAGS`.

For more examples of how to use the wordgraph-display, see
[link-grammar/tokenize/README.md](/link-grammar/tokenize/README.md#word-graph-display)
and [msvc/README.md](/msvc/README.md).

5) Test the "trailing connector" hashing for short sentences too (e.g. for
all sentences with more than 10 tokens):

`link-parser -test=min-len-encoding:10`

Or optionally (in order to see relevant debug messages from `preparation.c`):

`link-parser -test=min-len-encoding:10 -v=5 -debug=preparation.c`

6) `-test=<values>` for SAT parser debugging:
- `linkage-disconnected` - Display also solutions which don't have a full linkage.
- `sat-stats` - Display the number of PP-violations and disconnected linkages.
- `no-pp_pruning_1` - Disable a partial CONTAINS_NONE_RULES pruning.

7) `-test=<values>` for the pruning subsystem:
- `len-multi-pruning:N` - Prune per null_count for more than N-token sentences.
- `always-parse` - Don't use a parse shortcut and always fully prune.
- `no-mlink` - Don't prune using an mlink table.

Debugging and STDIO streams
---------------------------
Messages at severity Info and higher (i.e. also Warning, Error and
Fatal) are printed to `stderr`. The other severities
(at Debug and below, i.e. also
Trace and None) are printed to `stdout`.
The rationale is that
debugging messages, in order to be useful, need to appear along with the
regular output of the program, while errors are exceptional and need to
stand out when `link-parser`'s `stdout` is redirected to a file.

The C API includes the ability to set the severity level threshold above
which messages are printed to `stderr` (see
"Improved error notification facility"->"C API" in
[link-grammar/README.md](/link-grammar/README.md)).

Note that when debugging errors during a sentence batch run, it may be useful
to also redirect `stderr` to the same file (the error facility of the library
flushes `stdout` before printing in order to preserve output order).

Using a debugger
----------------

### Configuring for debug

`configure --enable-debug`

It sets the DEBUG definition and removes the optimization flags of the
compiler. The DEBUG definition adds various validity checks, test
messages, and some debug functions (that can be invoked, for example, from
the debugger).


| `gdb` command | Description |
|---------------|-------------|
| <pre>call wordgraph_show(sent, "")</pre> | If something goes wrong, it may be useful to display the wordgraph. The second argument can include wordgraph display options. |
| <pre>call print_all_disjuncts(sent)</pre> | Print the disjuncts. |


FIXME: Document more debug functions.

Compilation definitions
-----------------------
Some debug-related compilation flags can be set using `configure`, or as `make`
arguments:

| Definition | Description |
| ---------- |-------------|
| `NO_SAN_DICT` | Don't use ASAN/UBSAN for dict reading. This causes a vast startup speedup when using ASAN/UBSAN. It is optional because it shouldn't normally be used when debugging the dictionary code. |
| `POOL_ALLOCATOR=0` | Pool allocator debug facility: A fake pool allocator that uses `malloc()` for each allocation is defined, so that ASAN or valgrind can be used to find memory usage bugs. |
| `TRACON_SET_DEBUG` | Print tracon_set stats. |
| `DEBUG_PP_PRUNE` | PP pruning debug printout. |
| `DEBUG_TABLE_STAT` | Print count table stats. |
| `DO_COUNT_TRACE` | Detailed trace of do_count. |
| `DEBUG_X_TABLE` | Print x_table stats. |

### Specific SAT-parser debug
| Definition | Description |
| ---------- |-------------|
| `CONNECTIVITY_DEBUG` | Debug SAT connectivity. |
| `SAT_DEBUG`, `VARS` | Debug variables. |