Lognormalizer
=============

Lognormalizer is a sample tool which is often used to test and debug
rulebases before real use. Nevertheless, it can be used in production as
a simple command line interface to liblognorm.

This tool reads log lines from its standard input and prints results
to standard output. You need to use redirections if you want to read
or write files.

An example of the command::

    $ lognormalizer -r messages.sampdb -e json <messages.log

Command line options
--------------------

::

    -V

Output version information, including information about the installed
version of liblognorm and its optional features. This can also be
used to check the currently installed library version.

::

    -r <FILENAME>

Specifies the name of the file containing the rulebase.

::

    -v

Increase verbosity level. Can be used several times. If used three
times, internal data structures are dumped (makes sense to developers
only).

::

    -p

Print only successfully parsed messages.

::

    -P

Print only messages **not** successfully parsed.

::

    -L

Add line number information to events not successfully parsed. This
is meant as a troubleshooting aid when working with unparsable events,
as the information can be used to go directly to the line in question
in the source data file. The line number is contained in a field
named ``lognormalizer.line_nbr``.

::

    -t <TAG>

Print only those messages which have this tag.

::

    -T

Include the ``event.tags`` attribute when output is in JSON format. This
attribute contains the list of tags of the matched rule.

::

    -E <DATA>

Encoder-specific data. For CSV, it is the list of fields to be output,
separated by comma or space. It is currently unused for other formats.

::

    -d <FILENAME>

Generate a DOT file describing the parse tree. It is used to plot the
parse graph with GraphViz.

::

    -H

At end of run, print a summary line with the number of messages processed,
parsed and unparsed to stdout.

::

    -U

At end of run, print a summary line with the number of messages unparsed to
stdout. Note that this message is only printed if there was at least one
unparsable message.

::

    -o

Special options. The following ones can be set:

    * **allowRegex** Permits the use of regular expressions inside the v1
      engine. This is deprecated and should not be used for new deployments.

    * **addExecPath** Includes metadata in the event on how it was parsed
      (or how parsing was attempted). Can be useful in troubleshooting
      normalization problems.

    * **addOriginalMsg** Always add the "original-msg" data item. By
      default, this is only done when a message could not be parsed.

    * **addRule** Add a mockup of the rule that was processed. Note that
      it is *not* an exact copy of the rule, but a rule that correctly
      describes the parsed message. Most importantly, prefixes are
      appended and custom data types are expanded (and no longer visible
      as such). This option is primarily meant for postprocessing, e.g.
      as input to an anonymizer.

    * **addRuleLocation** For rules that successfully parsed, add the
      location of the rule inside the rulebase. Both the file name and
      the line number are given. If two rules evaluate to the same
      end node, only a single rule location is given. However, in
      practice this is extremely unlikely, so the information can be
      considered reliable.

::

    -s <FILENAME>

At end of run, print internal parse DAG statistics and exit. This
option is meant for developers and researchers who want to get insight
into the quality of the algorithm and/or how efficiently the rulebase can
be processed. **NOT** intended for end users. This option is
performance-intensive.

::

    -S <FILENAME>

Even stronger statistics than ``-s``. Requires that the version is compiled
with ``--enable-advanced-statistics``, which causes a considerable
performance loss.

::

    -x <FILENAME>

Print statistics as a DOT file. In order to keep the graph readable,
information is only emitted for called nodes.

::

    -e <json|xml|csv|raw|cee-syslog>

Output format. By default, output is in JSON format. With this option,
you can change it to a different one.

Supported Output Formats
........................

The JSON, XML, and CSV formats should be self-explanatory.

The cee-syslog format emits messages according to the Mitre CEE spec.
Note that the cee-syslog format is primarily supported for
backward-compatibility. It does **not** support nested data items
and as such cannot be used when the rulebase makes use of this
feature (we assume this most often happens nowadays). We strongly
recommend not using it for new deployments. Support may be removed
in later releases.

The raw format outputs an exact copy of the input message, without
any normalization visible. The prime use case of "raw" is to extract
all messages that either could or could not be normalized. To do so,
specify the ``-p`` or ``-P`` option. It also works in combination with the
``-t`` option to extract a subset based on tagging. In any case, the core
use is to prepare a subset of the original file for further processing.

Examples
--------

These examples were created using the sample rulebase from the source package.

CEE output (``-e cee-syslog``)::

    $ lognormalizer -r rulebases/sample.rulebase -e cee-syslog
    Weight: 42kg
    [cee@115 event.tags="tag2" unit="kg" N="42" fat="free"]
    Snow White and the Seven Dwarfs
    [cee@115 event.tags="tale" company="the Seven Dwarfs"]
    2012-10-11 src=127.0.0.1 dst=88.111.222.19
    [cee@115 dst="88.111.222.19" src="127.0.0.1" date="2012-10-11"]

JSON output, flat tags enabled::

    $ lognormalizer -r rulebases/sample.rulebase -e json -T
    %%
    { "event.tags": [ "tag3", "percent" ], "percent": "100", "part": "wha", "whole": "whale" }
    Weight: 42kg
    { "unit": "kg", "N": "42", "event.tags": [ "tag2" ], "fat": "free" }

CSV output with fixed field list::

    $ lognormalizer -r rulebases/sample.rulebase -e csv -E'N unit'
    Weight: 42kg
    "42","kg"
    Weight: 115lbs
    "115","lbs"
    Anything not matching the rule
    ,

Creating a graph of the rulebase
--------------------------------

To get a better overview of a rulebase you can create a graph that shows you
the chain of normalization (parse-tree).

First, you have to install an additional package called graphviz. Graphviz
is a tool that creates such a graph with the help of a control file (created
with the rulebase). `Here <http://www.graphviz.org/>`_ you will find more
information about graphviz.

To install it you can use the package manager. For example, on Red Hat
systems use the yum command::

    $ sudo yum install graphviz

The next step is to create the control file for graphviz. To do so, run
lognormalizer with the options ``-d`` (preferred filename for the
control file) and ``-r`` (rulebase)::

    $ lognormalizer -d control.dot -r messages.rb

Please note that there is no need for an input or output file.
If you look at the control file now, you will see that the content is
a little bit confusing, but it includes all the information, like the nodes,
fields and parsers, that graphviz needs to create the graph. Of course you
can edit that file, but please note that it is a lot of work.

Now we can create the graph by typing::

    $ dot control.dot -Tpng >graph.png

The command consists of ``dot``, the name of the control file, the ``-T``
option followed by the output file format, and a redirection to the output
file.

That is just one example of using graphviz; of course you can do many
other great things with it. But I think this "simple" graph could be very
helpful when working with the normalizer.

Below you see a sample of such a graph, but please note that this is
not an especially pretty one. Such a graph can grow very fast as you edit
your rulebase.

.. figure:: graph.png
   :width: 90 %
   :alt: graph sample

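
The two steps described in this section (generating the control file, then
rendering it) can be wrapped in a small shell function. The following is only
a minimal sketch and not part of lognormalizer: the function name
``render_rulebase_graph`` and all file names are assumptions made for
illustration. It checks that the control file was actually written instead of
relying on the exit status of ``lognormalizer -d``, which this document does
not specify.

```shell
#!/bin/sh
# Sketch only: wraps the two documented steps (lognormalizer -d, then dot).
# The function name and the file names are illustrative assumptions.

render_rulebase_graph() {
    rulebase=$1    # rulebase file, e.g. messages.rb
    dotfile=$2     # graphviz control file to create, e.g. control.dot
    image=$3       # image file to create, e.g. graph.png

    # Step 1: generate the graphviz control file from the rulebase.
    lognormalizer -d "$dotfile" -r "$rulebase"

    # Check the file itself rather than the exit status of the call above.
    if [ ! -s "$dotfile" ]; then
        echo "no control file written for $rulebase" >&2
        return 1
    fi

    # Step 2: render the control file as a PNG image.
    dot "$dotfile" -Tpng >"$image"
}
```

With the file names used earlier in this section, it could be called as
``render_rulebase_graph messages.rb control.dot graph.png``.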