<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>bogoutil</title><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="refentry"><a name="bogoutil.1"></a><div class="titlepage"></div><div class="refnamediv"><a name="name"></a><h2>Name</h2><p>bogoutil &#8212; Dumps, loads, and maintains
    <span class="application">bogofilter</span> database files</p></div><div class="refsynopsisdiv"><a name="synopsis"></a><h2>Synopsis</h2><div class="cmdsynopsis"><p><code class="command">bogoutil</code>  { -h  |   -V }</p></div><div class="cmdsynopsis"><p><code class="command">bogoutil</code>  [options] { -d <em class="replaceable"><code>file</code></em>  |   -H <em class="replaceable"><code>file</code></em>  |   -l <em class="replaceable"><code>file</code></em>  |   -m <em class="replaceable"><code>file</code></em>  |   -w <em class="replaceable"><code>file</code></em>  |   -p <em class="replaceable"><code>file</code></em> }</p></div><div class="cmdsynopsis"><p><code class="command">bogoutil</code>  { -r <em class="replaceable"><code>file</code></em>  |   -R <em class="replaceable"><code>file</code></em> }</p></div><div class="cmdsynopsis"><p><code class="command">bogoutil</code>  { --db-print-leafpage-count <em class="replaceable"><code>file</code></em>  |   --db-print-pagesize <em class="replaceable"><code>file</code></em>  |   --db-verify <em class="replaceable"><code>file</code></em>  |   --db-checkpoint
    <em class="replaceable"><code>directory</code></em> [flag...]  |   --db-list-logfiles <em class="replaceable"><code>directory</code></em>  |   --db-prune <em class="replaceable"><code>directory</code></em>  |   --db-recover <em class="replaceable"><code>directory</code></em>  |   --db-recover-harder <em class="replaceable"><code>directory</code></em>  |   --db-remove-environment <em class="replaceable"><code>directory</code></em> }</p></div><p>where <code class="option">options</code> is</p><div class="cmdsynopsis"><p><code class="command">bogoutil</code>  [-v] [-n] [-C] [-D] [-a <em class="replaceable"><code>age</code></em>] [-c <em class="replaceable"><code>count</code></em>] [-s <em class="replaceable"><code>min,max</code></em>] [-y <em class="replaceable"><code>date</code></em>] [-I <em class="replaceable"><code>file</code></em>] [-O <em class="replaceable"><code>file</code></em>] [-x <em class="replaceable"><code>flags</code></em>] [--config-file <em class="replaceable"><code>file</code></em>]</p></div></div><div class="refsect1"><a name="description"></a><h2>DESCRIPTION</h2><p><span class="application">Bogoutil</span> is part of the
    <span class="application">bogofilter</span> Bayesian spam filter package.</p><p>It is used to dump and load <span class="application">bogofilter</span>'s
    Berkeley DB databases to and from text files, to perform database maintenance
    functions, and to display the values for specific words.</p></div><div class="refsect1"><a name="options"></a><h2>OPTIONS</h2><p>
    The <code class="option">-d <em class="replaceable"><code>file</code></em></code>
    option tells <span class="application">bogoutil</span> to print
    the contents of the database file to <code class="option">stdout</code>.
</p><p>
    The <code class="option">-H <em class="replaceable"><code>file</code></em></code>
    option tells <span class="application">bogoutil</span> to print
    a histogram of the database file to
    <code class="option">stdout</code>.  The output is similar to that of
    <span class="application">bogofilter -vv</span>.  In addition,
    hapaxes (tokens that were seen only once) and pure tokens
    (tokens that were encountered only in ham or only in
    spam) are counted.
</p><p>
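    As a rough illustration of the two tallies described for
    <code class="option">-H</code>, the following Python sketch counts hapaxes
    and pure tokens. The token table, its (spam, ham) count pairs, and the
    function name are assumptions made for this example; the sketch does not
    read a real database and is not taken from
    <span class="application">bogoutil</span> itself.</p>

```python
def count_hapaxes_and_pure(tokens):
    """Count hapaxes and pure tokens in a {token: (spam, ham)} mapping."""
    hapaxes = 0
    pure = 0
    for spam, ham in tokens.values():
        # A hapax was seen exactly once in total.
        if spam + ham == 1:
            hapaxes += 1
        # A pure token was seen only in spam or only in ham.
        if (spam == 0) != (ham == 0):
            pure += 1
    return hapaxes, pure


# Hypothetical sample data, not real bogofilter output.
sample = {
    "viagra": (12, 0),   # pure: spam only
    "meeting": (0, 7),   # pure: ham only
    "free": (9, 3),      # mixed
    "zyxw": (1, 0),      # hapax, and in this sketch also pure
}
print(count_hapaxes_and_pure(sample))
```

<p>In this sketch a hapax is also counted as a pure token, because a token
    seen only once has necessarily been seen in only one of the two
    classes. Whether the real histogram reports the two figures this way is
    not stated above.</p><p>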
    The <code class="option">-l <em class="replaceable"><code>file</code></em></code>
    option tells <span class="application">bogoutil</span>
    to load the data from <code class="option">stdin</code> into the database file.
    If the database file already exists, the <code class="option">stdin</code> data is
    merged into it, and the counts are added up.
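</p><p>The merge rule ("counts added up") can be sketched in Python as
    follows. The dictionaries stand in for the database and for the data read
    from <code class="option">stdin</code>; they and the function name are
    assumptions made for the example, and dates are left out entirely.</p>

```python
def merge_counts(existing, incoming):
    """Return a new {token: count} dict with the incoming counts added in."""
    merged = dict(existing)
    for token, count in incoming.items():
        # Tokens already present get their counts summed; new ones are added.
        merged[token] = merged.get(token, 0) + count
    return merged


# Hypothetical data standing in for the database and the loaded text.
database = {"free": 4, "meeting": 2}
loaded = {"free": 3, "offer": 1}
print(merge_counts(database, loaded))
```

<p>Tokens that appear only in the loaded data are simply added, and tokens
    that appear only in the existing data are kept unchanged.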
</p><p>The <code class="option">-m <em class="replaceable"><code>file</code></em></code> option tells <span class="application">bogoutil</span>
    to perform maintenance functions on the specified database, that is, to discard tokens
    that are older than desired, whose counts are too small, or whose sizes (lengths)
    are too long or too short.
</p><p>
    The <code class="option">-w <em class="replaceable"><code>file</code></em></code>
    option tells <span class="application">bogoutil</span> to
    display token information from the database file.  The option
    takes an argument, which is either the name of the
    wordlist (usually wordlist.db) or the name of the directory
    containing it.  Tokens can be listed on the command line
    or piped to <span class="application">bogoutil</span>.  When
    there are extra arguments on the command line,
    <span class="application">bogoutil</span> will use them as the
    tokens to look up.  If there are no extra arguments,
    <span class="application">bogoutil</span> will read tokens from
    <code class="option">stdin</code>.
</p><p>
    The <code class="option">-p <em class="replaceable"><code>file</code></em></code>
    option tells <span class="application">bogoutil</span> to
    display the database information for one or more tokens.
    The display includes a probability column with the
    token's spam score (computed using
    <span class="application">bogofilter</span>'s default values).
    Option <code class="option">-p</code> takes the same arguments as
    option <code class="option">-w</code>.
</p><p>The <code class="option">-r <em class="replaceable"><code>file</code></em></code> option tells
    <span class="application">bogoutil</span> to recalculate the ROBX
    value and print it as a six-digit fraction.
</p><p>The <code class="option">-R <em class="replaceable"><code>file</code></em></code>
    option does the same as <code class="option">-r</code>, but saves the
    result in the training database without printing it.
</p><p>The <code class="option">-I <em class="replaceable"><code>file</code></em></code> option tells
    <span class="application">bogoutil</span> to read its input from
    <em class="replaceable"><code>file</code></em> rather than stdin.
</p><p>The <code class="option">-O <em class="replaceable"><code>file</code></em></code> option tells
    <span class="application">bogoutil</span> to write its output to
    <em class="replaceable"><code>file</code></em> rather than stdout.
</p><p>
    The <code class="option">-v</code> option produces verbose output on <code class="option">stderr</code>.
    This option is primarily useful for debugging.
</p><p>The <code class="option">-C</code> option inhibits reading configuration
    files, so that <span class="application">bogoutil</span> runs with the defaults.</p><p>The <code class="option">--config-file
    <em class="replaceable"><code>file</code></em></code> option tells
    <span class="application">bogoutil</span> to read <em class="replaceable"><code>file</code></em>
    instead of the standard configuration file.</p><p>The <code class="option">-D</code> option redirects debug output to stdout (it
    usually goes to stderr).</p><p>The <code class="option">-x <em class="replaceable"><code>flags</code></em></code>
    option sets debugging flags.</p><p>
    Option <code class="option">-n</code> stands for "replace non-ascii characters".
    It replaces characters that have the high bit (0x80) set with question marks.
    This can be useful if a word list has lots of unreadable tokens, for
    example from Asian spam.  The "bad" characters will be converted to
    question marks and matching tokens will be combined when used with
    <code class="option">-m</code> or <code class="option">-l</code>, but not with <code class="option">-d</code>.
</p><p>
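    The effect described for <code class="option">-n</code> can be pictured with
    the following Python sketch: bytes with the high bit set become question
    marks, and tokens that then match have their counts combined. The byte
    strings, the counts, and the function names are assumptions made for the
    example; this is not <span class="application">bogoutil</span>'s own
    code.</p>

```python
from collections import Counter


def replace_non_ascii(token):
    """Replace every byte that has the high bit (0x80) set with '?'."""
    return bytes(b if 0x80 > b else ord("?") for b in token)


def combine(tokens):
    """Apply the replacement and add up the counts of tokens that now match."""
    combined = Counter()
    for token, count in tokens.items():
        combined[replace_non_ascii(token)] += count
    return dict(combined)


# Hypothetical word list with two tokens that differ only in a high-bit byte.
wordlist = {b"caf\xe9": 2, b"caf\xea": 3, b"tea": 1}
print(combine(wordlist))
```

<p>Here the two tokens that differ only in their last byte collapse into a
    single token, which carries the sum of both counts.</p><p>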
    Option <code class="option">-a age</code> indicates an acceptable token age, with older ones being discarded.
    The age can be a date (in the form YYYYMMDD) or a day count, i.e. discard tokens older than
    <code class="option">age</code> days.
</p><p>
    Option <code class="option">-c value</code> indicates that tokens with counts less than or equal to <code class="option">value</code>
    are to be discarded.
</p><p>
    Option <code class="option">-s min,max</code> is used to discard tokens based on their size, i.e. length.
    All tokens shorter than <code class="option">min</code> or longer than <code class="option">max</code> will be discarded.
</p><p>
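    The three discard rules given for <code class="option">-a</code>,
    <code class="option">-c</code>, and <code class="option">-s</code> can be
    sketched in Python as follows. The function names, the sample tokens, and
    the way an age argument is told apart from a day count (eight digits means
    a date) are assumptions made for the example; the real parser may decide
    differently.</p>

```python
from datetime import date, timedelta


def cutoff_date(age, today):
    """Turn an age argument into the oldest acceptable date.

    Assumption: an 8-character value is a YYYYMMDD date and anything
    else is a day count.
    """
    if len(age) == 8:
        return date(int(age[:4]), int(age[4:6]), int(age[6:]))
    return today - timedelta(days=int(age))


def keep(token, count, stamp, oldest=None, min_count=None, size=None):
    """Return True if a token survives the age, count, and size checks."""
    if oldest is not None and oldest > stamp:
        return False            # older than the acceptable age
    if min_count is not None and min_count >= count:
        return False            # count less than or equal to the limit
    if size is not None:
        shortest, longest = size
        if shortest > len(token) or len(token) > longest:
            return False        # shorter than min or longer than max
    return True


# Hypothetical tokens: (word, count, date last seen).
today = date(2024, 6, 30)
words = [
    ("free", 5, date(2024, 6, 1)),
    ("x", 9, date(2024, 6, 1)),       # too short
    ("stale", 9, date(2024, 1, 1)),   # too old
    ("rare", 1, date(2024, 6, 1)),    # count too small
]
oldest = cutoff_date("30", today)
kept = [w for w, c, d in words if keep(w, c, d, oldest, 1, (3, 20))]
print(kept)
```

<p>The sample corresponds to an age of 30 days, a count limit of 1, and a
    size range of 3,20; only the first token passes all three checks.</p><p>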
    Option <code class="option">-y date</code> specifies the date to
    give to tokens that don't have dates.  The format is YYYYMMDD.
</p><p>The <code class="option">-h</code> option prints the help message and exits.</p><p>The <code class="option">-V</code> option prints the version number and exits.</p></div><div class="refsect1"><a name="environment_maintenance"></a><h2>ENVIRONMENT MAINTENANCE</h2><p>The <code class="option">--db-checkpoint <em class="replaceable"><code>dir</code></em></code>
    option causes <span class="application">bogoutil</span> to flush the buffer
    caches and checkpoint the database environment.</p><p>The <code class="option">--db-list-logfiles
    <em class="replaceable"><code>dir</code></em></code>
    option causes <span class="application">bogoutil</span> to list the log
    files in the environment.  Zero or more keywords can be added or
    combined (separated by whitespace) to modify the behavior of this
    mode. The default behavior is to list only inactive log
    files with relative paths. You can add <code class="option">all</code>
    to list all log files (inactive and active). You can add
    <code class="option">absolute</code> to switch the listing to absolute
    paths.
</p><p>The <code class="option">--db-prune <em class="replaceable"><code>dir</code></em></code>
    option causes <span class="application">bogoutil</span> to checkpoint
    the database environment and remove inactive log files.</p><p>The <code class="option">--db-recover <em class="replaceable"><code>dir</code></em></code>
    option runs a regular database recovery
    in the specified database directory. If that fails, it will retry
    with a (usually slower) catastrophic database recovery. If
    that fails, too, your database cannot be repaired and must
    be rebuilt from scratch.
    This is only supported when compiled with Berkeley DB
    support with transactions enabled. Trying recovery with QDBM or SQLite3 support will
    result in an error.</p><p>The <code class="option">--db-recover-harder <em class="replaceable"><code>dir</code></em></code>
    option runs a catastrophic database
    recovery in the specified database directory. If that fails,
    your database cannot be repaired and must be rebuilt from
    scratch.
    This is only supported when compiled with Berkeley DB
    support with transactions enabled. Trying recovery with QDBM or SQLite3 support will
    result in an error.</p><p>The <code class="option">--db-remove-environment
    <em class="replaceable"><code>directory</code></em></code> option has
    no short option equivalent. It runs recovery in the given
    directory and then removes the database environment. Use
    this <span class="emphasis"><em>before</em></span> upgrading to a new Berkeley
    DB version if the new version to be installed requires a log
    file format update.</p><p>The <code class="option">--db-print-leafpage-count
    <em class="replaceable"><code>file</code></em></code> option prints
    the number of leaf pages in the database file
    <em class="replaceable"><code>file</code></em> as a decimal number, or
    UNKNOWN if the database does not support querying this
    figure.</p><p>The <code class="option">--db-print-pagesize
    <em class="replaceable"><code>file</code></em></code> option prints
    the size of a database page in
    <em class="replaceable"><code>file</code></em> as a decimal number, or
    UNKNOWN for databases with variable page size or databases
    that do not allow a query of the database page size.</p><p>
    The <code class="option">--db-verify <em class="replaceable"><code>file</code></em></code>
    option requests that <span class="application">bogoutil</span> verify
    the database file.  It prints only errors, unless in verbose mode.
</p></div><div class="refsect1"><a name="dataformat"></a><h2>DATA FORMAT</h2><p>
    <span class="application">Bogoutil</span> reads and writes text files where each nonblank
    line consists of a word, any amount of horizontal whitespace, a numeric word count,
    more whitespace, and (optionally) a date in the form YYYYMMDD.
    Blank lines are skipped.
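</p><p>A minimal Python sketch of a reader for this line format is shown
    here. The function name and the returned tuple are assumptions made for
    the example, and the sketch further assumes that a word contains no
    whitespace; it is not <span class="application">bogoutil</span>'s own
    parser.</p>

```python
def parse_line(line):
    """Parse one text line into (word, count, date-or-None).

    Returns None for a blank line, which the format says is skipped.
    """
    fields = line.split()
    if not fields:
        return None
    if len(fields) not in (2, 3):
        raise ValueError("expected word, count, and optional date: %r" % line)
    word = fields[0]
    count = int(fields[1])
    stamp = fields[2] if len(fields) == 3 else None
    # The optional date must be eight digits, i.e. YYYYMMDD.
    if stamp is not None and not (len(stamp) == 8 and stamp.isdigit()):
        raise ValueError("date must be YYYYMMDD: %r" % line)
    return word, count, stamp


# Hypothetical sample lines in the format described above.
for text in ["free \t 12   20240115\n", "offer 3\n", "   \n"]:
    print(parse_line(text))
```

<p>Splitting on runs of whitespace accepts "any amount" of separation
    between the fields, and the date stays a string because only its shape
    is checked.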
</p></div><div class="refsect1"><a name="returns"></a><h2>RETURN VALUES</h2><p>
    0 for successful operation.
    1 for most errors.
    3 for I/O or other errors.
    A return value of 3 usually means that something is seriously wrong with the database files.
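</p><p>A calling script might interpret these values along the following
    lines. This is a hedged Python sketch: the helper names and the message
    texts are made up for the example, and <code class="option">dump</code>
    only works where <span class="application">bogoutil</span> is
    installed.</p>

```python
import subprocess

# Meanings taken from the list of return values; the wording is ours.
MEANINGS = {
    0: "successful operation",
    1: "error",
    3: "I/O or other error; the database files may be damaged",
}


def describe(returncode):
    """Map a return value to a short description."""
    return MEANINGS.get(returncode, "undocumented return value %d" % returncode)


def dump(wordlist):
    """Run `bogoutil -d wordlist`; return (returncode, description, raw output)."""
    result = subprocess.run(["bogoutil", "-d", wordlist], capture_output=True)
    return result.returncode, describe(result.returncode), result.stdout


print(describe(3))
```

<p>The output of <code class="option">dump</code> is kept as raw bytes,
    since tokens are not guaranteed to be valid text in any one encoding.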
</p></div><div class="refsect1"><a name="author"></a><h2>AUTHOR</h2><p>Gyepi Sam <code class="email">&lt;<a class="email" href="mailto:gyepi@praxis-sw.com">gyepi@praxis-sw.com</a>&gt;</code>.</p><p>Matthias Andree <code class="email">&lt;<a class="email" href="mailto:matthias.andree@gmx.de">matthias.andree@gmx.de</a>&gt;</code>.</p><p>David Relson <code class="email">&lt;<a class="email" href="mailto:relson@osagesoftware.com">relson@osagesoftware.com</a>&gt;</code>.</p><p>
      For updates, see <a class="ulink" href="http://bogofilter.sourceforge.net/" target="_top">
  the bogofilter project page</a>.
  </p></div><div class="refsect1"><a name="also"></a><h2>SEE ALSO </h2><p>bogofilter(1), bogolexer(1), bogotune(1), bogoupgrade(1)</p></div></div></body></html>