DNS.README

The Webalizer - A log file analysis program  -- DNS information

The Webalizer has the ability to perform reverse DNS lookups, and
fully supports both IPv4 and IPv6 addressing schemes.  This document
attempts to explain how it works, and some things that you should be
aware of when using the DNS lookup features.

Note: The Reverse DNS feature may be enabled or disabled at compile
      time.  DNS lookup code is enabled by default.  You can run The
      Webalizer using the '-vV' command line options to determine what
      options are enabled in the version you are using.


How it works
------------

DNS lookups are made against a DNS cache file containing IP addresses
and resolved names.  If the IP address is not found in the cache file,
it will be left as an IP address.  In order for lookups to be done, a
cache file MUST be specified when the Webalizer is run, either using
the '-D' command line switch, or a "DNSCache" configuration file
keyword.  If no cache file is specified, no attempts to perform DNS
lookups will be made.  The cache file can be made three different ways.

1) You can have the Webalizer pre-process the specified log file at
   run-time, creating the cache file before processing the log file
   normally.  This is done by setting the number of DNS Children
   processes to run, either by using the '-N' command line switch or
   the "DNSChildren" configuration keyword.  This will cause the
   Webalizer to spawn the specified number of processes which will
   be used to do reverse DNS lookups.  Generally, a larger number
   of processes will result in faster resolution of the log; however,
   a value set too high may cause overall system degradation.  A
   setting of between 5 and 20 should be acceptable, and there is a
   maximum limit of 100.  If used, a cache filename MUST also be
   specified, using either the '-D' command line switch or the
   "DNSCache" configuration keyword.  Using this method, normal
   processing will continue only after all IP addresses have been
   processed and the cache file is created/updated.

2) You can pre-process the log file as a standalone process, creating
   the cache file that will be used later by the Webalizer.  This is
   done by running the Webalizer with a name of 'webazolver' (i.e. the
   name 'webazolver' is a symbolic link to 'webalizer') and specifying
   the cache filename (either with '-D' or DNSCache).  If the number
   of child processes is not given, the default of 5 will be used.  In
   this mode, the log will be read and processed, creating a DNS cache
   file or updating an existing one, and the program will then exit
   without any further processing.

3) You can use The Webalizer (DNS) Cache file Manager program 'wcmgr'
   to create and manipulate a cache file.  A blank cache file can be
   created which would be later populated, or data for the cache file
   can be imported using tab delimited text files.  See the wcmgr(1)
   man page for usage information.


Run-time DNS cache file creation/update
---------------------------------------

The creation/update of a DNS cache file at run-time occurs as follows:

1) The log file is read, creating a list of all IP addresses that are
   not already cached (or cached but expired) and need to be resolved.
   Addresses are expired based on the TTL value specified using the
   'CacheTTL' configuration option or after 7 days (default) if no TTL
   is specified.

2) The specified number of child processes are forked, and are used
   to perform DNS lookups.

3) Each IP address is given, one at a time, to the next available child
   process until all IP addresses have been processed.  Each child will
   update the cache file when a result is returned.  This may be either
   a resolved name or a failed lookup, in which case the address will be
   left unresolved.  Unresolved addresses are not normally cached, but
   can be, if enabled using the 'CacheIPs' configuration file keyword.

4) Once all IP addresses have been processed and the cache file updated,
   the Webalizer will process the log normally.  Each record it finds
   that has an unresolved IP address will be looked up in the cache file
   to see if a hostname is available (i.e. was previously found).

Because there may be a significant amount of time between building the
initial unresolved IP list and normal processing, the Webalizer should
not be run against live log files (i.e. a log file that is actively
being written to by a server), otherwise there may be additional records
present that were not resolved.
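The four steps above can be sketched roughly as follows.  This is an
illustrative model only, not The Webalizer's actual C code: the cache is
modeled as a plain dictionary, threads stand in for the forked child
processes, and the names needs_lookup, update_cache and the 'resolve'
callback are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

DAY = 86400  # seconds per day


def needs_lookup(cache, ip, now, ttl_days=7):
    """True if ip is not cached, or is cached but older than the TTL."""
    entry = cache.get(ip)
    if entry is None:
        return True
    stamp, _name = entry
    if stamp == 0:  # a zero timestamp marks a permanent entry
        return False
    return now - stamp > ttl_days * DAY


def update_cache(cache, log_ips, resolve, children=5, now=None, cache_ips=False):
    """Resolve uncached/expired addresses and store the results in cache.

    resolve(ip) is assumed to return a hostname, or None on failure.
    Returns the list of addresses that were handed to the workers.
    """
    now = time.time() if now is None else now
    # Step 1: build the list of addresses that need to be resolved.
    pending = sorted({ip for ip in log_ips if needs_lookup(cache, ip, now)})
    # Steps 2 and 3: a fixed pool of workers takes one address at a time.
    with ThreadPoolExecutor(max_workers=children) as pool:
        for ip, name in zip(pending, pool.map(resolve, pending)):
            if name is not None:
                cache[ip] = (now, name)
            elif cache_ips:  # optionally remember failed lookups too
                cache[ip] = (now, ip)
    # Step 4 (normal log processing) would start only after this returns.
    return pending
```

Passing a stub 'resolve' function makes the flow easy to try without
touching a real name server.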


Stand-Alone DNS cache file creation/update
------------------------------------------

The creation/update of the DNS cache file, when run in stand-alone mode,
occurs as follows:

1) The log file is read, creating a list of all IP addresses that are
   not already cached (or cached but expired) and need to be resolved.

2) The specified number of child processes are forked, and are used
   to perform DNS lookups.  If the number of processes was not specified,
   the default of 5 will be used.

3) Each IP address is given, one at a time, to the next available child
   process until all IP addresses have been processed.  Each child will
   update the cache file when a result is returned.

4) Once all IP addresses have been processed and the cache file updated,
   the program will terminate without any further processing.


Larger sites may prefer to use a stand-alone process to create the DNS
cache file, and then run the Webalizer against the cache file.  This
allows a single cache file to be used for many virtual hosts, and reduces
the processing needed if many sites are being processed.  The Webalizer
can be used in stand-alone mode by running it as 'webazolver'.  When
run in this fashion, it will only create the cache file and then exit
without any further processing.  A cache filename MUST be specified;
however, unlike when running the Webalizer normally, the number of child
processes does not have to be given (it will default to 5).  All normal
configuration and command line options are recognized, although many
of them will simply be ignored.  This allows the use of a standard
configuration file for both normal use and stand-alone use.


Examples:
---------

webalizer -c test.conf -N 10 -D dns_cache.db /var/log/my_www_log

   This will use the configuration file 'test.conf' to obtain normal
   configuration options such as hostname and output directory.  It
   will then either create or update the file 'dns_cache.db' in the
   default output directory (using 10 child processes) based on the
   IP addresses it finds in the log /var/log/my_www_log, and then
   process that log file normally.


webalizer -o out -D dns_cache.db /var/log/my_www_log

  This will process the log file /var/log/my_www_log, resolving IP
  addresses from the cache file 'dns_cache.db' found in the output
  directory "out".  The cache file must be present as it will not be
  created with this command.


for i in /var/log/*/access_log; do
  webazolver -N 20 -D /var/lib/dns_cache.db "$i"
done

  The above is an example of how to run through multiple log files
  creating a single DNS cache file.  This might typically be used on
  a larger site that has many virtual hosts, each keeping its log
  files in a separate directory.  It will process each access_log it
  finds in /var/log/* and create a cache file (/var/lib/dns_cache.db).
  This cache file can then be used to process the logs normally
  with the Webalizer in a read-only fashion (see next example).


for i in /etc/webalizer/*.conf; do webalizer -c "$i" -D /etc/cache.db; done

  This will process each configuration file found in /etc/webalizer,
  using the DNS cache file /etc/cache.db.  This will also typically be
  used on a larger site with multiple hosts.  Each configuration file
  will specify a site specific log file, hostname, output directory, etc.
  The cache file used will typically be created using a command similar
  to the one previous to this example.


Cache File Maintenance
----------------------

The Webalizer DNS cache files generally require very little or no
special attention.  There are times though when some maintenance
is required, such as occasional purging of very old cache entries.
The Webalizer never removes a record once it's inserted into the
cache.  If a record expires based on its timestamp, the next time
that address is seen in a log, its name is looked up again and the
timestamp is updated.  However, there will always be addresses that
are never seen again, which will cause the cache files to continue
to grow in size over time.  On extremely busy sites or sites that
attract many one time visitors, the cache file may grow extremely
large, yet only contain a small number of valid entries.  Using
The Webalizer (DNS) Cache file Manager ('wcmgr'), cache files can
be purged, removing expired entries and shrinking the file size.
A TTL (time to live) value can be specified, so the length of time
an entry remains in the cache can be varied depending on individual
site requirements.  In addition to purging cache files, 'wcmgr' can
also be used to list cache file contents, import/export cache data,
lookup/add/delete individual entries and gather overall statistics
regarding the cache file (number of records, number expired, etc.).

To purge a cache file using 'wcmgr', an example command would be:

wcmgr -p31 /path/to/dns.cache

This would purge the 'dns.cache' cache file of any records that are
over 31 days old, and would reclaim the space that those records
were using in the file.  If you would like to see the records that
get purged, adding the command line option '-v' (verbose) will cause
the program to print each entry and its age as they are removed.
You can also use 'wcmgr' to display statistics on cache files
to aid in determining when a cache file should be purged.  See the
'wcmgr' man page (wcmgr.1) for additional information on the various
options available.
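The effect of an age-based purge can be modeled in a few lines.  The
sketch below is not how 'wcmgr' is implemented; it uses a dictionary in
place of the cache file, the function name purge is hypothetical, and it
assumes that entries with a zero timestamp (permanent entries) are never
purged.

```python
DAY = 86400  # seconds per day


def purge(cache, max_age_days, now):
    """Split cache into (kept, removed) based on entry age in days.

    cache maps ip -> (timestamp, name).  Entries with a timestamp of 0
    are treated as permanent (an assumption of this sketch).
    """
    kept = {}
    removed = []
    for ip, (stamp, name) in cache.items():
        age_days = (now - stamp) / DAY
        if stamp != 0 and age_days > max_age_days:
            # Similar in spirit to verbose mode: record entry and its age.
            removed.append((ip, name, int(age_days)))
        else:
            kept[ip] = (stamp, name)
    return kept, removed
```

Calling purge(cache, 31, now) corresponds to the '-p31' example above:
anything last updated more than 31 days before 'now' is dropped.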


Stupid Cache Tricks
-------------------

The DNS cache files used by The Webalizer allow for efficient IP address
to name translations.  Resolved names are normally generated by using an
existing DNS name server to query the address, either locally or over
the Internet.  However, using The Webalizer (DNS) Cache file Manager,
almost any IP address to name translation can be included in the cache.
One such example would be for mapping local network addresses to real
names, even though those addresses may not have real DNS entries on the
network (or may be 'local' addresses prohibited from use on the Internet).
A simple tab delimited text file can be created and imported into a cache
for use by The Webalizer, which will then be used to convert the local
IP addresses to real names.  Additional configuration options for The
Webalizer can then be used as they normally would be.  For example,
consider a small business with 10 computers and a DSL router to the
Internet.  Each machine on the local network would use a private IP
address that would not be resolved using an external (public) DNS server,
so would always be reported by The Webalizer as 'unknown/unresolved'.  A
simple cache file could be created to map those unresolved addresses into
more meaningful names, which could then be further processed by the
Webalizer.  An example might look something like:

# Local machines
192.168.123.254	0	0	gw.widgetsareus.lan
192.168.123.253	0	0	mail.widgetsareus.lan
192.168.123.250	0	0	sales.widgetsareus.lan
192.168.123.240	0	0	service.widgetsareus.lan
192.168.123.237	0	0	mgr.widgetsareus.lan
192.168.123.235	0	0	support1.widgetsareus.lan
192.168.123.234	0	0	support2.widgetsareus.lan
192.168.123.232	0	0	pres.widgetsareus.lan
192.168.123.230	0	0	vp.widgetsareus.lan
192.168.123.225	0	0	reception.widgetsareus.lan
192.168.123.224	0	0	finance.widgetsareus.lan
127.0.0.1	0	1	127.0.0.1


There are a few things here that should be noted.  The first
is that the timestamps (first zero on each line above) are set to
zero.  This tells The Webalizer that these cached entries are to
be considered 'permanent', and should never be expired (infinite
TTL or time to live).  The second thing to note is that the resolved
names are using a non-standard TLD (top level domain) of '.lan'.
The Webalizer will map this special TLD to mean "Local Network" in
its reports, which allows local traffic to be grouped separately
from normal Internet traffic.  Lastly, you may notice that the
last line of the file contains an entry with the same IP address
where a name should be.  This entry will prevent the Webalizer
from ever trying to look up 127.0.0.1, which is the 'localhost'
address, when it is found in a log.  The second number after the IP
address (1) tells the Webalizer that it is an unresolved entry, not
a resolved hostname (i.e. it has no name).  Entries such as this one
can be used to reduce DNS lookups on addresses that are known not to
resolve.
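A file in the layout shown above (IP address, timestamp, unresolved
flag and name, separated by tabs) is easy to generate from a script.
The following is a small sketch; the helper names cache_line and
build_import are hypothetical, and the result is only the text to be
imported with 'wcmgr', not a cache file itself.

```python
def cache_line(ip, name=None, stamp=0):
    """Format one entry: IP <TAB> timestamp <TAB> unresolved flag <TAB> name.

    With no name, the entry is flagged as unresolved (1) and the IP
    address is repeated in the name column, as in the 127.0.0.1 example.
    """
    unresolved = 1 if name is None else 0
    shown = ip if name is None else name
    return "\t".join([ip, str(stamp), str(unresolved), shown])


def build_import(hosts, never_resolve=()):
    """Return the text of a tab delimited import file.

    hosts maps IP address -> name; never_resolve lists addresses that
    should be stored as permanently unresolved.
    """
    lines = ["# Local machines"]
    lines += [cache_line(ip, name) for ip, name in hosts.items()]
    lines += [cache_line(ip) for ip in never_resolve]
    return "\n".join(lines) + "\n"
```

The default timestamp of 0 makes every generated entry permanent, as
described above.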


Considerations
--------------

Processing of live log files is discouraged, as any log records written
between the time of DNS resolution and normal processing will not have
been resolved.

If you are using STDIN for the input stream (log file) and have run-time
DNS cache file creation/update enabled, the program will exit after the
cache file has been created/updated and no output will be produced.  If
you must use STDIN for the input log, you will need to process the stream
twice, once to create/update the cache file, and again to produce the
reports.  The reason for this is that stream inputs from STDIN cannot
be 'rewound' to the beginning like files can, so must be given twice.

Cached DNS addresses have a default TTL (time to live) of 7 days.  This
may now be changed using the CacheTTL config file keyword to any value
from 1 to 100 (days).  You may also now specify if unresolved addresses
should be stored in the DNS cache.  Normally, unresolved IP addresses
are NOT saved in the cache and are looked up each time the program is
run.

There is an absolute maximum of 100 child processes that may be created,
however the actual number of children should be significantly less than
the maximum.  Typical usage should be between 5 and 20.

Special thanks to Henning P. Schmiedehausen <hps@tanstaafl.de> for the
original dns-resolver code he submitted, which was the basis for this
implementation.  Also thanks to Jose Carlos Medeiros for the initial IPv6
support code.

README

The Webalizer - A web server log file analysis tool
Copyright 1997-2013 by Bradford L. Barrett

Distributed under the GNU GPL.  See the files "COPYING" and
"Copyright" supplied with the distribution for additional info.


What is The Webalizer?
----------------------

The Webalizer is a web server log file analysis program which produces
usage statistics in HTML format for viewing with a browser.  The results
are presented in both columnar and graphical format, which facilitates
interpretation.  Yearly, monthly, daily and hourly usage statistics are
presented, along with the ability to display usage by site, URL, referrer,
user agent (browser), search string, entry/exit page, username and country
(some information is only available if supported and present in the log
files being processed).  Processed data may also be exported into most
database and spreadsheet programs that support tab delimited data formats.

The Webalizer supports CLF (common log format) log files, as well as
Combined log formats as defined by NCSA and others, and variations
of these which it attempts to handle intelligently.  In addition, The
Webalizer supports wu-ftpd xferlog (FTP) formatted logs, squid proxy logs
and W3C extended format logs.

Gzip compressed logs may be used as input directly.  Any log filename
that ends with a '.gz' extension will be assumed to be in gzip format and
uncompressed on the fly as it is being read.  The Webalizer now also has
the ability to handle BZip2 compressed logs, if enabled at compile time.
Similar to gzipped logs, any log filename that ends with a '.bz2' will be
assumed to be in bzip2 format and uncompressed on the fly as it is being
read.

For sites that do not enable hostname lookups (DNS resolution) on their
web servers (and have only IP addresses in their logs), The Webalizer
provides its own internal DNS lookup capability as well as geolocation
services (GeoDB).  The optional GeoIP library from MaxMind Inc. is also
supported and may be used instead of the native GeoDB database.

A utility program, "The Webalizer (DNS) Cache file Manager", or 'wcmgr',
is also provided which allows the creation and manipulation of the DNS
cache files used and produced by The Webalizer.  See the file DNS.README
for additional information regarding DNS support.

This documentation applies to The Webalizer Version 2.23

Running the Webalizer
---------------------

The Webalizer was designed to be run from a Unix command line prompt or
as a cron job.  There are several command line options which will modify
the results it produces, and configuration files can be used as well.
The format of the command line is:

webalizer [options ...] [log-file]

Where 'options' can be one or more of the supported command line
switches described below.  'log-file' is the name of the log file
to process (see below for more detailed information).  If a dash
("-") is specified for the log-file name, STDIN will be used.


Once executed, the general flow of the program follows:

o A default configuration file is scanned for.  A file named
  'webalizer.conf' is searched for in the current directory, and if
  found, its configuration data is parsed.  If the file is not
  present in the current directory, the file '/etc/webalizer.conf'
  is searched for and, if found, is used instead.

o Any command line arguments given to the program are parsed.  This
  may include the specification of a configuration file, which is
  processed at the time it is encountered.

o If a log file was specified, it is opened and made ready for
  processing.  If no log file was given, or the filename '-' is
  specified on the command line, STDIN is used for input.

o If an output directory was specified, the program does a 'chdir' to
  that directory in preparation for generating output.  If no output
  directory was given, the current directory is used.

o If a non-zero number of DNS Children processes were specified, they
  will be started, and the specified log file will be processed,
  either creating or updating the specified DNS cache file.

o If no hostname was given, the program attempts to get the hostname
  using a uname system call.  If that fails, 'localhost' is used.

o A history file is searched for.  This file keeps previous month
  totals used on the main index.html page.  The default file is
  named 'webalizer.hist', kept in the specified output directory,
  however may be changed using the "HistoryName" configuration file
  keyword.

o If incremental processing was specified, a data file is searched for
  and loaded if found, containing the 'internal state' data of the
  program at the end of a previous run.  The default file is named
  'webalizer.current', kept in the specified output directory, however
  may be changed using the "IncrementalName" configuration file keyword.

o Main processing begins on the log file.  If the log spans multiple
  months, a separate HTML document is created for each month.

o After main processing, the main 'index.html' page is created, which
  has totals by month and links to each month's HTML document.

o A new history file is saved to disk, which includes totals generated
  by The Webalizer during the current run.

o If incremental processing was specified, a data file is written that
  contains the 'internal state' data at the end of this run.
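The first step of the flow above, the search for a default configuration
file, amounts to a short ordered lookup.  The sketch below only
illustrates that search order; the function name find_default_config and
its 'exists' parameter are hypothetical and are not part of The
Webalizer.

```python
import os


def find_default_config(cwd=".", exists=os.path.isfile):
    """Return the default configuration file to use, or None.

    'webalizer.conf' in the current directory is preferred; if it is
    not present, '/etc/webalizer.conf' is tried instead.
    """
    candidates = (
        os.path.join(cwd, "webalizer.conf"),
        "/etc/webalizer.conf",
    )
    for path in candidates:
        if exists(path):
            return path
    return None
```

A configuration file named on the command line is handled separately,
when the command line arguments are parsed.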


Incremental Processing
----------------------

Version 1.2x of The Webalizer adds incremental run capability.  Simply
put, this allows processing large log files by breaking them up into
smaller pieces, and processing these pieces instead.  What this means
in real terms is that you can now rotate your log files as often as you
want, and still be able to produce monthly usage statistics without the
loss of any detail.  This is accomplished by saving and restoring all
relevant internal data to a disk file between runs.  Doing so allows the
program to 'start where it left off' so to speak, and allows the
preservation of detail from one run to the next.

Some special precautions need to be taken when using the incremental
run capability of The Webalizer.  Configuration options should not be
changed between runs, as that could cause corruption of the internal
stored data.  For example, changing the MangleAgents level will cause
different representations of user agents to be stored, producing invalid
results in the user agents section of the report.  If you need to change
configuration options, do it at the end of the month after normal
processing of the previous month and before processing the current month.
You may also want to delete the 'webalizer.current' file as well (or
whatever name was specified using the "IncrementalName" configuration
option).

The Webalizer also attempts to prevent data duplication by keeping
track of the timestamp of the last record processed.  This timestamp
is then compared to current records being processed, and any records
that were logged previous to that timestamp are ignored.  This, in
theory, should allow you to re-process logs that have already been
processed, or process logs that contain a mix of processed/not yet
processed records, and not produce duplication of statistics.  The
only time this may break is if you have duplicate timestamps in two
separate log files... any records in the second log file that do have
the same timestamp as the last record in the previous log file processed,
will be discarded as if they had already been processed.  There are
lots of ways to prevent this however, for example, stopping the web
server before rotating logs will prevent this situation.  This setup
also necessitates that you always process logs in chronological order,
otherwise data loss will occur as a result of the timestamp compare.
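The timestamp comparison described above can be modeled as a simple
filter.  This is a sketch of the idea only, with a hypothetical function
name (filter_new) and records reduced to (timestamp, line) pairs; it is
not the program's actual code.

```python
def filter_new(records, last_stamp):
    """Drop records already covered by a previous run.

    records is an iterable of (timestamp, line) pairs in chronological
    order; last_stamp is the timestamp of the last record processed in
    the previous run, or None on a first run.  Returns the records to
    process and the timestamp to save for the next run.
    """
    kept = []
    for stamp, line in records:
        # Records at or before the saved timestamp are treated as
        # already processed and are ignored.
        if last_stamp is not None and stamp <= last_stamp:
            continue
        kept.append((stamp, line))
    new_last = kept[-1][0] if kept else last_stamp
    return kept, new_last
```

The 'stamp <= last_stamp' test is what causes the duplicate timestamp
problem mentioned above: a record in the next log file that carries the
same timestamp as the last record of the previous file is discarded.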


Output Produced
---------------

The Webalizer produces several reports (html) and graphics for each
month processed.  In addition, a summary page is generated for the
current and previous months (up to 12), a history file is created
and, if incremental mode is used, the current month's processed data
is saved.  The exact location and names of these files can be changed
using configuration files and command line options.  The files produced
(default names) are:

index.html              - Main summary page (extension may be changed)
usage.png               - Yearly graph displayed on the main index page
usage_YYYYMM.html       - Monthly summary page (extension may be changed)
usage_YYYYMM.png        - Monthly usage graph for specified month/year
daily_usage_YYYYMM.png  - Daily usage graph for specified month/year
hourly_usage_YYYYMM.png - Hourly usage graph for specified month/year
site_YYYYMM.html        - All sites listing (if enabled)
url_YYYYMM.html         - All urls listing (if enabled)
ref_YYYYMM.html         - All referrers listing (if enabled)
agent_YYYYMM.html       - All user agents listing (if enabled)
search_YYYYMM.html      - All search strings listing (if enabled)
webalizer.hist          - Previous month history (may be changed)
webalizer.current       - Incremental Data (may be changed)
site_YYYYMM.tab         - tab delimited sites file
url_YYYYMM.tab          - tab delimited urls file
ref_YYYYMM.tab          - tab delimited referrers file
agent_YYYYMM.tab        - tab delimited user agents file
user_YYYYMM.tab         - tab delimited usernames file
search_YYYYMM.tab       - tab delimited search string file

The yearly (index) report shows statistics for a 12 month period, and
links to each month.  The monthly report has detailed statistics for
that month with additional links to any URLs and referrers found.
The various totals shown are explained below.

Hits

  Any request made to the server which is logged is considered a 'hit'.
The requests can be for anything... html pages, graphic images, audio
files, CGI scripts, etc...  Each valid line in the server log is
counted as a hit.  This number represents the total number of requests
that were made to the server during the specified report period.

Files

  Some requests made to the server require that the server then send
something back to the requesting client, such as an HTML page or graphic
image.  When this happens, it is considered a 'file' and the files
total is incremented.  The relationship between 'hits' and 'files' can
be thought of as 'incoming requests' and 'outgoing responses'.

Pages

  Pages are, well, pages!  Generally, any HTML document, or anything
that generates an HTML document, would be considered a page.  This
does not include the other stuff that goes into a document, such as
graphic images, audio clips, etc...  This number represents the number
of 'pages' requested only, and does not include the other 'stuff' that
is in the page.  What actually constitutes a 'page' can vary from
server to server.  The default action is to treat anything with the
extension '.htm', '.html' or '.cgi' as a page.  A lot of sites will
probably define other extensions, such as '.phtml', '.php3' and '.pl'
as pages as well.  Some people consider this number as the number of
'pure' hits... I'm not sure if I totally agree with that viewpoint.
Some other programs (and people :) refer to this as 'Pageviews'.
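As a rough illustration of the default rule, a request can be classified
as a page by looking at the extension of its URL.  The sketch below is a
simplification with hypothetical names (is_page, DEFAULT_PAGE_TYPES); it
assumes a case-insensitive exact match on the extension and does not
count URLs without an extension, which may differ from what The
Webalizer actually does.

```python
from urllib.parse import urlsplit

# Default extensions named above; sites may add others such as 'php3'.
DEFAULT_PAGE_TYPES = ("htm", "html", "cgi")


def is_page(url, page_types=DEFAULT_PAGE_TYPES):
    """True if the URL's file extension is one of the page types."""
    path = urlsplit(url).path          # drop any query string
    name = path.rsplit("/", 1)[-1]     # last path component
    if "." not in name:
        return False
    ext = name.rsplit(".", 1)[-1].lower()
    return ext in page_types
```

For example, is_page("/img/logo.png") is false, while adding "php3" to
the page types makes is_page("/app.php3") true.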

Sites

  Each request made to the server comes from a unique 'site', which can
be referenced by a name or ultimately, an IP address.  The 'sites'
number shows how many unique IP addresses made requests to the server
during the reporting time period.  This DOES NOT mean the number of
unique individual users (real people) that visited, which is impossible
to determine using just logs and the HTTP protocol (however, this
number might be about as close as you will get).

Visits

  Whenever a request is made to the server from a given IP address
(site), the amount of time since a previous request by the address
is calculated (if any).  If the time difference is greater than a
pre-configured 'visit timeout' value (or the address has never made a
request before), it is considered a 'new visit', and this total is
incremented (both for the site, and the IP address).  The default
timeout value is 30 minutes (can be changed), so if a user visits your
site at 1:00 in the afternoon, and then returns at 3:00, two visits
would be registered.
Note: in the 'Top Sites' table, the visits total should be discounted
on 'Grouped' records, and thought of as the "Minimum number of visits"
that came from that grouping instead.  Note: Visits only occur on
PageType requests, that is, for any request whose URL is one of the
'page' types defined with the PageType and PagePrefix option, and not
excluded by the OmitPage option.  Due to the limitation of the HTTP
protocol, log rotations and other factors, this number should not be
taken as absolutely accurate; rather, it should be considered a pretty
close "guess".
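The visit rule can be expressed compactly.  The following sketch uses a
hypothetical function name (count_visits) and assumes the input has
already been reduced to page requests given as (timestamp in seconds,
IP address) pairs in chronological order.

```python
def count_visits(requests, timeout=30 * 60):
    """Count visits from (timestamp_seconds, ip) page requests.

    A request starts a new visit if the address has not been seen
    before, or if more than 'timeout' seconds have passed since its
    previous request.
    """
    last_seen = {}
    visits = 0
    for stamp, ip in requests:
        previous = last_seen.get(ip)
        if previous is None or stamp - previous > timeout:
            visits += 1
        last_seen[ip] = stamp
    return visits
```

With the default 30 minute timeout, requests from one address at 1:00,
1:10 and 3:00 in the afternoon count as two visits, matching the example
above.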

KBytes

  The KBytes (kilobytes) value shows the amount of data, in KB, that
was sent out by the server during the specified reporting period.  This
value is generated directly from the log file, so it is up to the
web server to produce accurate numbers in the logs  (some web servers
do stupid things when it comes to reporting the number of bytes).  In
general, this should be a fairly accurate representation of the amount
of outgoing traffic the server had, regardless of the web server's
reporting quirks.

Note: A kilobyte is 1024 bytes, not 1000 :)

Top Entry and Exit Pages

  The Top Entry and Exit tables give a rough estimate of what URLs
are used to enter your site, and what the last pages viewed are.
Because of limitations in the HTTP protocol, log rotations, etc...
these numbers should be considered a good "rough guess" of the actual
numbers, however they will give a good indication of the overall trend
in where users come into, and exit, your site.
276
277
278Command Line Options
279--------------------
280
281The Webalizer supports many different configuration options that will
282alter the way the program behaves and generates output.  Most of these
283can be specified on the command line, while some can only be specified
284in a configuration file. The command line options are listed below,
285with references to the corresponding configuration file keywords.
286
287--------------------------------------------------------------------------
288
289General Options
290---------------
291
292-h        Display all available command line options and exit program.
293
294-v        Be Verbose.  This will cause the program to print additional
295          information at run time.  It is the same as specifying
296          "Quiet no", "ReallyQuiet no" and "Debug yes" config options.
297
298-V        Display the program version and exit.  Additional program
299          specific information will be displayed if 'verbose' mode is
300          also used (e.g. '-vV'), which can be useful when submitting
301          bug reports.
302
303-d        Display additional 'debugging' information for errors and
304          warnings produced during processing.  This normally would
305          not be used except to determine why you are getting all those
306          errors and wanted to see the actual data.  Normally The
307          Webalizer will just tell you it found an error, not the
308          actual data.  This option will display the data as well.
309          Config file keyword: Debug
310
311-F        Specify the log file type to process.  Normally, the
312          Webalizer expects to find a valid CLF or Combined format
313          web server log file.  This option allows you to process
314          wu-ftpd xferlogs, squid and W3C formatted web logs as well.
315          Values can be either 'clf', 'ftp', 'squid' or 'w3c' with
316          'clf' being the default.  Only the first character needs
317          to be specified (eg: -Fs will process a squid log).
318          Config file keyword: LogType
319
320-f        Fold out of sequence log records back into analysis, by
321          treating them as if they were the same date/time as the
322          last good record.  Normally, out of sequence log records
323          are ignored.  If you run Apache, don't worry about this.
324          Config file keyword: FoldSeqErr
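
The folding behaviour described for -f can be sketched roughly as
follows.  This is only an illustration in Python, not the program's
actual code; it assumes that "out of sequence" means a record whose
timestamp is earlier than the last good record, and the function name
is hypothetical.

```python
from datetime import datetime


def fold_out_of_sequence(timestamps, fold=False):
    """Return the timestamps that would take part in the analysis.

    A record older than the last good record is out of sequence.
    With fold=False it is ignored; with fold=True it is kept, but
    treated as if it had the date/time of the last good record.
    """
    result = []
    last_good = None
    for ts in timestamps:
        if last_good is not None and ts < last_good:
            if fold:
                result.append(last_good)  # reuse last good date/time
            continue                      # otherwise drop the record
        last_good = ts
        result.append(ts)
    return result


records = [datetime(2013, 1, 1, 10), datetime(2013, 1, 1, 9),
           datetime(2013, 1, 1, 11)]
print(len(fold_out_of_sequence(records)))             # 2
print(len(fold_out_of_sequence(records, fold=True)))  # 3
```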
325
326-i        Ignore history file.  USE WITH CAUTION.  This causes The
327          Webalizer to ignore any existing history file produced from
328          previous runs and generate its output from scratch.  The
329          effect will be as if The Webalizer is being run for the
330          first time, and any previous statistics will be lost from the
331          main index.html (yearly) web page (although the HTML
332          documents, if any, will not be deleted).
333          Config file keyword: IgnoreHist
334
335-b        Ignore incremental data file.  USE WITH CAUTION.  This causes
336          The Webalizer to ignore any existing incremental (state) data
337          file produced by previous runs.  By ignoring the incremental
338          data file, all previous processing for the current month will
339          be lost, and those logs must be re-processed.
340          Config file keyword: IgnoreState
341
342-p        Preserve state (incremental processing).  This allows the
343          processing of partial logs in increments.  At the end of
344          the program, all relevant internal data is saved, so that
345          it may be restored the next time the program is run.  This
346          allows sites that must rotate their logs more than once a
347          month to still be able to use The Webalizer, and not worry
348          about having to gather and feed an entire month's logs to
349          the program at the end of the month.  See the section on
350          "Incremental Processing" below for additional information.
351          The default is to not perform incremental processing.  Use
352          this command line option to enable the feature.
353          Config file keyword: Incremental
354
355-q        Quiet mode.  Normally, The Webalizer will produce various
356          messages while it runs letting you know what it's doing.
357          This option will suppress those messages.  It should be
358          noted that this WILL NOT suppress errors and warnings, which
359          are output to STDERR.
360          Config file keyword: Quiet
361
362-Q        ReallyQuiet mode.  This allows suppression of _all_ messages
363          generated by The Webalizer, including warnings and errors.
364          Useful when The Webalizer is run as a cron job.
365          Config file keyword: ReallyQuiet
366
367-T        Display timing information.  The Webalizer keeps track of the
368          time it begins and ends processing, and normally displays the
369          total processing time at the end of each run.  If quiet mode
370          (-q or 'Quiet yes' in configuration file) is specified, this
371          information is not displayed.  This option forces the display
372          of timing totals if quiet mode has been specified, otherwise
373          it is redundant and will have no effect.
374          Config file keyword: TimeMe
375
376-c file   This option specifies a configuration file to use.  Configuration
377          files allow greater control over how The Webalizer behaves, and
378          there are several ways to use them.  As of version 0.98, The
379          Webalizer searches for a default configuration file in the
380          current directory named "webalizer.conf", and if not found,
381          will search in the /etc/ directory for a file of the same name.
382          In addition, you may specify a configuration file to use with
383          this command line option.
384
385-n name   This option specifies the hostname for the reports generated.
386          The hostname is used in the title of all reports, and is also
387          prepended to URLs in the reports.  This allows The Webalizer
388          to be run on log files for 'virtual' web servers or web servers
389          that are different than the machine the reports are located on,
390          and still allows clicking on the URLs to go to the proper
391          location.  If a hostname is not specified, either on the
392          command line or in a configuration file, The Webalizer attempts
393          to determine the hostname using a 'uname' system call.  If this
394          fails, "localhost" will be used as the hostname.
395          Config file keyword: HostName
396
397-o dir    This option specifies the output directory for the reports.
398          If not specified here or in a configuration file, the current
399          default directory will be used for output.
400          Config file keyword: OutputDir
401
402-x name   This option allows the generated pages to have an extension
403          other than '.html', which is the default.  Do not include the
404          leading period ('.') when you specify the extension.
405          Config file keyword: HTMLExtension
406
407-P name   Specify the file extensions for 'pages'.  Pages (sometimes
408          called 'PageViews') are normally html documents and CGI
409          scripts that display the whole page, not just parts of it.
410          Some systems will need to define a few more, such as 'phtml',
411          'php3' or 'pl' in order to have them counted as well.  The
412          default is 'htm*' and 'cgi' for web logs and 'txt' for ftp.
413          Config file keyword: PageType
414
415-O name   Specify URLs which are not counted as 'pages'.  Requests
416          matching one of these URLs will not be counted as a page, even
417          if they have an extension matching one of the PageTypes defined
418          above or have no extension at all.
419          Config file keyword: OmitPage
420
421-t name   This option specifies the title string for all reports.  This
422          string is used, in conjunction with the hostname (if not blank)
423          to produce the actual title.  If not specified, the default of
424          "Usage Statistics for" will be used.
425          Config file keyword: ReportTitle
426
427-Y        Suppress Country graph.  Normally, The Webalizer produces
428          country statistics in both Graph and Columnar forms.  This
429          option will suppress the Country Graph from being generated.
430          Config file keyword: CountryGraph
431
432-G        Suppress hourly graph.  Normally, The Webalizer produces
433          hourly statistics in both Graph and Columnar forms.  This
434          option will suppress the Hourly Graph only from being generated.
435          Config file keyword: HourlyGraph
436
437-H        Suppress Hourly statistics.  Normally, The Webalizer produces
438          hourly statistics in both Graph and Columnar forms.  This
439          option will suppress the Hourly Statistics table only from
440          being generated.
441          Config file keyword: HourlyStats
442
443-K num    Specify how many months should be displayed in the main index
444          (yearly summary) table.  Default is 12 months.  Can be set to
445          anything between 12 and 120 months (1 to 10 years).
446          Config file keyword: IndexMonths
447
448-k num    Specify how many months should be displayed in the main index
449          (yearly summary) graph.  Default is 12 months.  Can be set to
450          anything between 12 and 72 months (1 to 6 years).
451          Config file keyword: GraphMonths
452
453-L        Disable Graph Legends.  The color coded legends displayed on
454          the in-line graphs can be disabled with this option.  The
455          default is to display the legends.
456          Config file keyword: GraphLegend
457
458-l num    Graph Lines.  Specify the number of background reference
459          lines displayed on the in-line graphics produced.  The default
460          is 2 lines, but it can range anywhere from zero ('0') for
461          no lines, up to 20 lines (looks funny!).
462          Config file keyword: GraphLines
463
464-P name   Page type.  This is the extension of files you consider to
465          be pages for Pages calculations (sometimes called 'pageviews').
466          The default is 'htm*' and 'cgi' (plus whatever HTMLExtension
467          you specified if it is different).  Don't use a period!
468
469-m num    Specify a 'visit timeout'.  Visits are calculated by looking at
470          the time difference between the current and last request made
471          by a specific host.  If the difference is greater than the
472          visit timeout value, the request is considered a new visit.
473          This value is specified in number of seconds.  The default
474          is 30 minutes (1800).
475          Config file keyword: VisitTimeout
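
The visit calculation described above can be sketched in a few lines
of Python.  This is a simplified illustration, not the program's own
code: it assumes the requests arrive sorted by time and that the first
request seen from a host starts a visit.  The function name and the
sample hosts are hypothetical.

```python
def count_visits(requests, timeout=1800):
    """Count visits from (host, unix_time) pairs sorted by time.

    A request starts a new visit when the host has not been seen
    before, or when more than `timeout` seconds have passed since
    that host's previous request.
    """
    last_seen = {}
    visits = 0
    for host, t in requests:
        prev = last_seen.get(host)
        if prev is None or t - prev > timeout:
            visits += 1
        last_seen[host] = t
    return visits


log = [("a.example.com", 0), ("b.example.com", 100),
       ("a.example.com", 600), ("a.example.com", 2401)]
print(count_visits(log))  # 3
```

In the sample, host "a" is counted twice because its last request came
1801 seconds after the previous one, which is just over the default
timeout.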
476
477-M num    Mangle user agent names.  Normally, The Webalizer will keep
478          track of the user agent field verbatim.  Unfortunately, there are
479          a ton of different names that user agents go by, and the field
480          also reports other items such as machine type and OS used. For
481          example, Netscape 4.03 running on Windows 95 will report a
482          different string than Netscape 4.03 running on Windows NT, so even
483          though they are the same browser type, they will be considered
484          as two totally different browsers by The Webalizer.  For that
485          matter, Netscape 4.0 running on Windows NT will report different
486          names if one is run on an Alpha and the other on an Intel
487          processor!  Internet Exploder is even worse, as it reports itself
488          as if it were Netscape and you have to search the given string a
489          little deeper to discover that it is really MSIE!  This option
490          causes The Webalizer to 'mangle' the user agent field in an
491          attempt to consolidate generic browser types.  There are 6
492          levels (0 through 5) that can be specified, each producing a
493          different level of detail.  Level 5
494          displays only the browser name (MSIE or Mozilla) and the major
495          version number.  Level 4 will also display the minor version
496          number (single decimal place).  Level 3 will display the minor
497          version number to two decimal places.  Level 2 will add any
498          sub-level designation (such as Mozilla/3.01Gold or MSIE 3.0b).
499          Level 1 will also attempt to add the system type.  The default
500          Level 0 will disable name mangling and leave the user agent
501          field unmodified, producing the greatest amount of detail.
502          Configuration file keyword: MangleAgents
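
To give a feel for what the levels do, here is a much simplified
Python sketch covering levels 0, 3, 4 and 5 only.  It is an assumption
about how such mangling could work, based solely on the description
above, and is not the code The Webalizer uses; levels 1 and 2 (system
type and sub-level designation) are left out.  The function name and
the sample agent strings are illustrative.

```python
import re


def mangle_agent(agent: str, level: int) -> str:
    """Reduce a user agent string to a browser name and version."""
    if level == 0:
        return agent  # no mangling, field left unmodified
    if level not in (3, 4, 5):
        raise ValueError("this sketch only covers levels 0, 3, 4 and 5")

    # MSIE reports itself as Mozilla, so look for it first.
    msie = re.search(r"MSIE (\d+)(?:\.(\d+))?", agent)
    if msie:
        name, sep = "MSIE", " "
        major, minor = msie.group(1), msie.group(2) or ""
    else:
        m = re.match(r"([^/\s]+)/(\d+)(?:\.(\d+))?", agent)
        if not m:
            return agent  # unrecognized format, leave as-is
        name, sep = m.group(1), "/"
        major, minor = m.group(2), m.group(3) or ""

    if level == 5:
        return f"{name}{sep}{major}"
    digits = 1 if level == 4 else 2  # decimal places to keep
    version = f"{major}.{minor[:digits]}" if minor else major
    return f"{name}{sep}{version}"


print(mangle_agent("Mozilla/4.03 [en] (WinNT; I)", 5))  # Mozilla/4
print(mangle_agent("Mozilla/4.03 [en] (WinNT; I)", 3))  # Mozilla/4.03
```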
503
504-g num    This option allows you to specify the level of domain name
505          grouping to be performed.  The numeric value represents the
506          level of grouping, and can be thought of as the 'number of
507          dots' to be displayed.  The default value of 0 disables any
508          domain name grouping.
509          Configuration file keyword: GroupDomains
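
One way to picture the 'number of dots' idea is the short Python
sketch below.  It assumes that a level of N keeps the last N+1 labels
of a host name (so N dots are displayed); numeric IP addresses are not
handled.  The function name and host names are hypothetical.

```python
def group_domain(hostname: str, level: int) -> str:
    """Keep only the trailing part of a host name with `level` dots."""
    if level <= 0:
        return hostname  # 0 disables grouping
    labels = hostname.split(".")
    return ".".join(labels[-(level + 1):])


print(group_domain("www.eng.example.com", 1))  # example.com
print(group_domain("www.eng.example.com", 2))  # eng.example.com
```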
510
511-D name   This allows the specification of a DNS Cache file name.  This
512          filename MUST be specified if you have dns lookups enabled
513          (using the -N command line switch or DNSChildren configuration
514          keyword).  The filename is relative to the default output
515          directory if an absolute path is not specified (ie: starts
516          with a leading '/').  This option is only available if DNS
517          support was enabled at compile time, otherwise an 'Invalid
518          Keyword' error will be generated.  See the DNS.README file
519          for additional information regarding DNS lookups.
520          Configuration file keyword: DNSCache
521
522-N num    Number of DNS child processes to use for reverse DNS lookups.
523          If specified, a DNSCache name MUST be specified also.  If you
524          do not wish a DNS cache file to be generated, specify a value
525          of zero ('0') to disable it.  This does not prevent using an
526          existing cache file, only the generation of one at run time.
527          See the DNS.README file for additional information.
528          Configuration file keyword: DNSChildren
529
530-j        Enable native GeoDB geolocation services.
531          Configuration file keyword: GeoDB
532
533-J name   Specify an alternate GeoDB database filename to use.  This
534          shouldn't normally be needed.  If used, the filename 'name'
535          is relative to the output directory being used unless an
536          absolute path is specified (ie: starts with a leading '/').
537          Configuration file keyword: GeoDBDatabase
538
539-w        Enable GeoIP support if it is available.
540          Configuration file keyword: GeoIP
541
542-W name   Specify an alternate GeoIP database filename to use.  This
543          shouldn't normally be needed.  If used, the filename 'name'
544          is relative to the specified output directory unless an
545          absolute name is given (ie: starts with a leading '/').
546          Configuration file keyword: GeoIPDatabase
547
548-z name   Specify location of the country flag graphics and enable
549          their display in the top country table.  The directory name
550          is relative to the output directory unless an absolute path
551          is specified (ie: starts with a leading '/').
552          Configuration file keyword: FlagDir
553
554Hide Options
555------------
556
557The following options take a string argument to use as a comparison
558for matching.  Except for the IndexAlias option, the string argument
559can be plain text, or plain text that either starts or ends with the
560wildcard character '*'.
561
562For Example:
563
564Given the string "yourmama/was/here", the arguments "was", "*here" and
565"your*" will all produce a match.
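
The matching rule can be sketched in Python as shown below.  This is
an illustration of the behaviour described here, not the program's own
code, and the function name is hypothetical: plain text matches
anywhere in the string, a leading '*' matches the end of the string,
and a trailing '*' matches the start.

```python
def hide_match(pattern: str, value: str) -> bool:
    """Return True if `value` matches a hide-style pattern."""
    if pattern.startswith("*"):
        return value.endswith(pattern[1:])    # "*here"
    if pattern.endswith("*"):
        return value.startswith(pattern[:-1])  # "your*"
    return pattern in value                    # "was"


for arg in ("was", "*here", "your*"):
    print(arg, hide_match(arg, "yourmama/was/here"))  # all True
```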
566
567
568-a name   This option allows hiding of user agents (browsers) from the
569          "Top User Agents" table in the report.  This option really
570          isn't too useful as there are a zillion different names that
571          current browsers go by, depending where they were obtained,
572          however you might have some particular user agents that hit
573          your site a lot that you would like to exclude from the list.
574          You must have a web server that includes user agents in its
575          log files for this option to be of any use.  In addition, it
576          is also useless if you disable the user agent table in the
577          report (see the -A command line option or "TopAgents"
578          configuration file keyword). You can specify as many of these
579          as you want on the command line.  The wildcard character '*'
580          can be used either in front of or at the end of the string.
581          (ie: Mozilla/4.0* would match anything that starts with the
582          string "Mozilla/4.0").
583          Config file keyword: HideAgent
584
585-r name   This option allows hiding of referrers from the "Top Referrer"
586          table in the report.  Referrers are URLs, either on your own
587          local site or a remote site, that referred the user to a URL
588          on your web server.  This option is normally used to hide
589          your own server from the table, as your own pages are usually
590          the top referrers to your own pages (well, you get the idea).
591          You must have a web server that includes referrer information
592          in the log files for this option to be of any use.  In addition,
593          it is also useless if you disable the referrers table in the
594          report (see the -R command line option or "TopReferrers"
595          configuration file keyword).  You can specify as many of these
596          as you like on the command line.
597          Config file keyword: HideReferrer
598
599-s name   This option allows hiding of sites from the "Top Sites" table
600          in the report.  Normally, you will only want to hide your own
601          domain name from the report, as it usually is one of the top
602          sites to visit your web server.  This option is of no use if
603          you disable the top sites table in the report (see the -S
604          command line option or "TopSites" configuration file option).
605          Config file keyword: HideSite
606
607-X        This causes all individual sites to be hidden, which results
608          in only grouped sites being displayed on the report.
609          Config file keyword: HideAllSites
610
611-u name   This option allows hiding of URLs from the "Top URLs" table
612          in the report.  Normally, this option is used to hide images,
613          audio files and other objects your web server dishes out that
614          would otherwise clutter up the table.  This option is of no
615          use if you disable the top URLs table in the report (see the
616          -U command line option or "TopURLs" configuration file keyword).
617          Config file keyword: HideURL
618
619-I name   This option allows you to specify additional index.html aliases.
620          The Webalizer usually strips the string 'index.*' from URLs
621          before processing (unless disabled using the 'DefaultIndex'
622          config option), which has the effect of turning a URL such
623          as /somedir/index.html into just /somedir/ which is really the
624          same URL and should be treated as such.  This option allows you
625          to specify _additional_ strings that are to be treated the same
626          way.  Use with care, improper use could cause unexpected results.
627          For example, if you specify the alias string of 'home', a URL
628          such as /somedir/homepages/brad/home.html would be converted
629          into just /somedir/ which probably isn't what was intended.
630          This option is useful if your web server uses a default
631          index page other than the standard 'index.html' or 'index.htm',
632          such as 'home.html' or 'homepage.html'.  The string specified
633          is searched for _anywhere_ in the URL, so "home.htm" would
634          turn both "/somedir/home.htm" and "/somedir/home.html" into
635          just "/somedir/". Wildcards are _not_ allowed on this one.
636          Config file keyword: IndexAlias
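
The index alias behaviour described for -I can be illustrated with the
following Python sketch.  It assumes the URL is simply cut off at the
first place an alias string is found, which reproduces the examples
given above; the real implementation may differ in detail, and the
function name is hypothetical.

```python
def strip_index(url: str, aliases=("index.",)) -> str:
    """Cut a URL at the first occurrence of any index alias."""
    for alias in aliases:
        pos = url.find(alias)  # searched for anywhere in the URL
        if pos != -1:
            return url[:pos]
    return url


print(strip_index("/somedir/index.html"))  # /somedir/

# An alias that is too short gives the unintended result noted above.
print(strip_index("/somedir/homepages/brad/home.html",
                  aliases=("index.", "home")))  # /somedir/
```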
637
638Table Size Options
639------------------
640
641-e num    This option specifies the number of entries to display in the
642          "Top Entry Pages" table.  To disable the table, use a value of
643          zero (0).
644          Config file keyword: TopEntry
645
646-E num    This option specifies the number of entries to display in the
647          "Top Exit Pages" table.  To disable the table, use a value of
648          zero (0).
649          Config file keyword: TopExit
650
651-A num    This option specifies the number of entries to display in the
652          "Top User Agents" table.  To disable the table, use a value of
653          zero (0).
654          Config file keyword: TopAgents
655
656-C num    This option specifies the number of entries to display in the
657          "Top Countries" table.  To disable the table, use a value of
658          zero (0).
659          Config file keyword: TopCountries
660
661-R num    This option specifies the number of entries to display in the
662          "Top Referrers" table.  To disable the table, use a value of
663          zero (0).
664          Config file keyword: TopReferrers
665
666-S num    This option specifies the number of entries to display in the
667          "Top Sites" table.  To disable the table, use a value of
668          zero (0).
669          Config file keyword: TopSites
670
671-U num    This option specifies the number of entries to display in the
672          "Top URLs" table.  To disable the table, use a value of
673          zero (0).
674          Config file keyword: TopURLs
675
676--------------------------------------------------------------------------
677
678
679CONFIGURATION FILES
680-------------------
681
682The Webalizer allows configuration files to be used in order to simplify
683life for all.  There are several ways that configuration files are accessed
684by the Webalizer.  When The Webalizer first executes, it looks for a
685default configuration file named "webalizer.conf" in the current directory,
686and if not found there, will look for "/etc/webalizer.conf".  In addition,
687configuration files may be specified on the command line with the '-c'
688option.  There are lots of different ways you can combine the use of
689configuration files and command line options to produce various results.
690The Webalizer always looks for and reads configuration options from a
691default configuration file before doing anything else.  Because of this,
692you can override options found in the default file by use of additional
693configuration files specified on the command line or command line options
694themselves.  If you specify a configuration file on the command line, you
695can override options in it by additional command line options which follow.
696For example, most users will likely want to create the default file
697/etc/webalizer.conf and place options in it to specify the hostname, log
698file, table options, etc...  At the end of the month when a different log
699file is to be used (the end of month log), you can run The Webalizer as
700usual, but put the different filename on the end of the command line, which
701will override the log file specified in the configuration file.  It should
702be noted that you cannot override some configuration file options by the
703use of command line arguments.  For example, if you specify "Quiet yes" in
704a configuration file, you cannot override this with a command line argument,
705as the command line option only _enables_ the feature (-q option).
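
The search order and the way later settings override earlier ones can
be pictured with the Python sketch below.  It is a simplification made
for illustration: the function names are hypothetical, options are
modelled as plain dictionaries, and the ordering of individual
arguments on the command line is not modelled.

```python
import os


def find_default_config(cwd=".", etc_dir="/etc"):
    """Look for webalizer.conf in the current directory, then /etc."""
    for directory in (cwd, etc_dir):
        candidate = os.path.join(directory, "webalizer.conf")
        if os.path.isfile(candidate):
            return candidate
    return None  # no default file; built-in defaults apply


def effective_options(defaults, default_file, c_file, cmdline):
    """Merge option layers; each later layer overrides earlier ones."""
    merged = dict(defaults)
    for layer in (default_file, c_file, cmdline):
        merged.update(layer)
    return merged


opts = effective_options(
    {"Quiet": "no", "HostName": "localhost"},  # built-in defaults
    {"HostName": "www.example.com"},           # default config file
    {},                                        # file given with -c
    {"Quiet": "yes"},                          # -q on command line
)
print(opts["HostName"], opts["Quiet"])  # www.example.com yes
```

Because a switch such as -q can only add "Quiet yes" to the last
layer, there is no way in this model to turn a "Quiet yes" from a
configuration file back off from the command line, which matches the
limitation noted above.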
706
707The configuration files are standard ASCII text files that may be created
708or edited using any standard editor.  Blank lines and lines that begin
709with a pound sign ('#') are ignored.  Any other lines are considered to
710be configuration lines, and have the form "Keyword Value", where the
711'Keyword' is one of the currently available configuration keywords defined
712below, and 'Value' is the value to assign to that particular option.  Any
713text found after the keyword up to the end of the line is considered the
714keyword's value, so you should not include anything after the actual value
715on the line that is not actually part of the value being assigned.  The
716file "sample.conf" provided with the distribution contains lots of useful
717documentation and examples as well.  It should be noted that you do not
718have to use any configuration files at all, in which case, default values
719will be used (which should be sufficient for most sites).
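
The file format described above is simple enough to sketch a reader
for it in Python.  This is only an illustration and not the parser
The Webalizer uses; the function name and sample values are
hypothetical.  It returns a list rather than a dictionary because
some keywords may appear more than once.

```python
def parse_config(text: str):
    """Return (keyword, value) pairs from configuration file text."""
    entries = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are ignored
        parts = stripped.split(None, 1)
        keyword = parts[0]
        # Everything after the keyword, up to end of line, is the value.
        value = parts[1] if len(parts) > 1 else ""
        entries.append((keyword, value))
    return entries


sample = """\
# Sample settings
HostName     www.example.com

ReportTitle  Usage Statistics for
HideURL      *.gif
HideURL      *.jpg
"""
for keyword, value in parse_config(sample):
    print(keyword, "=", value)
```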
720
721--------------------------------------------------------------------------
722
723General Configuration Keywords
724------------------------------
725
726LogFile       This defines the log file to use. It should be a fully qualified
727              name (ie: contain the path), but relative names will work as
728              well.  If not specified, the logfile defaults to STDIN.
729
730LogType       This specifies the log file type being used.  Normally, The
731              Webalizer processes web logs in either CLF or Combined format.
732              You may also process wu-ftpd xferlog formatted logs, squid
733              proxy logs or W3C formatted web logs by setting the appropriate
734              type using this keyword.   Values may be either 'clf', 'ftp',
735              'squid' or 'w3c'.  Ensure that you specify the proper file type,
736              otherwise you will be presented with a long stream of 'invalid
737              record' messages when the Webalizer is run ;)
738              Command line argument: -F
739
740OutputDir     This defines the output directory to use for the reports.  If
741              it is not specified, the current directory is used.
742              Command line argument: -o
743
744HistoryName   Allows specification of a history path/filename if desired.
745              The default is to use the file named 'webalizer.hist', kept
746              in the normal output directory (OutputDir above).  Any name
747              specified is relative to the normal output directory unless
748              an absolute path name is given (ie: starts with a '/').
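
The rule for resolving such file names (relative to the output
directory unless the name starts with a '/') can be sketched in
Python as follows.  The function name and the paths are hypothetical
and only serve as an illustration.

```python
import posixpath


def resolve_in_output_dir(output_dir: str, name: str) -> str:
    """Resolve a file name against the output directory."""
    if name.startswith("/"):
        return name  # absolute path, used as given
    return posixpath.join(output_dir, name)


print(resolve_in_output_dir("/var/www/usage", "webalizer.hist"))
# /var/www/usage/webalizer.hist
print(resolve_in_output_dir("/var/www/usage", "/var/lib/site.hist"))
# /var/lib/site.hist
```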
749
750ReportTitle   This specifies the title to use for the generated reports.
751              It is used in conjunction with the hostname (unless blank)
752              to produce the final report titles.  If not defined, the
753              default of "Usage Statistics for" is used.
754              Command line argument: -t
755
756HostName      This defines the hostname.  The hostname is used in the
757              report title as well as being prepended to URLs in the
758              "Top URLs" table.  This allows The Webalizer to be run
759              on "virtual" web servers, or servers that do not reside
760              on the local machine, and allows clicking on the URL to
761              go to the right place.  If not specified, The Webalizer
762              attempts to get the hostname via a 'uname' system call,
763              and if that fails, will default to "localhost".
764              Command line argument: -n
765
766UseHTTPS      Causes the links in the 'Top URLs' table to use 'https://'
767              instead of the default 'http://' prefix.  Not much use if
768              you run a mix of secure/insecure servers on your machine.
769              Only useful if you run the analysis on a secure server's
770              logs, and want the links in the table to work properly.
771
772HTAccess      Enables the creation of a default .htaccess file in the
773              output directory.  If enabled, the file will be created
774              (with a single "DirectoryIndex" directive),  unless one
775              already exists.  The default is 'no', which disables the
776              creation of any .htaccess files.
777
778Quiet         This allows you to enable or disable informational messages
779              while it is running.  The values for this keyword can be
780              either 'yes' or 'no'.  Using "Quiet yes" will suppress these
781              messages, while "Quiet no" will enable them.  The default
782              is 'no' if not specified, which will allow The Webalizer
783              to display informational messages.  It should be noted that
784              this option has no effect on Warning or Error messages that
785              may be generated, as they go to STDERR.
786              Command line argument: -q
787
788ReallyQuiet   This allows all generated output to be suppressed, including
789              warning and error messages.  The values for this keyword
790              can be either 'yes' or 'no', with 'no' being the default.
791              Command line argument: -Q
792
793TimeMe        This allows you to display timing information regardless of
794              any "quiet mode" specified.  Useful only if you did in fact
795              tell the webalizer to be quiet either by using the -q command
796              line option or the "Quiet" keyword, otherwise timing stats
797              are normally displayed anyway.  Values may be either 'yes'
798              or 'no', with the default being 'no'.
799              Command line argument: -T
800
801GMTTime       This keyword allows timestamps to be displayed in GMT (UTC)
802              time instead of local time.  Normally The Webalizer will
803              display timestamps in the time-zone of the local machine
804              (ie: PST or EDT).  This keyword allows you to specify the
805              display of timestamps in GMT (UTC) time instead.  Values
806              may be either 'yes' or 'no'.  Default is 'no'.
807
808Debug         This tells The Webalizer to display additional information
809              when it encounters Warnings or Errors.  Normally, The
810              Webalizer will just tell you it found a bad record or
811              field.  This option will enable the display of the actual
812              data that produced the Warning or Error as well.  Useful
813              only if you start getting lots of Warnings or Errors and
814              want to determine the cause.  Values may be either 'yes'
815              or 'no', with the default being 'no'.
816              Command line argument: -d
817
818IgnoreHist    This suppresses the reading of a history file.  USE WITH
819              EXTREME CAUTION as the history file is how The Webalizer
820              keeps track of previous months.  The effect of this option
821              is as if The Webalizer was being run for the very first
822              time, and any previous data is discarded.  Values may be
823              either 'yes' or 'no', with the default being 'no'.
824              Command line argument: -i
825
826IgnoreState   This suppresses the reading of an existing incremental
827              data file.  USE WITH EXTREME CAUTION!  By ignoring an
828              existing incremental data file, all previous processing
829              for the current month will be lost, and those logs must
830              be re-processed.  Values may be 'yes' or 'no', with the
831              default being 'no'.
832              Command line argument: -b
833
834FoldSeqErr    Allows log records that are out of sequence to be folded
835              back into the analysis, by treating them as if they had
836              the same date/time as the last good record.  Normally,
837              out of sequence log records are simply ignored.  If you
838              run Apache, don't worry about this.
839
VisitTimeout  Set the 'visit timeout' value.  Visits are determined by
              looking at the time difference between the current and last
              request made by a specific site.  If the difference in time
              is greater than the visit timeout value, the request is
              considered a new visit.  The value is in number of seconds,
              and defaults to 30 minutes (1800).
              Command line argument: -m

PageType      Allows you to define the 'page' type extension.  Normally,
              people consider HTML and CGI scripts as 'pages'.  This
              option allows you to specify what extensions you consider
              a page.  Default is 'htm*' and 'cgi' for web logs, and
              'txt' for ftp logs.
              Command line argument: -P

PagePrefix    Allows all requests with a specified prefix to be considered
              as 'pages'.  Use this if you want everything under /documents
              to be treated as pages no matter what the extension is.  Also
              useful if you have cgi-scripts with PATH_INFO.

OmitPage      Allows specified URLs to not be counted as pages under any
              circumstance, even if they have an extension matching a
              PageType or PagePrefix as defined above.

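As a rough illustration of how the three page keywords above might be
combined in a configuration file, consider the following fragment.  The
extensions and paths are placeholders chosen for the example, not
recommended values:

    # Count .htm/.html and .php requests as pages
    PageType      htm*
    PageType      php
    # Treat everything under /documents as a page
    PagePrefix    /documents
    # ...but never count this URL as a page
    OmitPage      /robots.txt
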
GraphLegend   Enable/disable the display of color coded legends on the
              produced graphs.  Default is 'yes', to display them.
              Command line argument: -L

GraphLines    Specify the number of background reference lines to display
              on produced graphs.  The default is 2.  To disable the use
              of background lines, use zero ('0').
              Command line argument: -l

IndexMonths   Specify the number of months to display in the main index
              (yearly summary) table.  Default is 12 months.  Can be set
              to anything between 12 and 120 months (1 to 10 years).
              Command line argument: -K

YearHeaders   Enable/disable the display of year headers in the main index
              (yearly summary) table.  If enabled, year headers will be
              shown when the table is displaying more than 16 months worth
              of data.  Values can be 'yes' or 'no'.  Default is 'yes'.

GraphMonths   Specify the number of months to display in the main index
              (yearly summary) graph.  Default is 12 months.  Can be set
              to anything between 12 and 72 months (1 to 6 years).
              Command line argument: -k

CountryGraph  This keyword is used to either enable or disable the creation
              and display of the Country Usage graph.  Values may be either
              'yes' or 'no', with the default being 'yes'.
              Command line argument: -Y

CountryFlags  Enables or disables the display of flags in the top country
              table.  If enabled, the default directory 'flags' directly
              under the output directory will be used unless a different
              path is specified with the 'FlagDir' option below.
              Command line argument: -zflags

FlagDir       Specifies the location of flag graphics.  If not specified,
              the default is in the 'flags' directory directly under the
              output directory being used for the reports.  If specified,
              the display of flags will be enabled by default.
              Command line argument: -z

DailyGraph    This keyword is used to either enable or disable the creation
              and display of the Daily Usage graph.  Values may be either
              'yes' or 'no', with the default being 'yes'.

DailyStats    This keyword is used to either enable or disable the creation
              and display of the Daily Usage statistics table.  Values may
              be either 'yes' or 'no', with the default being 'yes'.

HourlyGraph   This keyword is used to either enable or disable the creation
              and display of the Hourly Usage graph.  Values may be either
              'yes' or 'no', with the default being 'yes'.
              Command line argument: -G

HourlyStats   This keyword is used to either enable or disable the creation
              and display of the Hourly Usage statistics table.  Values may
              be either 'yes' or 'no', with the default being 'yes'.
              Command line argument: -H

IndexAlias    This allows additional 'index.html' aliases to be defined.
              Normally, The Webalizer scans for and strips the string
              "index." from URLs before processing them (unless disabled
              using the DefaultIndex config option below).  This turns a
              URL such as /somedir/index.html into just /somedir/ which
              is really the same URL.  This keyword allows _additional_
              names to be treated in the same fashion for sites that use
              different default names, such as "home.html".  The string
              is scanned for anywhere in the URL, so care should be used
              if and when you define additional aliases.  For example,
              if you were to use an alias such as 'home', the URL
              /somedir/homepages/brad/home.html would be turned into just
              /somedir/ which probably isn't the intended result.  Instead,
              you should have specified 'home.htm' which would correctly
              turn the URL into /somedir/homepages/brad/ as intended.
              It should also be noted that specified aliases are scanned
              for in EVERY log record... A bunch of aliases will noticeably
              degrade performance as each record has to be scanned for
              every alias defined.  You don't have to specify 'index.' as
              it is always the default (unless disabled with the config
              option "DefaultIndex" described below).
              Command line argument: -I

DefaultIndex  This option is used to enable/disable the use of "index." as
              a default index name to be stripped from the end of a URL.
              Most sites should not need to use this option, however some
              may find it useful, particularly those whose default index
              file name is something different, or those sites that use
              'index.php' or similar URLs to generate dynamic content.
              This option does not affect any of the names that may be
              defined using the IndexAlias option, and those names will
              still function as described.  Values may be 'yes' or 'no',
              with 'yes' being the default.

MangleAgents  The MangleAgents keyword specifies the level of user agent
              name mangling, if any.  There are 6 levels that may be specified,
              each producing a different level of detail displayed.  Level 5
              displays only the browser name (MSIE or Mozilla) and the major
              version number.  Level 4 adds the minor version (single
              decimal place).  Level 3 adds the minor version to two decimal
              places.  Level 2 will also add any sub-level designation
              (such as Mozilla/3.01Gold or MSIE 3.0b).  Level 1 will also
              attempt to add the system type.  The default level 0 will
              leave the user agent field unmodified and produce the
              greatest amount of detail.
              Command line argument: -M

SearchEngine  This keyword allows specification of search engines and
              their query strings.  Search strings are obtained from
              the referrer field in the record, and in order to work
              properly, the Webalizer needs to know what query strings
              different search engines use.  The SearchEngine keyword
              allows you to specify the search engine and its query
              string to parse the search string from.  The line is formatted
              as:  "SearchEngine engine-string query-string"  where
              'engine-string' is a substring for matching the search
              engine with, such as "yahoo.com" or "altavista".  The
              'query-string' is the unique query string that is added
              to the URL for the search engine, such as "search=" or
              "MT=" with the actual search strings appended to the
              end.  There is no command line option for this keyword.

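A minimal sketch of SearchEngine entries, following the line format
given above.  The pairing of each engine with a query string is purely
illustrative and may not match what those engines actually use, so
check the referrers in your own logs before relying on it:

    # SearchEngine  engine-string  query-string
    SearchEngine    yahoo.com      search=
    SearchEngine    altavista      MT=

With entries like these, a referrer that contains "yahoo.com" would be
examined for "search=", and the text appended after it would be taken
as the search string.
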
SearchCaseI   The SearchCaseI option specifies if search strings should
              be lowercased (case insensitive) or not.  Since most
              search engines use case insensitive searches (ie: a
              search for "Hello" is the same as "HELLO" or "hello"),
              converting to lowercase will improve keyword accuracy,
              which is the default.  If desired, case sensitivity can
              be forced with this option.  The value can be 'yes' or
              'no', with 'yes' (case insensitive) being the default.

Incremental   This allows incremental processing to be enabled or disabled.
              Incremental processing allows processing partial logs without
              the loss of detail data from previous runs in the same month.
              This feature saves the 'internal state' of the program so that
              it may be restored in following runs.  See the section above
              titled "Incremental Processing" for additional information.
              The value may be 'yes' or 'no', with the default being 'no'.
              Command line argument: -p

IncrementalName
              Allows specification of the incremental data filename if
              desired.  Normally, the file named "webalizer.current" is
              used, kept in the standard output directory.  If specified,
              filenames are relative to the standard output directory,
              unless an absolute name is given (ie: starts with '/').

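For example, incremental processing with an explicitly named state file
might be configured as follows.  The filename shown is simply the
default name mentioned above, spelled out for clarity:

    Incremental      yes
    # Relative name, so it is kept in the output directory
    IncrementalName  webalizer.current
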
StripCGI      Determines if CGI variables should be stripped from the
              end of URLs or not.  Normally, these variables are removed
              from URLs to improve accuracy, however some sites may wish
              to keep them preserved (particularly on highly dynamic
              sites).  Values may be either 'yes' or 'no', with 'yes'
              being the default.

TrimSquidURL  Allows squid log URLs to be reduced in granularity by
              truncating them after a specified number of '/' path
              separators after the http:// portion.  A value of 1 will
              cause all URLs to be summarized by domain only.  The
              default value is zero (0), which leaves URLs unmodified.

DNSCache      Specifies the DNS cache filename.  This name is relative
              to the default output directory unless an absolute name
              is given (ie: starts with '/').  See the DNS.README file
              for additional information.
              Command line argument: -D

DNSChildren   The number of DNS children processes to run in order to
              create/update the DNS cache file.  If specified, the DNS
              cache filename must also be specified (see above).  Use
              a value of zero ('0') to disable.  See the DNS.README
              file for additional information.
              Command line argument: -N

CacheIPs      Specifies if unresolved addresses should also be cached
              in the DNS database.  If enabled, unresolved IP addresses
              will be stored along with resolved addresses.  This may
              be useful on some sites that have lots of unresolved IPs
              visiting so they are not looked up each time the program
              is run.  Values may be 'yes' or 'no'.  Default is 'no'.

CacheTTL      Specifies the Time To Live (TTL) value for cached DNS
              entries in days.  Default value is 7 (1 week).  Can be
              any value between 1 and 100.

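The four DNS keywords above are normally used together.  A sketch of
one possible combination follows; the cache filename and the number of
children are arbitrary example values, not tuned recommendations:

    # Cache file, relative to the output directory
    DNSCache      dns_cache.db
    # Requires DNSCache to be set; 0 would disable lookups
    DNSChildren   10
    # Also remember addresses that failed to resolve
    CacheIPs      yes
    # Keep cached entries for 7 days (the default)
    CacheTTL      7
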
GeoDB         Controls the use of the native GeoDB geolocation services
              provided by The Webalizer.  Values may be 'yes' or 'no'
              with 'no' being the default.
              Command line argument: -j

GeoDBDatabase Specifies an alternate GeoDB database filename to use.
              This is relative to the output directory being used unless
              an absolute path is given (ie: starts with a '/').
              Command line argument: -J

GeoIP         Controls the use of GeoIP geolocation services.  If The
              Webalizer was compiled with GeoIP support, it is used by
              default.  Values may be 'yes' or 'no'. Default is 'yes'.
              Command line argument: -w

GeoIPDatabase Specifies an alternate GeoIP database filename to use.
              This name is relative to the default output directory
              unless an absolute name is given (ie: starts with '/').
              Command line argument: -W

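As an illustration only, a site that prefers the native GeoDB service
over GeoIP might use something like the fragment below.  The database
path is an assumed location and will differ between systems:

    GeoDB          yes
    # Absolute path (hypothetical location for this example)
    GeoDBDatabase  /usr/share/GeoDB/GeoDB.dat
    GeoIP          no
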

Top Table Keywords
------------------

TopAgents     This allows you to specify how many "Top" user agents are
              displayed in the "Top User Agents" table.  The default
              is 15.  If you do not want to display user agent statistics,
              specify a value of zero (0).  The display of user agents
              will only work if your web server includes this information
              in its log file (ie: a combined log format file).
              Command line argument: -A

AllAgents     Will cause a separate HTML page to be generated for all
              normally visible User Agents.  A link will be added to
              the bottom of the "Top User Agents" table if enabled.
              Value can be either 'yes' or 'no', with 'no' being the
              default.

TopCountries  This allows you to specify how many "Top" countries are
              displayed in the "Top Countries" table.  The default is
              30.  If you want to disable the countries table, specify
              a value of zero (0).
              Command line argument: -C

TopReferrers  This allows you to specify how many "Top" referrers are
              displayed in the "Top Referrers" table.  The default is
              30.  If you want to disable the referrers table, specify
              a value of zero (0).  The display of referrer information
              will only work if your web server includes this information
              in its log file (ie: a combined log format file).
              Command line argument: -R

AllReferrers  Will cause a separate HTML page to be generated for all
              normally visible Referrers.  A link will be added to the
              "Top Referrers" table if enabled.  Value can be either
              'yes' or 'no', with 'no' being the default.

TopSites      This allows you to specify how many "Top" sites are
              displayed in the "Top Sites" table.  The default is 30.
              If you want to disable the sites table, specify a value
              of zero (0).
              Command line argument: -S

TopKSites     Identical to TopSites, except for the 'by KByte' table.
              Default is 10.  No command line switch for this one.

AllSites      Will cause a separate HTML page to be generated for all
              normally visible Sites.  A link will be added to the
              bottom of the "Top Sites" table if enabled.  Value can
              be either 'yes' or 'no', with 'no' being the default.

TopURLs       This allows you to specify how many "Top" URLs are
              displayed in the "Top URLs" table.  The default is 30.
              If you want to disable the URLs table, specify a value
              of zero (0).
              Command line argument: -U

TopKURLs      Identical to TopURLs, except for the 'by KByte' table.
              Default is 10.  No command line switch for this one.

AllURLs       Will cause a separate HTML page to be generated for all
              normally visible URLs.  A link will be added to the
              bottom of the "Top URLs" table if enabled.  Value can
              be either 'yes' or 'no', with 'no' being the default.

TopEntry      Allows you to specify how many "Top Entry Pages" are
              displayed in the table.  The default is 10.  If you
              want to disable the table, specify a value of zero (0).
              Command line argument: -e

TopExit       Allows you to specify how many "Top Exit Pages" are
              displayed in the table.  The default is 10.  If you
              want to disable the table, specify a value of zero (0).
              Command line argument: -E

TopSearch     Allows you to specify how many "Top Search Strings" are
              displayed in the table.  The default is 20.  If you
              want to disable the table, specify a value of zero (0).
              Only works if using combined log format (ie: contains
              referrer information).

TopUsers      This allows you to specify how many "Top" usernames are
              displayed in the "Top Usernames" table.  Usernames are
              only available if you use http authentication on your
              web server, or when processing wu-ftpd xferlogs.  The
              default value is 20.  If you want to disable the Username
              table, specify a value of zero (0).

AllUsers      Will cause a separate HTML page to be generated for all
              normally visible usernames.  A link will be added to the
              bottom of the "Top Usernames" table if enabled.  Value
              can be either 'yes' or 'no', with 'no' being the default.

AllSearchStr  Will cause a separate HTML page to be generated for all
              normally visible Search Strings.  A link will be added
              to the bottom of the "Top Search Strings" table if
              enabled.  Value can be either 'yes' or 'no', with 'no'
              being the default.

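To show how the Top and All keywords fit together, here is a small
example fragment.  The numbers are arbitrary; it keeps the default
sites table, disables the user agent table, and asks for full listings
of URLs and search strings on separate pages:

    TopSites      30
    TopKSites     10
    # Zero disables the "Top User Agents" table
    TopAgents     0
    TopSearch     20
    # Generate separate "all" pages, linked from the Top tables
    AllURLs       yes
    AllSearchStr  yes
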

Hide Object Keywords
--------------------

These keywords allow you to hide user agents, referrers, sites, URLs
and usernames from the various "Top" tables.  The values for these keywords
are the same as those used in their command line counterparts.  You
can specify as many of these as you want without limit.  Refer to the
section above on "Command Line Options" for a description of the string
formatting used as the value.  Values cannot exceed 80 characters in
length.

HideAgent     This allows specified user agents to be hidden from the
              "Top User Agents" table.  Not very useful, since there
              are a zillion different names that browsers go by today,
              but could be useful if there is a particular user agent
              (ie: robots, spiders, real-audio, etc..) that hits your
              site frequently enough to make it into the top user agent
              listing.  This keyword is useless if 1) your log file does
              not provide user agent information or 2) you disable the
              user agent table.
              Command line argument: -a

HideReferrer  This allows you to hide specified referrers from the
              "Top Referrers" table.  Normally, you would only specify
              your own web server to be hidden, as it is usually the
              top generator of references to your own pages.  Of course,
              this keyword is useless if 1) your log file does not include
              referrer information or 2) you disable the top referrers
              table.
              Command line argument: -r

HideSite      This allows you to hide specified sites from the "Top
              Sites" table.  Normally, you would only specify your own
              web server or other local machines to be hidden, as they
              are usually the highest hitters of your web site, especially
              if you have their browsers' home page pointing to it.
              Command line argument: -s

HideAllSites  This allows hiding all individual sites from the display,
              which can be useful when a lot of groupings are being
              used (since grouped records cannot be hidden).  It is
              particularly useful in conjunction with the GroupDomain
              feature, however can be useful in other situations as well.
              Value can be either 'yes' or 'no', with 'no' the default.
              Command line argument: -X

HideURL       This allows you to hide URLs from the "Top URLs" table.
              Normally, this is used to hide items such as graphic files,
              audio files or other 'non-html' files that are transferred
              to the visiting user.
              Command line argument: -u

HideUser      This allows you to hide Usernames from the "Top Usernames"
              table.  Usernames are only available if you use http based
              authentication on your web server.

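A short sketch of typical Hide* entries, assuming a site whose own
host is www.example.com (a placeholder name).  Each keyword may be
repeated as often as needed:

    # Hide our own hosts from the "Top Sites" table
    HideSite      *example.com
    HideSite      localhost
    # Hide self-referrals
    HideReferrer  example.com/
    # Hide graphics from the "Top URLs" table
    HideURL       *.gif
    HideURL       *.jpg
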

Group Object Keywords
---------------------

The Group* keywords allow object grouping based on Site, URL, Referrer,
User Agent and Usernames.  Combined with the Hide* keywords, you can
customize exactly what will be displayed in the 'Top' tables.  For example,
to only display totals for a particular directory, use a GroupURL and
HideURL with the same value (ie: '/help/*').  Group processing is only
done after the individual record has been fully processed, so name mangling
and site total updates have already been performed.  Because of this, groups
are not counted in the main site total (as that would cause duplication).
Groups can be displayed in bold and shaded as well.  Grouped records are
not, by default, hidden from the report.  This allows you to display a
grouped total, while still being able to see the individual records, even
if they are part of the group.  If you want to hide the detail records,
follow the Group* directive with a Hide* one using the same value.  There
are no command line switches for these keywords.  The Group* keywords also
accept an optional label to be displayed instead of the actual value used.
This label should be separated from the value by at least one whitespace
character, such as a space or tab character.  If the match string contains
whitespace (spaces or tabs),  the string should be quoted, using either
single or double quotes.  See the sample configuration file for examples.

GroupReferrer Allows grouping Referrers.  Can be handy for some of the
              major search engines that have multiple host names a
              referral could come from.

GroupURL      This keyword allows grouping URLs. Useful for grouping
              complete directory trees.

GroupSite     This keyword allows grouping Sites.  Most used for
              grouping top level domains and unresolved IP addresses
              for local dial-ups, etc...

GroupAgent    Groups User Agents.  A handy example of how you could use
              this one is to use "Mozilla" and "MSIE" as the values for
              GroupAgent and HideAgent keywords.  Make sure you put the
              "MSIE" one first.

GroupDomains  Allows automatic grouping of domains.  The numeric value
              represents the level of grouping, and can be thought of
              as 'the number of dots' to display.  A 1 will display
              second level domains only (xxx.xxx), a 2 will display
              third level domains (xxx.xxx.xxx) etc...  The default
              value of 0 disables any domain grouping.
              Command line argument: -g

GroupUser     Allows grouping of usernames.  Combined with a group
              name, this can be handy for displaying statistics on
              a particular group of users without displaying their
              real usernames.

GroupShading  Allows shading of table rows for groups.  Value can be
              'yes' or 'no', with the default being 'yes'.

GroupHighlight Allows bolding of table rows for groups.  Value can be
               'yes' or 'no', with the default being 'yes'.

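The Group/Hide pairing described above might look like this in a
configuration file.  The labels and the '/help/*' path are examples
only; note that the "MSIE" entries come before the "Mozilla" ones, as
recommended under GroupAgent:

    # Show one total for the directory and hide its detail records
    GroupURL      /help/*     Help pages
    HideURL       /help/*

    # Group the two agent families; the MSIE entries go first
    GroupAgent    MSIE        Internet Explorer
    HideAgent     MSIE
    GroupAgent    Mozilla     Mozilla compatible
    HideAgent     Mozilla

    # Also group sites by second level domain (xxx.xxx)
    GroupDomains  1
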

Ignore/Include Object Keywords
------------------------------

These keywords allow you to completely ignore log records when generating
statistics, or to force their inclusion regardless of ignore criteria.
Records can be ignored or included based on site, URL, user agent, referrer
and username.  Be aware that by choosing to ignore records, the accuracy of
the generated statistics becomes skewed, making it impossible to produce
an accurate representation of load on the web server.  These keywords
behave identically to the Hide* keywords above, where the value can have
a leading or trailing wildcard '*'.  These keywords, like the Hide* ones,
have an absolute limit of 80 characters for their values.  These keywords
do not have any command line switch counterparts, so they may only be
specified in a configuration file.  It should also be pointed out that
using the Ignore/Include combination to selectively exclude an entire
site while including a particular 'chunk' is _extremely_ inefficient,
and should be avoided.  Try grep'ing the records into a separate file
and processing it instead.

IgnoreSite    This allows specified sites to be completely ignored from
              the generated statistics.

IgnoreURL     This allows specified URLs to be completely ignored from
              the generated statistics.  One use for this keyword would
              be to ignore all hits to a 'temporary' directory where
              development work is being done, but is not accessible to
              the outside world.

IgnoreReferrer This allows records to be ignored based on the referrer
               field.

IgnoreAgent   This allows specified User Agent records to be completely
              ignored from the statistics.  Maybe useful if you really
              don't want to see all those hits from MSIE :)

IgnoreUser    This allows specified username records to be completely
              ignored from the statistics.  Usernames can only be used
              if you use http authentication on your server.

IncludeSite   Force the record to be processed based on hostname.  This
              takes precedence over the Ignore* keywords.

IncludeURL    Force the record to be processed based on URL.  This takes
              precedence over the Ignore* keywords.

IncludeReferrer Force the record to be processed based on referrer.
                This takes precedence over the Ignore* keywords.

IncludeAgent  Force the record to be processed based on user agent.
              This takes precedence over the Ignore* keywords.

IncludeUser   Force the record to be processed based on username.
              Usernames are only available if you use http based
              authentication on your server.  This takes precedence over
              the Ignore* keywords.

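A small, hedged example of the precedence rule: the fragment below
ignores a development directory but forces one subtree of it to be
processed anyway.  The paths and host name are invented for the
example:

    # Drop all records for the development area...
    IgnoreURL     /temp/*
    # ...except this part, since Include* wins over Ignore*
    IncludeURL    /temp/public*
    # Drop records from one particular host
    IgnoreSite    crawler.example.net

Keep in mind the efficiency warning above; this style is best kept to
a handful of entries.
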

Dump Object Keywords
--------------------

The Dump* Keywords allow text files to be generated that can then be used
for import into most database, spreadsheet and other external programs.
The file is a standard tab delimited text file, meaning that each column
is separated by a tab (0x09) character.  A header record may be included
if required, using the 'DumpHeader' keyword.  Since these files contain
all records that have been processed, including normally hidden records,
an alternate location for the files can be specified using the 'DumpPath'
keyword, otherwise they will be located in the default output directory.

DumpPath      Specifies an alternate location for the dump files.  The
              default output location will be used otherwise.  The value
              is the path portion to use, and normally should be an
              absolute path (ie: has a leading '/' character), however
              relative path names can be used as well, and will be
              relative to the output directory location.

DumpExtension Allows the dump filename extensions to be specified. The
              default extension is "tab", but it may be changed with
              this option.

DumpHeader    Allows a header record to be written as the first record
              of the file.  Value can be either 'yes' or 'no',  with
              the default being 'no'.

DumpSites     Dump tab delimited sites file.  Value can be either 'yes'
              or 'no', with the default being 'no'.   The filename used
              is site_YYYYMM.tab (YYYY=year, MM=month).

DumpURLs      Dump tab delimited url file.  Value can be either 'yes' or
              'no', with the default being 'no'.  The filename used is
              url_YYYYMM.tab (YYYY=year, MM=month).

DumpReferrers Dump tab delimited referrer file.  Value can be either
              'yes' or 'no', with the default being 'no'.  Filename
              used is ref_YYYYMM.tab (YYYY=year, MM=month).  Referrer
              information is only available if present in the log
              file (ie: combined web server log).

DumpAgents    Dump tab delimited user agent file.  Value can be either
              'yes' or 'no', with the default being 'no'.  Filename
              used is agent_YYYYMM.tab (YYYY=year, MM=month).  User
              agent information is only available if present in the
              log file (ie: combined web server log).

DumpUsers     Dump tab delimited username file.  Value can be either
              'yes' or 'no', with the default being 'no'.  Filename
              used is user_YYYYMM.tab (YYYY=year, MM=month).  The
              username data is only available if processing a wu-ftpd
              xferlog or http authentication is used on the web server
              and that information is present in the log.

DumpSearchStr Dump tab delimited search string file.  Value can be
              either 'yes' or 'no', with the default being 'no'.
              Filename used is search_YYYYMM.tab (YYYY=year, MM=month).
              The search string data is only available if referrer
              information is present in the log being processed and
              recognized search engines were found and processed.

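As a sketch, the following fragment would request site and URL dump
files with a header record, written to a separate directory.  The
directory name is an assumption made for the example:

    # Keep dump files out of the public report directory
    DumpPath       /var/lib/webalizer/dumps
    DumpHeader     yes
    DumpExtension  tab
    DumpSites      yes
    DumpURLs       yes

Going by the naming scheme above, a run covering January 2013 would be
expected to produce site_201301.tab and url_201301.tab in that
directory.
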
1401
1402
1403HTML Generation Keywords
1404------------------------
1405
1406These keywords allow you to customize the HTML code that The Webalizer
1407produces, such as adding a corporate logo or links to other web pages.
1408You can specify as many of these keywords as you like, and they will be
1409used in the order that they are found in the file.  Values cannot exceed
141080 characters in length, so you may have to break long lines up into two
1411or more lines.  There are no command line counterparts to these keywords.
1412
1413HTMLExtension  Allows generated pages to use something other than the
1414               default 'html' extension for the filenames.  Do not
1415               include the leading period ('.') when you specify the
1416               extension.
1417               Command line argument: -x
1418
1419HTMLPre        Allows code to be inserted at the very beginning of the
1420               HTML files.  Defaults to the standard HTML 3.2 DOCTYPE
1421               record.  Be careful not to include any HTML here, as it
1422               is inserted _before_ the <HTML> tag in the file.  Use it
1423               for server-side scripting capabilities, such as php3, to
1424               insert scripting files and other directives.
1425
1426HTMLHead       Allows you to insert HTML code between the <HEAD></HEAD>
1427               block.  There is no default.  Useful for adding scripts
1428               to the HTML page, such as Javascript or php3, or even
1429               just for adding a few META tags to the document.
1430
HTMLBody       This keyword defines HTML code to be placed immediately
               after the <HEAD> section of the report, just before the
               title and "summary period/generated on" lines.  If used,
               the first HTMLBody line MUST include a <BODY> tag.  Put
               whatever else you want in subsequent lines, but keep in
               mind the placement of this code in relation to the title
               and other aspects of the web page.  Some typical uses
               are to change the page colors and possibly add a corporate
               logo (graphic) in the top right.  If not specified, a
               default <BODY> tag is used that defines page color, text
               color and link colors (see "sample.conf" file for example).
1442
HTMLPost       This keyword defines HTML code that is placed after the
               title and "summary period/generated on" lines, just before
               the initial horizontal rule <HR> tag.  Normally this keyword
               isn't needed, but is provided in case you included a large
               graphic or some other weird formatting tag in the HTMLBody
               section that needs to be cleaned up or terminated before the
               main report section.
1450
1451HTMLTail       This keyword defines HTML code that is placed at the bottom
1452               right side of the report.  It is inserted in a <TABLE> section
1453               between table data <TD>..</TD> tags, and is top and right
1454               aligned within the table.  Normally this keyword is used to
1455               provide a link back to your home page or insert a small
1456               graphic at the bottom right of the page.
1457
1458HTMLEnd        This allows insertion of closing code, at the very end of
1459               the page.  The default is to put the closing </BODY> and
1460               </HTML> tags.  If specified, you _must_ specify these tags
1461               yourself.
1462
1463LinkReferrer   This specifies if the referrers listed in the top referrer
1464               table should be displayed as plain text, or as a link to the
1465               referrer.  Values can be either 'yes' or 'no', with 'no'
1466               being the default.
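
As a rough illustration of where each HTML keyword lands in a generated
page, the Python sketch below assembles a skeleton in the order the
keywords are described above.  It is not The Webalizer's actual output
code: the function name, the DEFAULTS table and the markup placed between
the configurable pieces are all assumptions made for the example.

```python
# Sketch only: the markup between the configurable pieces is assumed.
DEFAULTS = {
    # Standard HTML 3.2 DOCTYPE (HTMLPre is documented to default to one).
    "HTMLPre": ['<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">'],
    "HTMLHead": [],                       # no default
    "HTMLBody": ["<BODY>"],               # the real default also sets colors
    "HTMLPost": [],
    "HTMLTail": [],
    "HTMLEnd": ["</BODY>", "</HTML>"],
}


def assemble_page(conf, title, report):
    """Return page text with each keyword's lines in its documented place."""
    def lines(keyword):
        # Keyword lines are used in the order they appear in the file.
        return conf.get(keyword) or DEFAULTS[keyword]

    parts = []
    parts += lines("HTMLPre")             # before the <HTML> tag
    parts += ["<HTML>", "<HEAD>", "<TITLE>%s</TITLE>" % title]
    parts += lines("HTMLHead")            # inside <HEAD></HEAD>
    parts += ["</HEAD>"]
    parts += lines("HTMLBody")            # first line must hold <BODY>
    parts += ["<H2>%s</H2>" % title]      # title / summary lines
    parts += lines("HTMLPost")            # just before the first <HR>
    parts += ["<HR>", report]
    parts += ["<TABLE><TR><TD ALIGN=right VALIGN=top>"]
    parts += lines("HTMLTail")            # bottom right table cell
    parts += ["</TD></TR></TABLE>"]
    parts += lines("HTMLEnd")             # must close BODY and HTML
    return "\n".join(parts)
```

Calling assemble_page({"HTMLBody": ['<BODY BGCOLOR="#E8E8E8">']}, "Usage",
"...") shows why a custom HTMLBody has to supply its own <BODY> tag: the
default one is no longer emitted.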
1467
1468
1469Graph Color Commands
1470--------------------
1471
These keywords allow altering the colors used in the various graphs
produced by The Webalizer.  The value is specified as a standard HTML
RGB hexadecimal color string, without the leading '#' character.  The
value is case insensitive.  If not specified, the default color shown
will be used.
1477
1478ColorHit      Color used for 'Hits'.   Default is '00805C' (green)
1479
1480ColorFile     Color used for 'Files'.  Default is '0040FF' (blue)
1481
1482ColorSite     Color used for 'Sites'.  Default is 'FF8000' (orange)
1483
1484ColorKbyte    Color used for 'KBytes'. Default is 'FF0000' (red)
1485
1486ColorPage     Color used for 'Pages'.  Default is '00E0FF' (cyan)
1487
1488ColorVisit    Color used for 'Visits'. Default is 'FFFF00' (yellow)
1489
1490ColorMisc     Color used for miscellaneous titles in various 'Top'
1491              tables (not graphs).     Default is '00E0FF' (cyan)
1492
1493PieColor1     Pie Chart color #1.      Default is '800080' (purple)
1494
1495PieColor2     Pie Chart color #2.      Default is '80FFC0' (lt. green)
1496
1497PieColor3     Pie Chart color #3.      Default is 'FF00FF' (lt. purple)
1498
1499PieColor4     Pie Chart color #4.      Default is 'FFC080' (tan)
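
To make the expected value format concrete, here is a small Python sketch
that checks a color string and splits it into red, green and blue
components.  The function name is hypothetical, and this is only one way
such a value could be validated, not how The Webalizer itself parses it.

```python
import re


def parse_color(value):
    """Return (red, green, blue) for a 6-digit hex string, else None.

    The string must not carry a leading '#'; case does not matter.
    """
    if not re.fullmatch(r"[0-9A-Fa-f]{6}", value):
        return None
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))
```

For example, the ColorHit default '00805C' splits into red 0, green 128
and blue 92, while '#FF0000' is rejected because of the leading '#'.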
1500
1501
1502--------------------------------------------------------------------------
1503
1504
1505Notes on Web Log Files
1506----------------------
1507
1508The Webalizer supports CLF log formats, which should work for just
1509about everyone.  If you want User Agent or Referrer information, you
1510need to make sure your web server supplies this information in its
1511log file, and in a format that the Webalizer can understand.  While
1512The Webalizer will try to handle many of the subtle variations in
1513log formats, some will not work at all.   Most web servers output
1514CLF format logs by default.  For Apache, in order to produce the
1515proper log format, add the following to the httpd.conf file:
1516
1517LogFormat "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-agent}i\""
1518
This instructs the Apache web server to produce a 'combined' log
that includes the referrer and user agent information on the end of
each record, enclosed in quotes (this is the standard recommended
by both Apache and NCSA).  Netscape and other web servers have
similar capabilities to alter their log formats.  (Note: the above
works for Apache servers up to V1.2.  V1.3 and higher have additional
ways to specify log formats; refer to the included documentation.)
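
For readers who want to check what their server is writing, the Python
sketch below splits one record of the format shown above into named
fields.  The regular expression and field names are assumptions made for
this example (they are not taken from The Webalizer's parser), and the
sketch does not handle escaped quotes inside fields.

```python
import re

# CLF fields, optionally followed by the quoted referrer and user agent.
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?$'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = COMBINED.match(line.strip())
    return match.groupdict() if match else None
```

A plain CLF record still parses, but its 'referrer' and 'agent' entries
are None, which mirrors the point above: that information is only
available when the server writes it to the log.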
1526
1527Notes on FTP Log Files
1528----------------------
1529
1530The Webalizer supports ftp logs produced by wu-ftpd, proftpd and others,
1531as a standard 'xferlog'.  To process an ftp log, you must either use the
1532-Ff command line option or have "LogType ftp" in your configuration file.
1533It is recommended that you create a separate configuration file for ftp
1534analysis, since the values used for your web server will most likely not
1535be suited for ftp log analysis (ie: page types, hostname, etc.. should
1536be different).
1537
1538Because of the difference in web and ftp logs, there are a few limitations:
1539
o Because there is no concept of a 'response code' in the ftp world,
  response codes are restricted to either 200 (OK) or 206 (Partial
  Content), based on the completion status found in xferlog (for wu-ftpd,
  'i'=incomplete and will generate a 206, 'c'=complete and will generate
  a 200).  If your ftp server doesn't supply the completion status, all
  requests will be assigned a response code of 200.  This allows the usage
  graph to display all transfer requests (hits), and how many of those
  completed successfully (files - ie: 200 response codes).
1548
1549o Page totals won't accurately reflect reality, since there isn't really
1550  the concept of a 'page' in regards to ftp services.  I have found that
1551  setting the PageType value to "README", "FIRST", etc... seems to work
1552  fairly well however,  and will give a pretty good indication of how
1553  many 'non-binary' files were requested.  Of course, the content of your
1554  ftp site will be different, so your results may vary.
1555
1556o Visit totals also won't accurately reflect reality, since visits are
1557  triggered on PageType requests (see above).  What you usually wind up
1558  with is visits=sites in most cases.
1559
1560o Entry/Exit pages will not be calculated for ftp logs.
1561
1562o For obvious reasons, referrers and user agents are not supported.
1563
1564o You _cannot_ analyze both web and ftp logs at the same time.. they must
1565  be done separately in different runs.
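
The response code rule in the first item above can be sketched in a few
lines of Python.  The function names are hypothetical and the status
values are passed in directly rather than read from a real xferlog.

```python
def ftp_response_code(completion_status=None):
    """Map an xferlog completion status to a pseudo response code."""
    # 'i' = incomplete transfer -> 206 (Partial Content).
    # 'c' = complete, or no status supplied by the server -> 200 (OK).
    return 206 if completion_status == "i" else 200


def hits_and_files(statuses):
    """Hits are all transfers; files are the ones that map to 200."""
    codes = [ftp_response_code(status) for status in statuses]
    return len(codes), codes.count(200)
```

Four transfers with statuses 'c', 'i', 'c' and a missing status would be
reported as 4 hits and 3 files.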
1566
1567
1568Notes on Referrers
1569------------------
1570
Referrers are weird critters... They take many shapes and forms, which makes
them much harder to analyze than a typical URL, which at least has some
standardization.  What is contained in the referrer field of your log
files varies depending on many factors, such as what site did the referral,
what type of system it comes from and how the actual referral was generated.
Why is this?  Well, because a user can get to your site in many ways... They
may have your site bookmarked in their browser, they may simply type your
site's URL into their browser, they could have clicked on a link on some
remote web page or they may have found your site from one of the many search
engines and site indexes found on the web.  The Webalizer attempts to deal
with all this variation in an intelligent way by doing certain things to
the referrer string which make it easier to analyze.  Of course, if your
web server doesn't provide referrer information, you probably don't really
care and are asking yourself why you are reading this section...
1585
Most referrers will take the form of "http://somesite.com/somepage.html",
which is what you will get if the user clicks on a link somewhere on the
web in order to get to your site.  Some will be a variation of this, and
look something like "file:/some/such/sillyname", which is a reference from
an HTML document on the user's local machine.  Several variations of this
can be used, depending on what type of system the user has, if he/she is on
a local network, the type of network, etc...  To complicate things even
more, dynamic HTML documents and HTML documents that are generated by
CGI scripts or external programs produce lots of extra information which
is tacked on to the end of the referrer string in an almost infinite number
of ways.  If the user just typed your URL into their browser or clicked on
a bookmark, there won't be any information in the referrer field, and it
will take the form "-".
1599
1600In order to handle all these variations, The Webalizer parses the referrer
1601field in a certain way.  First, if the referrer string begins with "http",
1602it assumes it is a normal referral and converts the "http://" and following
1603hostname to lowercase in order to simplify hiding if desired.  For example,
1604the referrer "HTTP://WWW.MyHost.Com/This/Is/A/HTML/Document.html" will become
1605"http://www.myhost.com/This/Is/A/HTML/Document.html".  Notice that only the
1606"http://" and hostname are converted to lower case... The rest of the
1607referrer field is left alone.  This follows standard convention, as the
1608actual method (HTTP) and hostname are always case insensitive, while the
1609document name portion is case sensitive.
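
The lowercasing rule can be sketched in Python as follows.  This is an
illustration of the behavior described above, not the program's own code,
and the function name is hypothetical.

```python
def lowercase_scheme_and_host(referrer):
    """Lowercase the "http://" prefix and hostname; keep the rest as-is."""
    if referrer[:4].lower() != "http":
        return referrer                   # e.g. "file:/..." or "-"
    scheme, sep, rest = referrer.partition("://")
    if not sep:
        return referrer                   # no "://", leave untouched
    host, slash, path = rest.partition("/")
    return scheme.lower() + sep + host.lower() + slash + path
```

Applied to "HTTP://WWW.MyHost.Com/This/Is/A/HTML/Document.html", it
returns "http://www.myhost.com/This/Is/A/HTML/Document.html".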
1610
Referrers that came from search engines, dynamic HTML documents, CGI
scripts and other external programs usually tack on additional information
that was used to create the page.  A common example of this can be found
in referrals that come from search engines and site indexes common on the
web.  Sometimes, these referrer URLs can be several hundred characters
long and include all the information that the user typed in to search for
your site.  The Webalizer deals with this type of referrer by stripping
off all the query information, which starts with a question mark '?'.
The referrer "http://search.yahoo.com/search?p=usa%26global%26link" will
be converted to just "http://search.yahoo.com/search".
1621
1622When a user comes to your site by using one of their bookmarks or by
1623typing in your URL directly into their browser, the referrer field is
1624blank, and looks like "-".  Most sites will get more of these referrals
1625than any other type.  The Webalizer converts this type of referral into
1626the string "- (Direct Request)".  This is done in order to make it easier
1627to hide via a command line option or configuration file option.  This is
1628because the character "-" is a valid character elsewhere in a referrer
1629field, and if not turned into something unique, could not be hidden without
1630possibly hiding other referrers that shouldn't be.
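
The query stripping and direct request rules described above amount to
very little code.  The Python sketch below shows one way to express them;
the function name is hypothetical and the sketch is not taken from the
program's source.

```python
def simplify_referrer(referrer):
    """Apply the direct-request and query-stripping rules to a referrer."""
    if referrer == "-":
        # A blank referrer gets a unique, easily hidden name.
        return "- (Direct Request)"
    # Drop everything from the first '?' onward.
    return referrer.split("?", 1)[0]
```

Note that a referrer such as "http://my-site.com/page.html" passes through
unchanged even though it contains a "-", which is exactly why the blank
referrer is renamed to something unique.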
1631
1632
1633Notes on Character Escaping
1634---------------------------
1635
1636The HTTP protocol defines certain ways that URLs can look and behave.  To
1637some extent, referrer fields follow most of the same conventions.  Character
1638escaping is a technique by which non-printable or other non-ASCII (and even
1639some ASCII) characters can be used in a URL.  This is done by placing the
1640Hexadecimal value of the character in the URL, preceded by a percent sign '%'.
1641Since Hex values are made up of ASCII characters, any character can be
1642escaped to ensure only printable ASCII characters are present in the URL.
1643Some systems take this concept to the extreme and escape all sorts of stuff,
even characters that don't need to be escaped.  To deal with this, The
Webalizer will un-escape URLs and referrers before they are processed.  For
example, the URL "/www.webalizer.org/%7Efoo/bar.html" is the same URL as
"/www.webalizer.org/~foo/bar.html", a very common form of a URL to access
users' web pages.  If the URLs were not un-escaped, they would be treated as
two separate documents, even though they are really one and the same.
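
The effect of un-escaping on the totals can be seen with a short Python
sketch.  It uses the standard library's unquote() as a stand-in for the
program's own un-escape routine (an assumption: unquote() decodes the
result as UTF-8 text, which the program need not do), and the function
name is hypothetical.

```python
from collections import Counter
from urllib.parse import unquote


def count_urls(urls):
    """Count requests per URL after un-escaping %XX sequences."""
    return Counter(unquote(url) for url in urls)
```

Given both forms of the URL from the example above, count_urls() reports
a single document with two requests instead of two documents with one
request each.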
1650
1651
1652Search String Analysis
1653----------------------
1654
1655  The Webalizer will do a minimal analysis on referrer strings that
1656it finds, looking for well known search string patterns.  Most of
1657the major search engines are supported, such as Yahoo!, Altavista,
1658Lycos, etc...  Unfortunately, search engines are always changing
1659their internal/CGI query formats, new search engines are coming on
1660line every day, and the ability to detect _all_ search strings is
1661nearly impossible.  However, it should be accurate enough to give
1662a good indication of what users were searching for when they stumbled
1663across your site.  Note: as of version 1.31, search engines can now
1664be specified within a configuration file.  See the sample.conf file
1665for examples of how to specify additional search engines.
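
As an illustration of the idea, the following Python sketch pulls the
search string out of a referrer by matching the host against a short list
of engines and reading the query variable each one uses.  The ENGINES
table, its contents and the function name are assumptions made for this
example; consult sample.conf for the entries the program really ships.

```python
from urllib.parse import parse_qs, urlsplit

# Illustrative (host pattern, query variable) pairs; not the real list.
ENGINES = [
    ("yahoo.com", "p"),
    ("altavista.com", "q"),
    ("lycos.com", "query"),
]


def search_string(referrer):
    """Return the search string found in a referrer, or None."""
    parts = urlsplit(referrer)
    host = parts.netloc.lower()
    for pattern, variable in ENGINES:
        if pattern in host:
            values = parse_qs(parts.query).get(variable)
            if values:
                return values[0]
    return None
```

A referrer from an unlisted site, or one without the expected query
variable, simply yields None and is not counted as a search.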
1666
1667
1668
1669Notes on Visits/Entry/Exit Figures
1670----------------------------------
1671
The majority of data analyzed and reported on by The Webalizer is
as accurate and correct as possible based on the input log file.
However, due to the limitations of the HTTP protocol, the use of
firewalls, proxy servers, multi-user systems, the rotation of your
log files, and a myriad of other conditions, some of these numbers
cannot be calculated with absolute accuracy.  In particular,
Visits, Entry Pages and Exit Pages are subject to random errors
due to the above and other conditions.  The reason for this is
twofold: 1) Log files are finite in size and time interval, and
2) There is no way to distinguish multiple individual users apart
given only an IP address.  Because log files are finite, they have
a beginning and ending, which can be represented as a fixed time
period.  There is no way of knowing what happened previous to this
time period, nor is it possible to predict future events based on
it.  Also, because it is impossible to distinguish individual users
apart, multiple users that have the same IP address all appear to
be a single user, and are treated as such.  This is most common where
corporate users sit behind a proxy/firewall to the outside world,
and all requests appear to come from the same location (the address
of the proxy/firewall itself).  Dynamic IP assignment (used with
dial-up Internet accounts) also presents a problem, since the same
user will appear to come from multiple places.
1694
For example, suppose two users visit your server from XYZ company,
which has their network connected to the Internet by a proxy server
'fw.xyz.com'.  All requests from the network look as though they
originated from 'fw.xyz.com', even though they were really initiated
from two separate users on different PCs.  The Webalizer would
see these requests as from the same location, and would record only
1 visit, when in reality, there were two.  Because entry and exit
pages are calculated in conjunction with visits, this situation
would also only record 1 entry and 1 exit page, when in reality,
there should be 2.
1705
As another example, say a single user at XYZ company is surfing
around your website.  They arrive at 11:52pm the last day of
the month, and continue surfing until 12:30am, which is now a
new day (in a new month).  Since a common practice is to rotate
(save then clear) the server logs at the end of the month, you
now have the user's visit logged in two different files (current
and previous months).  Because of this (and the fact that The
Webalizer clears history between months), the first page the
user requests after midnight will be counted as an entry page.
This is unavoidable, since it is the first request seen from that
particular IP address in the new month.
1717
For the most part, the numbers shown for visits, entry and exit
pages are pretty good 'guesses', even though they may not be 100%
accurate.  They do provide a good indication of overall trends,
and shouldn't be far enough off from the real numbers to matter much.
You should probably consider them as the 'minimum' amount possible,
since the actual (real) values should always be equal or greater
in all cases.
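
The proxy example above can be reproduced with a simplified visit counter.
The Python sketch below is not the program's algorithm: it keys visits on
the host alone and assumes a visit ends after 1800 seconds of inactivity,
a value chosen for the example.  The function name is hypothetical.

```python
def count_visits(requests, timeout=1800):
    """Count visits from (host, unix_time) pairs sorted by time.

    A request starts a new visit if the host was never seen before or
    was last seen more than 'timeout' seconds ago (assumed value).
    """
    last_seen = {}
    visits = 0
    for host, when in requests:
        if host not in last_seen or when - last_seen[host] > timeout:
            visits += 1
        last_seen[host] = when
    return visits
```

Three requests from 'fw.xyz.com' within a couple of minutes count as one
visit no matter how many people behind the proxy made them, which is why
the reported figure is best read as a lower bound.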
1725
1726
1727Exporting Webalizer Data
1728------------------------
1729
The Webalizer now has the ability to dump all object tables to tab
delimited ASCII text files, which can then be imported into most
popular database and spreadsheet programs.  The files are not normally
produced, as on some sites they could become quite large, and are only
enabled by the use of the Dump* configuration keywords.  The filename
extensions default to '.tab', but may be changed using the
'DumpExtension' keyword.  Since this data contains all items, even
those normally hidden, it may not be desirable to have them located
in the output directory where they may be visible to normal web users.
For this reason, the 'DumpPath' configuration keyword is available,
and allows the placement of these files somewhere outside the normal
web server document tree.  An optional 'header' record may be written
to these files as well, and is useful when the data is to be imported
into a spreadsheet (databases will not normally need the header).  If
enabled, the header is simply the column names as the first record of
the file, tab separated.
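
Because the dump files are plain tab delimited text, they are easy to
load from a script as well.  The Python sketch below reads one, with or
without the optional header record.  The function name is hypothetical,
and the column names used in the usage note are made up for the example;
check a real dump file for the names the program writes.

```python
import csv


def read_dump(stream, has_header=True):
    """Read a tab delimited dump from an open text stream.

    With a header record, return a list of dicts keyed by column name;
    without one, return a list of plain field lists.
    """
    rows = list(csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE))
    if has_header and rows:
        header, rows = rows[0], rows[1:]
        return [dict(zip(header, row)) for row in rows]
    return rows
```

With a header of "Hits" and "URL" (hypothetical names), a data line such
as "10<TAB>/index.html" becomes {"Hits": "10", "URL": "/index.html"}.
All values are returned as strings; convert counts with int() as needed.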
1746
1747
1748Log files and The Webalizer
1749---------------------------
1750
1751Most sites will choose to have The Webalizer run from cron at specified
1752intervals.  Care should be taken to ensure that data is not lost as a
1753result of log file rotations.  A suggested practice is to rotate your
1754web server logs at the end of each month as close to midnight as possible,
1755then have The Webalizer process the 'end of month' log file before running
1756statistics on the new, current log.  On our systems, a shell script called
1757'rotate_logs' is run at midnight, the end of each month.  This script file
1758looks like:
1759
------------------------- file: rotate_logs ------------------------------
#!/bin/sh

# halt the server
kill `cat /var/lib/httpd/logs/httpd.pid`

# define backup names (one timestamp, shared by both files)
STAMP=`date +%y%m%d-%H%M%S`
OLD_ACCESS_LOG=/var/lib/httpd/logs/old/access_log.$STAMP
OLD_ERROR_LOG=/var/lib/httpd/logs/old/error_log.$STAMP

# make end of month copy for analyzer
cp /var/lib/httpd/logs/access_log /var/lib/httpd/logs/access_log.backup

# move files to archive directory
mv /var/lib/httpd/logs/access_log "$OLD_ACCESS_LOG"
mv /var/lib/httpd/logs/error_log  "$OLD_ERROR_LOG"

# restart web server
/usr/sbin/httpd

# compress the archived files
/bin/gzip "$OLD_ACCESS_LOG"
/bin/gzip "$OLD_ERROR_LOG"
------------------------- end of file ------------------------------------
1784
This script first stops the web server using a 'kill' command.  Apache
keeps the PID of the server in the file httpd.pid, so we use it as the
argument for the kill.  Next, it defines some names for the backup files,
which are basically the name of the files with the date and time appended
to the end of them.  It then makes a copy of the log file, appended with
'.backup' in the log directory, moves the current log files to an archive
directory (/var/lib/httpd/logs/old) and restarts the server.  This setup
allows the web server to be down for the minimum amount of time needed,
which is important for busy sites.  If you don't want to stop the server,
you can remove the initial 'kill' command, and replace the '/usr/sbin/httpd'
line with a "kill -1 `cat /var/lib/httpd/logs/httpd.pid`" command instead.
On most web servers, this will cause a restart of the server and create
the new log files in the process.
1798
At this point, we have made copies of the previous month's logs, the web
server is going about its business as usual, and we have all the time in
the world to do any other additional processing we want.  The last two
lines of the script compress the archived logs using the GNU zip program
(gzip).  Remember, we still have a copy of the log which we can now run
The Webalizer on without having to do any further processing.
1805
1806Next, we define two crontab entries.  The first runs the above 'rotate_logs'
1807script at midnight at the end of the month.  The second runs The Webalizer
1808on the '.backup' log file created above at 5 minutes after midnight.  This
1809gives other end of month processing jobs a chance to run so we don't bog
1810the system down too much.  If you have lots of end of month stuff going on,
1811you can change the timing to suit your needs.  The crontab entries look
1812something like:
1813
------------------------- crontab entries --------------------------------
# Rotate web server logs and run monthly analysis
0 0 1 * *       /usr/local/adm/rotate_logs
5 0 1 * *       /usr/bin/webalizer -Q /var/lib/httpd/logs/access_log.backup
------------------------- end of crontab ---------------------------------
1819
1820As you can see, the log rotations occur at midnight, and the analysis
1821is done at 5 minutes after.  Once you verify that The Webalizer ran
1822successfully, the access_log.backup file can be deleted as it isn't
1823needed any more.  If you need to re-run the analysis, you still have
1824the compressed archive copy that the shell script created.  In order
1825for the above analysis to work properly, you should have already
1826created an /etc/webalizer.conf configuration file suitable for your
1827site, or otherwise specify configuration options or a configuration
1828file on the crontab command line above.
1829
If you want The Webalizer to be run more often than once a month, you
can specify additional crontab entries to do this as well.  Care should
be taken however to ensure that The Webalizer is not running when the
end of month processing above occurs, or unpredictable results may
happen (such as an inability to rotate the logs due to a file lock).
The easiest way is to run it on the half hour with a crontab entry like:

30 * * * *      /usr/bin/webalizer
1838
1839
1840Reverse DNS Lookups
1841-------------------
1842
The Webalizer fully supports both IPv4 and IPv6 DNS lookups, and
maintains a cache of those lookups to reduce processing the same
addresses in subsequent runs.  The cache file can be created at
run-time, or may be created before running The Webalizer using either
the stand alone 'webazolver' program, or The Webalizer (DNS) Cache
file Manager program 'wcmgr'.  In order to perform reverse lookups,
a DNS Cache file must be specified, either on the command line or in
a configuration file.  In order to create/update the cache file at
run-time, the number of DNS Children must also be specified, and can
be anything between 1 and 100.  This specifies the number of child
processes to be forked, each of which will perform network DNS
queries in order to look up the addresses and update the cache.
Cached entries that are older than a specified TTL (time to live)
will be expired, and if encountered again in a log, will be looked
up at that time in order to 'freshen' them (verify the name is still
the same and update its timestamp).  The default TTL is 7 days, but
may be set to anything between 1 and 100 days.  Using the 'wcmgr'
program, entries may also be marked as 'permanent', in which case
they will persist (with an infinite TTL) in the cache until manually
removed.  See the file DNS.README for additional information.
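
The cache expiry rule can be summarized with a short Python sketch.  The
entry layout (a dict with 'timestamp' and 'permanent' keys) and the
function name are hypothetical; the real cache file format is described
in DNS.README, not here.

```python
SECONDS_PER_DAY = 86400


def needs_lookup(entry, now, ttl_days=7):
    """Decide whether an address must be looked up (again) via DNS.

    entry is None for an address not in the cache, else a dict with a
    'timestamp' (seconds) and a 'permanent' flag (hypothetical layout).
    """
    if entry is None:
        return True                       # never cached
    if entry["permanent"]:
        return False                      # infinite TTL, never expires
    return now - entry["timestamp"] > ttl_days * SECONDS_PER_DAY
```

With the default TTL of 7 days, an entry last refreshed 8 days ago is
looked up again the next time its address appears in a log, while a
permanent entry of any age is used as-is.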
1863
1864
1865Geolocation Lookups
1866-------------------
1867
The Webalizer has the ability to perform geolocation lookups on IP
addresses using either its own internal GeoDB database or optionally
the GeoIP database from MaxMind, Inc. (www.maxmind.com).  If used,
unresolved addresses will be searched for in the database and their
country of origin will be returned if found.  This actually produces
more accurate Country information than DNS lookups, since the DNS
address space has additional gTLDs that do not necessarily map to
a specific country (such as '.net' and '.com').  It is possible to
use both DNS lookups and geolocation lookups at the same time, which
will cause any addresses that could not be resolved using DNS lookups
to then be looked up in the database, greatly reducing the number of
'Unknown/Unresolved' entries in the generated reports.  The native
GeoDB geolocation database provided by The Webalizer fully supports
IPv4 and IPv6 lookups, is updated regularly, and is the preferred
geolocation method for use with The Webalizer.  The most current
version of the database can be obtained from our ftp site.
1884
1885
1886Language Support
1887----------------
1888
1889Version 1.0x of The Webalizer added language support.  This
1890support is only provided at compile time in the form of an
1891include file containing all the strings used by The Webalizer.
1892The source distribution contains all language files that were
1893available at the time, with English being the default as
1894that is the only human language I speak fluently, and me
1895Espanol es muy malo.  Several people have already indicated
1896the desire to do translations into various languages, and as
1897I receive the language files, will make them available via
1898ftp at ftp://ftp.mrunix.net/pub/webalizer/lang.  Unless there
1899happens to be a binary distribution in the language you need,
1900you will need to grab the source distribution and compile the
1901program yourself. See the file INSTALL that comes in the source
1902distribution for information on how to use a language other than
1903English.
1904
1905It should also be noted that the GD graphics library, used to
1906produce the in-line graphics in the output HTML,  doesn't
1907support extended character sets, so if you are translating
1908the language file, you will no doubt encounter this problem.
1909
New: You can now specify the language to use when you are building
     the program from source, using the configure script.  Just add
     --with-language=language_name, where 'language_name' is the
     name of a valid language file in the /lang/ directory.  For
     example, --with-language=french will build using French as
     the default language.  You should consult the INSTALL file
     for additional information on building the program from source.
1917
1918
1919Known Issues
1920------------
1921
1922 o Memory Usage.  The Webalizer makes liberal use of memory for internal
1923    data structures during analysis.  Lack of real physical memory will
1924    noticeably degrade performance by doing lots of swapping between memory
1925    and disk.  One user who had a rather large log file noticed that The
1926    Webalizer took over 7 hours to run with only 16 Meg of memory.  Once
1927    memory was increased, the time was reduced to a few minutes.
1928
1929
 o Performance.  The Hide*, Group*, Ignore*, Include* and IndexAlias
    configuration options can cause a performance decrease if lots of
    them are used.  The reason for this is that every log record must
    be scanned for each item in each list.  For example, if you are
    Hiding 20 objects, Grouping 20 more, and Ignoring 5, each record
    is scanned, at most, 46 times (20+20+5 + an IndexAlias scan).
    On really large log files, this can have a profound impact.  It
    is recommended that you use as few of these configuration
    options as you can, as it will greatly improve performance.
1939
1940
1941Final Notes
1942-----------
1943
1944A lot of time and effort went into making The Webalizer, and to ensure that
1945the results are as accurate as possible.  If you find any abnormalities or
1946inconsistent results, bugs, errors, omissions or anything else that doesn't
1947look right, please let me know so I can investigate the problem or correct
1948the error.  This goes for the minimal documentation as well.  Suggestions
1949for future versions are also welcome and appreciated.
1950

README.FIRST

Upgrade information for the Webalizer Version 2.2x

This release is, for the most part, a drop-in replacement for all
installations currently running 2.01, and all users are encouraged
to upgrade.  See the 'CHANGES' file for a full list of changes
since version 2.01-10.

Note: The history file format has changed in v2.20 in order to keep
more than 12 months.  Existing history files will be automatically
converted to the new format the first time they are read.

Note: This version redefines the '-v' command line switch to mean
'verbose', which will cause the program to display additional
informational and debugging messages at run-time.  This should not
cause any major problems, as previously it would simply cause the
program to display its version information and then exit.

Report bugs to 'brad at mrunix dot net' with "Webalizer" somewhere
in the subject. Please do not send HTML formatted e-mails or e-mail
containing HTML tags as my mail server will reject them.  Thanks!
22