=head1 NAME

ovdb - Overview storage method for INN

=head1 DESCRIPTION

The ovdb overview is a storage method that uses the S<Berkeley DB>
library to store overview data. It requires version 4.4 or later of
the S<Berkeley DB> library (4.7+ is recommended because older versions
suffer from various issues).

The ovdb overview method makes use of the full
transaction/logging/locking functionality of the
S<Berkeley DB> environment. S<Berkeley DB> may be downloaded from
L<http://www.oracle.com/technetwork/database/database-technologies/berkeleydb/overview/index.html>
and is needed to build the ovdb backend.

=head1 UPGRADING

There are several versions of the ovdb storage method:

=over 2

=item *

Version 1, the initial version shipped with S<INN 2.3.0> up to S<INN 2.3.5>.

=item *

Version 2, with improved performance, since S<INN 2.4.0>.

=item *

Version 3, corresponding to version 2 with compression enabled, starting
with S<INN 2.5.0>.

=back

If you have a database created with a previous version of ovdb,
your database will need to be upgraded using B<ovdb_init>. See the
ovdb_init(8) man page for upgrade instructions, as well as the
L<COMPRESSION> section below.

Note that when the S<Berkeley DB> library is updated to a newer version,
the ovdb database also needs to be upgraded.

=head1 INSTALLATION

If the S<Berkeley DB> library is found at configure time, INN will be
built with S<Berkeley DB> support unless the B<--without-bdb> flag is
explicitly passed to configure. By default, configure will search for
S<Berkeley DB> in standard locations; there will be a message in the
configure output indicating the pathname that will be used.

You can override this pathname by passing a path to the B<--with-bdb>
option, for instance B<--with-bdb=/usr/BerkeleyDB.4.4>.
This directory is expected to have subdirectories F<include> and F<lib>
(F<lib32> and F<lib64> are also checked), containing respectively F<db.h>
and the library itself. In case non-standard paths to the S<Berkeley DB>
libraries are used, one or both of the options B<--with-bdb-include>
and B<--with-bdb-lib> can be given to configure with a path.

The ovdb database may take up more disk space for a given spool than the
other overview methods. Plan on needing at least S<1.1 KB> for every
article in your spool (not counting crossposts). So, if you have 5
million articles, you'll need at least S<5.5 GB> of disk space for ovdb.
With compression enabled, this estimate changes to S<0.9 KB> per article,
so you'll need at least S<4.5 GB> of disk space for 5 million articles.
See the L<COMPRESSION> section below. Plus, you'll need additional space
for transaction logs: at least S<100 MB>. By default, the transaction
logs go in the same directory as the database. To improve performance,
they can be placed on a different disk S<-- see> the L<DB_CONFIG>
section.

=head1 CONFIGURATION

To enable the ovdb overview method, set the I<ovmethod> parameter in
F<inn.conf> to C<ovdb>. The ovdb database is stored in the directory
specified by the I<pathoverview> parameter in F<inn.conf>. This is
the C<DB_HOME> directory. To start out, this directory should be empty
(other than an optional F<DB_CONFIG> file; see L<DB_CONFIG> for details),
and B<innd> (or B<makehistory>) will create the files as necessary in
that directory. Also, make sure the directory is owned by the news user.

Other parameters for configuring ovdb are in the F<ovdb.conf>
configuration file. The following parameters can be set in that file:

=over 4

=item I<compress>

If INN was compiled with zlib, and this I<compress> parameter is true,
ovdb will compress overview records that are longer than 600 bytes.
See the L<COMPRESSION> section below.

=item I<cachesize>

Size of the memory pool cache, in kilobytes. The cache will have a
backing store file in the DB directory which will be at least as big.
In general, the bigger the cache, the better. Use C<ovdb_stat -m>
to see cache hit percentages. To make a change of this parameter take
effect, shut down and restart INN (be sure to kill all of the B<nnrpd>
processes when shutting down). Default is C<8000> (KB), which is adequate
for small to medium-sized servers. Large servers will probably need
at least C<20000> (KB).

=item I<ncache>

Number of regions across which to split the cache. The region size
is equal to I<cachesize> divided by I<ncache>. Default is C<1> for
I<ncache>, that is to say the cache will be allocated contiguously
in memory.

=item I<numdbfiles>

Overview data is split between this many files. Currently, B<innd> will
keep all of the files open, so don't set this too high or B<innd> may run
out of file descriptors. B<nnrpd> only opens one at a time, regardless.
May be set to one, or just a few, but only do that if your OS supports
large (S<< > 2 GB >>) files. Changing this parameter has no effect on an
already-established database. Default is C<32>.

=item I<txn_nosync>

If I<txn_nosync> is set to false, S<Berkeley DB> flushes the log after every
transaction. This minimizes the number of transactions that may be lost
in the event of a crash, but results in significantly degraded
performance. Default is true.

=item I<useshm>

If I<useshm> is set to true, S<Berkeley DB> will use shared memory instead of
mmap for its environment regions (cache, lock, etc.). On some platforms,
this may improve performance. Default is false.

=item I<shmkey>

Sets the shared memory key used by S<Berkeley DB> when I<useshm> is true.
S<Berkeley DB> will create several (usually 5) shared memory segments, using
sequentially numbered keys starting with I<shmkey>. Choose a key that does
not conflict with any existing shared memory segments on your system.
Default is C<6400>.

=item I<pagesize>

Sets the page size for the DB files (in bytes). Must be a power
of 2. Best choices are C<4096> or C<8192>. The default is C<8192>.
Changing this parameter has no effect on an already-established database.

=item I<minkey>

Sets the minimum number of keys per page. See the S<Berkeley DB>
documentation for more information. Default is based on page size
and whether compression is enabled:

    default_minkey = MAX(2, pagesize / 2600) if compress is false
    default_minkey = MAX(2, pagesize / 1500) if compress is true

The lowest allowed I<minkey> is C<2>. Setting I<minkey> higher than
the default is not recommended, as it will cause the databases to have
a lot of overflow pages. Changing this parameter has no effect on an
already-established database.

=item I<maxlocks>

Sets the S<Berkeley DB> I<lk_max> parameter, which is the maximum number of
locks that can exist in the database at the same time. Default is C<4000>.

=item I<nocompact>

The I<nocompact> parameter affects the behaviour of B<expireover>.
The B<expireover> function in ovdb can do its job in one of two
ways: by simply deleting expired records from the database; or by
re-writing the overview records into a different location leaving out
the expired records. The first method is faster, but it leaves 'holes'
that result in space that cannot immediately be reused. The second
method 'compacts' the records by rewriting them.

If this parameter is set to C<0>, B<expireover> will compact all
newsgroups; if set to C<1>, B<expireover> will not compact any
newsgroups; and if set to a value greater than one, B<expireover>
will only compact groups that have fewer than that number of articles.

Experience has shown that compacting has minimal effect (other than
making B<expireover> take longer) so the default is C<1>. This parameter
will probably be removed in the future.

=item I<readserver>

When the I<readserver> parameter is set to false, each B<nnrpd>
process directly accesses the S<Berkeley DB> environment. The process
of attaching to the database (and detaching when finished) is fairly
expensive, and can result in high loads in situations when there are
lots of reader connections of relatively short duration.

When the I<readserver> parameter is set to true, the B<nnrpd> processes
will access overview via a helper server (B<ovdb_server> S<-- which>
is started by B<ovdb_init>). All ovdb reads will then be funnelled
through a single process with a cleaner interface to the underlying
S<Berkeley DB> database. This will result in cleaner shutdowns for the
database, improving stability and avoiding deadlocks, timing issues and
corrupted databases. That's why you should try to set this parameter to
true if you are experiencing any instability in the ovdb overview method.

Default value is true.

=item I<numrsprocs>

This parameter is only used when I<readserver> is true. It sets the
number of B<ovdb_server> processes. As each B<ovdb_server> can process
only one transaction at a time, running more servers can improve reader
response times. Default is C<5>.

=item I<maxrsconn>

This parameter is only used when I<readserver> is true. It sets a
maximum number of readers that a given B<ovdb_server> process will
serve at one time.
This means the maximum number of readers for all
of the B<ovdb_server> processes is (I<numrsprocs> * I<maxrsconn>).
This does I<not> limit the actual number of readers, since B<nnrpd>
will fall back to opening the database directly if it can't connect to
an B<ovdb_server>. Default is C<0>, which means an unlimited number
of connections is allowed.

=back

=head1 COMPRESSION

The ovdb storage method has the ability to compress overview data
before it is stored into the database. In addition to consuming less disk
space, compression keeps the average size of the database keys smaller.
This in turn increases the average number of keys per page, which can
significantly improve performance and also helps keep the database more
compact. This feature requires that INN be built with zlib. Only records
larger than 600 bytes get compressed, because that is the point at which
compression starts to become significant.

If compression is not enabled (either from the I<compress> option in
F<ovdb.conf> or INN was not built with zlib support), the database
will be backward compatible with older versions of ovdb. However,
if compression is enabled, the database is marked with a newer version
that will prevent older versions of ovdb from opening the database.

You can upgrade an existing database to use compression simply by setting
I<compress> to true in F<ovdb.conf>. Note that existing records in the
database will remain uncompressed; only new records added after enabling
compression will be compressed.

If you disable compression on a database that previously had it enabled,
new records will be stored uncompressed, but the database will still be
incompatible with older versions of ovdb (and will also be incompatible
with this version of ovdb if INN was not built with zlib support).
So to downgrade to a completely uncompressed database, you will have
to rebuild the database using B<makehistory>.

=head1 DB_CONFIG

A file called F<DB_CONFIG> may be placed in the database directory
(I<pathoverview> in F<inn.conf>) to customize where the various database
files and transaction logs are written. By default, all of the files
are written in the C<DB_HOME> directory. One way to improve performance
is to put the transaction logs on a different disk. To do this, put:

    set_lg_dir /path/to/logs

in the F<DB_CONFIG> file. If the pathname you give starts with a C</>, it is
treated as an absolute path; otherwise, it is relative to the C<DB_HOME>
directory. Make sure that any directories you specify exist and have
proper ownership/mode before starting INN, because they won't be created
automatically. Also, don't change the F<DB_CONFIG> file while anything that
uses ovdb is running.

Another thing that you can do with this file is to split the overview
database across multiple disks. In the F<DB_CONFIG> file, you can list
directories that S<Berkeley DB> will search when it goes to open a database.

For example, let's say that you have I<pathoverview> set to
F</mnt/overview> and you have four additional file systems created
on F</mnt/ovX>. You would create a file F</mnt/overview/DB_CONFIG>
containing the following lines:

    set_data_dir /mnt/overview
    set_data_dir /mnt/ov1
    set_data_dir /mnt/ov2
    set_data_dir /mnt/ov3
    set_data_dir /mnt/ov4

Distribute your F<ovNNNNN> files into the four filesystems (say, 8 each).
When called upon to open a database file, the db library will look for it
in each of the specified directories (in order). If said file is not
found, one will be created in the first of those directories.
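
As an illustration of the layout described above, here is a minimal
Python sketch that plans an even, round-robin distribution of the
database files and prints the matching F<DB_CONFIG> lines. It assumes
the default I<numdbfiles> of C<32> and file names of the form
F<ov00000> to F<ov00031> (the five-digit zero padding is an assumption
derived from the F<ovNNNNN> pattern). The helper names C<plan_layout> and
C<db_config_lines> are hypothetical and not part of INN. The script only
prints a plan; it does not move any files.

```python
def plan_layout(num_files, data_dirs):
    """Assign database file names to directories in round-robin order."""
    layout = {d: [] for d in data_dirs}
    for n in range(num_files):
        # Assumed naming scheme: "ov" followed by a five-digit number.
        name = "ov%05d" % n
        layout[data_dirs[n % len(data_dirs)]].append(name)
    return layout


def db_config_lines(db_home, data_dirs):
    """Build set_data_dir lines, with DB_HOME listed first."""
    return ["set_data_dir %s" % d for d in [db_home] + list(data_dirs)]


if __name__ == "__main__":
    dirs = ["/mnt/ov1", "/mnt/ov2", "/mnt/ov3", "/mnt/ov4"]
    for directory, files in plan_layout(32, dirs).items():
        print("%s: %d files (%s .. %s)"
              % (directory, len(files), files[0], files[-1]))
    print("\n".join(db_config_lines("/mnt/overview", dirs)))
```

Listing C<DB_HOME> first matters in this sketch because, as noted above,
a database file that is not found in any listed directory is created in
the first one.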

Whenever you change F<DB_CONFIG> or move database files around, make
sure all news processes that use the database are shut down first
(including B<nnrpd> processes).

The F<DB_CONFIG> functionality is part of S<Berkeley DB> itself,
rather than something provided by ovdb. See the S<Berkeley DB>
documentation for complete details for the version of S<Berkeley DB>
that you're running.

=head1 RUNNING

When starting the news system, B<rc.news> will invoke the B<ovdb_init>
program. See the ovdb_init(8) man page for information about the tasks
it performs. B<ovdb_init> must be run before using the database.

When stopping INN, B<rc.news> kills the B<ovdb_monitor> processes after
the other INN processes have been shut down.

=head1 DIAGNOSTICS

Problems relating to ovdb are logged to F<news.err> with C<OVDB> in
the error message.

INN programs that use overview will fail to start up if the
B<ovdb_monitor> processes aren't running. Be sure to run B<ovdb_init>
before running anything that accesses overview.

Also, INN programs that use overview will fail to start up if the user
running them is not the news user.

If a program accessing the database crashes, or otherwise exits uncleanly,
it might leave a stale lock in the database. This lock could cause other
processes to deadlock on that stale lock. To fix this, shut down all news
processes (using C<kill -9> if necessary) and then restart. B<ovdb_init>
should perform a recovery operation which will remove the locks and repair
damage caused by killing the deadlocked processes.

=head1 FILES

=over 4

=item I<pathetc>/inn.conf

The I<ovmethod> and I<pathoverview> parameters are relevant to ovdb.

=item I<pathetc>/ovdb.conf

Optional configuration file for tuning. See L<CONFIGURATION> above.

=item I<pathoverview>

Directory where the database goes.
S<Berkeley DB> calls it the C<DB_HOME> directory.

=item I<pathoverview>/DB_CONFIG

Optional file to configure the layout of the database files.

=item I<pathrun>/ovdb.sem

A file that gets locked by every process that is accessing the database.
This is used by B<ovdb_init> to determine whether the database is active
or quiescent.

=item I<pathrun>/ovdb_monitor.pid

Contains the process ID of B<ovdb_monitor>.

=back

=head1 TO DO

Implement a way to limit how many databases can be open at once (to reduce
file descriptor usage), perhaps using something similar to the cache code
in the legacy F<ov3.c> file.

=head1 HISTORY

Written by Heath Kehoe <hakehoe@avalon.net> for InterNetNews.

$Id: ovdb.pod 10525 2021-01-20 11:51:15Z iulius $

=head1 SEE ALSO

inn.conf(5), innd(8), makehistory(8), nnrpd(8), ovdb_init(8),
ovdb_monitor(8), ovdb_stat(8).

S<Berkeley DB> documentation: in the F<docs> directory of the S<Berkeley
DB> source distribution, or on the Oracle S<Berkeley DB> web page
(L<http://www.oracle.com/technetwork/database/database-technologies/berkeleydb/overview/index.html>).

=cut