=head1 NAME

BerkeleyDB - Perl extension for Berkeley DB version 2, 3, 4, 5 or 6

=head1 SYNOPSIS

  use BerkeleyDB;

  $env = new BerkeleyDB::Env [OPTIONS] ;

  $db = tie %hash, 'BerkeleyDB::Hash', [OPTIONS] ;
  $db = new BerkeleyDB::Hash [OPTIONS] ;

  $db = tie %hash, 'BerkeleyDB::Btree', [OPTIONS] ;
  $db = new BerkeleyDB::Btree [OPTIONS] ;

  $db = tie @array, 'BerkeleyDB::Recno', [OPTIONS] ;
  $db = new BerkeleyDB::Recno [OPTIONS] ;

  $db = tie @array, 'BerkeleyDB::Queue', [OPTIONS] ;
  $db = new BerkeleyDB::Queue [OPTIONS] ;

  $db = new BerkeleyDB::Heap [OPTIONS] ;

  $db = new BerkeleyDB::Unknown [OPTIONS] ;

  $status = BerkeleyDB::db_remove [OPTIONS]
  $status = BerkeleyDB::db_rename [OPTIONS]
  $status = BerkeleyDB::db_verify [OPTIONS]

  $hash{$key} = $value ;
  $value = $hash{$key} ;
  each %hash ;
  keys %hash ;
  values %hash ;

  $env = $db->Env()
  $status = $db->db_get()
  $status = $db->db_exists() ;
  $status = $db->db_put() ;
  $status = $db->db_del() ;
  $status = $db->db_sync() ;
  $status = $db->db_close() ;
  $status = $db->db_pget()
  $hash_ref = $db->db_stat() ;
  $status = $db->db_key_range();
  $type = $db->type() ;
  $status = $db->status() ;
  $boolean = $db->byteswapped() ;
  $status = $db->truncate($count) ;
  $status = $db->compact($start, $stop, $c_data, $flags, $end);
  $status = $db->get_blob_threshold($t1) ;
  $status = $db->get_blob_dir($dir) ;

  $bool = $env->cds_enabled();
  $bool = $db->cds_enabled();
  $lock = $db->cds_lock();
  $lock->cds_unlock();

  ($flag, $old_offset, $old_length) = $db->partial_set($offset, $length) ;
  ($flag, $old_offset, $old_length) = $db->partial_clear() ;

  $cursor = $db->db_cursor([$flags]) ;
  $newcursor = $cursor->c_dup([$flags]);
  $status = $cursor->c_get() ;
  $status = $cursor->c_put() ;
  $status = $cursor->c_del() ;
  $status = $cursor->c_count() ;
  $status = $cursor->c_pget() ;
  $status = $cursor->status() ;
  $status = $cursor->c_close() ;
  $stream = $cursor->db_stream() ;

  $cursor = $db->db_join() ;
  $status = $cursor->c_get() ;
  $status = $cursor->c_close() ;

  $status = $stream->size($S);
  $status = $stream->read($data, $offset, $size);
  $status = $stream->write($data, $offset);

  $status = $env->txn_checkpoint()
  $hash_ref = $env->txn_stat()
  $status = $env->set_mutexlocks()
  $status = $env->set_flags()
  $status = $env->set_timeout()
  $status = $env->lock_detect()
  $status = $env->lsn_reset()
  $status = $env->get_blob_threshold($t1) ;
  $status = $env->get_blob_dir($dir) ;

  $txn = $env->txn_begin() ;
  $db->Txn($txn);
  $txn->Txn($db1, $db2,...);
  $status = $txn->txn_prepare()
  $status = $txn->txn_commit()
  $status = $txn->txn_abort()
  $status = $txn->txn_id()
  $status = $txn->txn_discard()
  $status = $txn->set_timeout()

  $status = $env->set_lg_dir();
  $status = $env->set_lg_bsize();
  $status = $env->set_lg_max();

  $status = $env->set_data_dir() ;
  $status = $env->set_tmp_dir() ;
  $status = $env->set_verbose() ;
  $db_env_ptr = $env->DB_ENV() ;

  $BerkeleyDB::Error
  $BerkeleyDB::db_version

  # DBM Filters
  $old_filter = $db->filter_store_key  ( sub { ... } ) ;
  $old_filter = $db->filter_store_value( sub { ... } ) ;
  $old_filter = $db->filter_fetch_key  ( sub { ... } ) ;
  $old_filter = $db->filter_fetch_value( sub { ... } ) ;

  # deprecated, but supported
  $txn_mgr = $env->TxnMgr();
  $status = $txn_mgr->txn_checkpoint()
  $hash_ref = $txn_mgr->txn_stat()
  $txn = $txn_mgr->txn_begin() ;

=head1 DESCRIPTION

B<NOTE: This document is still under construction. Expect it to be
incomplete in places.>

This Perl module provides an interface to most of the functionality
available in Berkeley DB versions 2, 3, 4, 5 and 6. In general it is safe
to assume that the interface provided here is identical to the Berkeley DB
interface. The main changes have been to make the Berkeley DB API work
in a Perl way. Note that if you are using Berkeley DB 2.x, the new
features available in Berkeley DB 3.x or later are not available via
this module.

The reader is expected to be familiar with the Berkeley DB
documentation. Where the interface provided here is identical to the
Berkeley DB library and the... TODO

The B<db_appinit>, B<db_cursor>, B<db_open> and B<db_txn> man pages are
particularly relevant.

The interface to Berkeley DB is implemented with a number of Perl
classes.

=head1 The BerkeleyDB::Env Class

The B<BerkeleyDB::Env> class provides an interface to the Berkeley DB
function B<db_appinit> in Berkeley DB 2.x or B<db_env_create> and
B<DBENV-E<gt>open> in Berkeley DB 3.x (or later). Its purpose is to
initialise a number of sub-systems that can then be used in a consistent
way in all the databases you make use of in the environment.

If you don't intend to use transactions, locking or logging, then you
shouldn't need to make use of B<BerkeleyDB::Env>.

Note that an environment consists of a number of files that Berkeley DB
manages behind the scenes for you. When you first use an environment, it
needs to be explicitly created. This is done by including C<DB_CREATE>
with the C<Flags> parameter, described below.
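
As a minimal sketch of creating an environment for the first time, the
fragment below passes C<DB_CREATE> in C<-Flags> and then opens a database
inside the new environment. The temporary home directory and the file name
"fruit.db" are hypothetical and chosen only for illustration; the set of
sub-systems you initialise will depend on your application.

```perl
use strict ;
use warnings ;
use BerkeleyDB ;
use File::Temp qw( tempdir ) ;

# Hypothetical home directory for the environment; it must already exist.
my $home = tempdir( CLEANUP => 1 ) ;

# DB_CREATE creates the environment files on first use;
# DB_INIT_MPOOL initialises the shared memory buffer pool.
my $env = new BerkeleyDB::Env
    -Home  => $home,
    -Flags => DB_CREATE | DB_INIT_MPOOL
  or die "Cannot create environment in $home: $BerkeleyDB::Error\n" ;

# A relative -Filename is resolved against the environment's home directory.
my $db = new BerkeleyDB::Hash
    -Filename => "fruit.db",
    -Flags    => DB_CREATE,
    -Env      => $env
  or die "Cannot open fruit.db: $BerkeleyDB::Error\n" ;

$db->db_put("apple", "red") ;

# Close the database before the environment it belongs to.
undef $db ;
undef $env ;
```

Later runs against the same home directory can keep C<DB_CREATE> in the
flags; it only has an effect when the files do not already exist.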

=head2 Synopsis

  $env = new BerkeleyDB::Env
             [ -Home         => $path, ]
             [ -Server       => $name, ]
             [ -CacheSize    => $number, ]
             [ -Config       => { name => value, name => value }, ]
             [ -ErrFile      => filename, ]
             [ -MsgFile      => filename, ]
             [ -ErrPrefix    => "string", ]
             [ -Flags        => number, ]
             [ -SetFlags     => bitmask, ]
             [ -LockDetect   => number, ]
             [ -TxMax        => number, ]
             [ -LogConfig    => number, ]
             [ -MaxLockers   => number, ]
             [ -MaxLocks     => number, ]
             [ -MaxObjects   => number, ]
             [ -SharedMemKey => number, ]
             [ -Verbose      => boolean, ]
             [ -BlobThreshold=> $number, ]
             [ -BlobDir      => directory, ]
             [ -Encrypt      => { Password => "string",
                                  Flags    => number }, ]

All the parameters to the BerkeleyDB::Env constructor are optional.

=over 5

=item -Home

If present, this parameter should point to an existing directory. Any
files that I<aren't> specified with an absolute path in the sub-systems
that are initialised by the BerkeleyDB::Env class will be assumed to
live in the B<Home> directory.

For example, in the code fragment below the database "fred.db" will be
opened in the directory "/home/databases" because it was specified as a
relative path, but "joe.db" will be opened in "/other" because it was
part of an absolute path.

    $env = new BerkeleyDB::Env
             -Home         => "/home/databases"
    ...

    $db1 = new BerkeleyDB::Hash
             -Filename => "fred.db",
             -Env => $env
    ...

    $db2 = new BerkeleyDB::Hash
             -Filename => "/other/joe.db",
             -Env => $env
    ...

=item -Server

If present, this parameter should be the hostname of a server that is running
the Berkeley DB RPC server. All databases will be accessed via the RPC server.

=item -Encrypt

If present, this parameter will enable encryption of all data before
it is written to the database. This parameter must be given a hash
reference. The format is shown below.

    -Encrypt => { -Password => "abc", Flags => DB_ENCRYPT_AES }

Valid values for the Flags are 0 or C<DB_ENCRYPT_AES>.

This option requires Berkeley DB 4.1 or better.

=item -Cachesize

If present, this parameter sets the size of the environment's shared memory
buffer pool.

=item -TxMax

If present, this parameter sets the number of simultaneous
transactions that are allowed. Default 100. This default is
definitely too low for programs using the MVCC capabilities.

=item -LogConfig

If present, this parameter is used to configure log options.

=item -MaxLockers

If present, this parameter is used to configure the maximum number of
processes doing locking on the database. Default 1000.

=item -MaxLocks

If present, this parameter is used to configure the maximum number of
locks on the database. Default 1000. This is often lower than required.

=item -MaxObjects

If present, this parameter is used to configure the maximum number of
locked objects. Default 1000. This is often lower than required.

=item -SharedMemKey

If present, this parameter sets the base segment ID for the shared memory
region used by Berkeley DB.

This option requires Berkeley DB 3.1 or better.

Use C<$env-E<gt>get_shm_key($id)> to find out the base segment ID used
once the environment is open.

=item -ThreadCount

If present, this parameter declares the approximate number of threads that
will be used in the database environment. This parameter is only necessary
when the $env->failchk method will be used. It does not actually set the
maximum number of threads but rather is used to determine memory sizing.

This option requires Berkeley DB 4.4 or better. It is only supported on
Unix/Linux.

=item -BlobThreshold

Sets the size threshold that will be used to decide when data is stored as
a BLOB. This option must be set for blobs to be used.

This option requires Berkeley DB 6.0 or better.

=item -BlobDir

The directory where the BLOB objects are stored.

If not specified, blob files are stored in the environment directory.

This option requires Berkeley DB 6.0 or better.

=item -Config

This is a variation on the C<-Home> parameter, but it allows finer
control of where specific types of files will be stored.

The parameter expects a reference to a hash. Valid keys are:
B<DB_DATA_DIR>, B<DB_LOG_DIR> and B<DB_TMP_DIR>.

The code below shows an example of how it can be used.

    $env = new BerkeleyDB::Env
             -Config => { DB_DATA_DIR => "/home/databases",
                          DB_LOG_DIR  => "/home/logs",
                          DB_TMP_DIR  => "/home/tmp"
                        }
    ...

=item -ErrFile

Expects a filename or filehandle. Any errors generated internally by
Berkeley DB will be logged to this file. A useful debug setting is to
open environments with either

    -ErrFile => *STDOUT

or

    -ErrFile => *STDERR

=item -ErrPrefix

Allows a prefix to be added to the error messages before they are sent
to B<-ErrFile>.

=item -Flags

The B<Flags> parameter specifies both which sub-systems to initialise,
as well as a number of environment-wide options.
See the Berkeley DB documentation for more details of these options.

Any of the following can be specified by OR'ing them:

B<DB_CREATE>

If any of the files specified do not already exist, create them.

B<DB_INIT_CDB>

Initialise the Concurrent Access Methods.

B<DB_INIT_LOCK>

Initialise the Locking sub-system.

B<DB_INIT_LOG>

Initialise the Logging sub-system.

B<DB_INIT_MPOOL>

Initialize the shared memory buffer pool subsystem. This subsystem should be used whenever an application is using any Berkeley DB access method.

B<DB_INIT_TXN>

Initialize the transaction subsystem. This subsystem should be used when recovery and atomicity of multiple operations are important. The DB_INIT_TXN flag implies the DB_INIT_LOG flag.

B<DB_MPOOL_PRIVATE>

Create a private memory pool; see memp_open. Ignored unless DB_INIT_MPOOL is also specified.

B<DB_NOMMAP>

Do not map this database into process memory.

B<DB_RECOVER>

Run normal recovery on this environment before opening it for normal use. If this flag is set, the DB_CREATE flag must also be set since the regions will be removed and recreated.

The db_appinit function returns successfully if DB_RECOVER is specified and no log files exist, so it is necessary to ensure all necessary log files are present before running recovery.

B<DB_PRIVATE>

B<DB_RECOVER_FATAL>

Run catastrophic recovery on this environment before opening it for normal use. If this flag is set, the DB_CREATE flag must also be set since the regions will be removed and recreated.

The db_appinit function returns successfully if DB_RECOVER_FATAL is specified and no log files exist, so it is necessary to ensure all necessary log files are present before running recovery.

B<DB_THREAD>

Ensure that handles returned by the Berkeley DB subsystems are useable by multiple threads within a single process, i.e., that the system is free-threaded.

B<DB_TXN_NOSYNC>

On transaction commit, do not synchronously flush the log; see txn_open. Ignored unless DB_INIT_TXN is also specified.

B<DB_USE_ENVIRON>

The Berkeley DB process' environment may be permitted to specify information to be used when naming files; see Berkeley DB File Naming. As permitting users to specify which files are used can create security problems, environment information will be used in file naming for all users only if the DB_USE_ENVIRON flag is set.

B<DB_USE_ENVIRON_ROOT>

The Berkeley DB process' environment may be permitted to specify information to be used when naming files; see Berkeley DB File Naming. As permitting users to specify which files are used can create security problems, if the DB_USE_ENVIRON_ROOT flag is set, environment information will be used for file naming only for users with a user-ID matching that of the superuser (specifically, users for whom the getuid(2) system call returns the user-ID 0).

=item -SetFlags

Calls DB_ENV->set_flags with the supplied bitmask. Use this when you need to
make use of DB_ENV->set_flags before DB_ENV->open is called.

Only valid when Berkeley DB 3.x or better is used.

=item -LockDetect

Specifies what to do when a lock conflict occurs. The value should be one of

B<DB_LOCK_DEFAULT>

Use the default policy as specified by db_deadlock.

B<DB_LOCK_OLDEST>

Abort the oldest transaction.

B<DB_LOCK_RANDOM>

Abort a random transaction involved in the deadlock.

B<DB_LOCK_YOUNGEST>

Abort the youngest transaction.

=item -Verbose

Add extra debugging information to the messages sent to B<-ErrFile>.

=back

=head2 Methods

The environment class has the following methods:

=over 5

=item $env->errPrefix("string") ;

This method is identical to the B<-ErrPrefix> flag. It allows the
error prefix string to be changed dynamically.

=item $env->set_flags(bitmask, 1|0);

=item $txn = $env->TxnMgr()

Constructor for creating a B<TxnMgr> object.
See L<"TRANSACTIONS"> for more details of using transactions.

This method is deprecated. Access the transaction methods using the B<txn_>
methods below from the environment object directly.

=item $env->txn_begin()

TODO

=item $env->txn_stat()

TODO

=item $env->txn_checkpoint()

TODO

=item $env->status()

Returns the status of the last BerkeleyDB::Env method.

=item $env->DB_ENV()

Returns a pointer to the underlying DB_ENV data structure that Berkeley
DB uses.

=item $env->get_shm_key($id)

Writes the base segment ID for the shared memory region used by the
Berkeley DB environment into C<$id>. Returns 0 on success.

This option requires Berkeley DB 4.2 or better.

Use the C<-SharedMemKey> option when opening the environment to set the
base segment ID.

=item $env->set_isalive()

Set the callback that determines if the thread of control, identified by
the pid and tid arguments, is still running. This method should only be
used in combination with $env->failchk.

This option requires Berkeley DB 4.4 or better.

=item $env->failchk($flags)

The $env->failchk method checks for threads of control (either a true
thread or a process) that have exited while manipulating Berkeley DB
library data structures, while holding a logical database lock, or with an
unresolved transaction (that is, a transaction that was never aborted or
committed).

If $env->failchk determines a thread of control exited while holding
database read locks, it will release those locks. If $env->failchk
determines a thread of control exited with an unresolved transaction, the
transaction will be aborted.

Applications calling the $env->failchk method must have already called the
$env->set_isalive method, on the same DB environment, and must have
configured their database environment using the -ThreadCount flag. The
ThreadCount flag cannot be used on an environment that wasn't previously
initialized with it.

This option requires Berkeley DB 4.4 or better.

=item $env->stat_print

Prints statistical information.

If the C<MsgFile> option is specified the output will be sent to the
file. Otherwise output is sent to standard output.

This option requires Berkeley DB 4.3 or better.

=item $env->lock_stat_print

Prints locking subsystem statistics.

If the C<MsgFile> option is specified the output will be sent to the
file. Otherwise output is sent to standard output.

This option requires Berkeley DB 4.3 or better.

=item $env->mutex_stat_print

Prints mutex subsystem statistics.

If the C<MsgFile> option is specified the output will be sent to the
file. Otherwise output is sent to standard output.

This option requires Berkeley DB 4.4 or better.

=item $status = $env->get_blob_threshold($t1) ;

Sets the parameter $t1 to the threshold value (in bytes) that is used to
determine when a data item is stored as a Blob.

=item $status = $env->get_blob_dir($dir) ;

Sets the $dir parameter to the directory where blob files are stored.

=item $env->set_timeout($timeout, $flags)

=back

=head2 Examples

TODO.

=head1 Global Classes

  $status = BerkeleyDB::db_remove [OPTIONS]
  $status = BerkeleyDB::db_rename [OPTIONS]
  $status = BerkeleyDB::db_verify [OPTIONS]

=head1 THE DATABASE CLASSES

B<BerkeleyDB> supports the following database formats:

=over 5

=item B<BerkeleyDB::Hash>

This database type allows arbitrary key/value pairs to be stored in data
files. This is equivalent to the functionality provided by other
hashing packages like DBM, NDBM, ODBM, GDBM, and SDBM. Remember though,
the files created using B<BerkeleyDB::Hash> are not compatible with any
of the other packages mentioned.

A default hashing algorithm, which will be adequate for most applications,
is built into BerkeleyDB. If you do need to use your own hashing algorithm
it is possible to write your own in Perl and have B<BerkeleyDB> use
it instead.

=item B<BerkeleyDB::Btree>

The Btree format allows arbitrary key/value pairs to be stored in a
B+tree.

As with the B<BerkeleyDB::Hash> format, it is possible to provide a
user defined Perl routine to perform the comparison of keys. By default,
though, the keys are stored in lexical order.

=item B<BerkeleyDB::Recno>

TODO.

=item B<BerkeleyDB::Queue>

TODO.

=item B<BerkeleyDB::Heap>

TODO.

=item B<BerkeleyDB::Unknown>

This isn't a database format at all. It is used when you want to open an
existing Berkeley DB database without having to know what type it is.

=back

Each of the database formats described above is accessed via a
corresponding B<BerkeleyDB> class. These will be described in turn in
the next sections.

=head1 BerkeleyDB::Hash

Equivalent to calling B<db_open> with type B<DB_HASH> in Berkeley DB 2.x and
calling B<db_create> followed by B<DB-E<gt>open> with type B<DB_HASH> in
Berkeley DB 3.x or greater.

Two forms of constructor are supported:

    $db = new BerkeleyDB::Hash
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
                [ -BlobThreshold=> $number, ]
                [ -BlobDir       => directory, ]
                # BerkeleyDB::Hash specific
                [ -Ffactor       => number,]
                [ -Nelem         => number,]
                [ -Hash          => code reference,]
                [ -DupCompare    => code reference,]

and this

    [$db =] tie %hash, 'BerkeleyDB::Hash',
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
                [ -BlobThreshold=> $number, ]
                [ -BlobDir       => directory, ]
                # BerkeleyDB::Hash specific
                [ -Ffactor       => number,]
                [ -Nelem         => number,]
                [ -Hash          => code reference,]
                [ -DupCompare    => code reference,]

When the "tie" interface is used, reading from and writing to the database
is achieved via the tied hash. In this case the database operates like
a Perl associative array that happens to be stored on disk.

In addition to the high-level tied hash interface, it is possible to
make use of the underlying methods provided by Berkeley DB.

=head2 Options

In addition to the standard set of options (see L<COMMON OPTIONS>)
B<BerkeleyDB::Hash> supports these options:

=over 5

=item -Property

Used to specify extra flags when opening a database. The flags are
specified by bitwise OR'ing together one or more of the following
values:

B<DB_DUP>

When creating a new database, this flag enables the storing of duplicate
keys in the database. If B<DB_DUPSORT> is not specified as well, the
duplicates are stored in the order they are created in the database.

B<DB_DUPSORT>

Enables the sorting of duplicate keys in the database. Ignored if
B<DB_DUP> isn't also specified.

=item -Ffactor

=item -Nelem

See the Berkeley DB documentation for details of these options.

=item -Hash

Allows you to provide a user defined hash function. If not specified,
a default hash function is used. Here is a template for a user-defined
hash function

    sub hash
    {
        my ($data) = shift ;
        ...
        # return the hash value for $data
        return $hash ;
    }

    tie %h, "BerkeleyDB::Hash",
        -Filename => $filename,
        -Hash     => \&hash,
        ...

See L<""> for an example.

=item -DupCompare

Used in conjunction with the B<DB_DUPSORT> flag.

    sub compare
    {
        my ($key1, $key2) = @_ ;
        ...
        # return  0 if $key1 eq $key2
        #        -1 if $key1 lt $key2
        #         1 if $key1 gt $key2
        return $key1 cmp $key2 ;
    }

    tie %h, "BerkeleyDB::Hash",
        -Filename   => $filename,
        -Property   => DB_DUP|DB_DUPSORT,
        -DupCompare => \&compare,
        ...

=back

=head2 Methods

B<BerkeleyDB::Hash> only supports the standard database methods.
See L<COMMON DATABASE METHODS>.

=head2 A Simple Tied Hash Example

    use strict ;
    use BerkeleyDB ;
    use vars qw( %h $k $v ) ;

    my $filename = "fruit" ;
    unlink $filename ;
    tie %h, "BerkeleyDB::Hash",
                -Filename => $filename,
                -Flags    => DB_CREATE
        or die "Cannot open file $filename: $! $BerkeleyDB::Error\n" ;

    # Add a few key/value pairs to the file
    $h{"apple"} = "red" ;
    $h{"orange"} = "orange" ;
    $h{"banana"} = "yellow" ;
    $h{"tomato"} = "red" ;

    # Check for existence of a key
    print "Banana Exists\n\n" if $h{"banana"} ;

    # Delete a key/value pair.
    delete $h{"apple"} ;

    # print the contents of the file
    while (($k, $v) = each %h)
      { print "$k -> $v\n" }

    untie %h ;

Here is the output:

    Banana Exists

    orange -> orange
    tomato -> red
    banana -> yellow

Note that, as with ordinary associative arrays, the keys retrieved from
a Hash database are returned in an apparently random order.

=head2 Another Simple Hash Example

Do the same as the previous example but not using tie.

    use strict ;
    use BerkeleyDB ;

    my $filename = "fruit" ;
    unlink $filename ;
    my $db = new BerkeleyDB::Hash
                -Filename => $filename,
                -Flags    => DB_CREATE
        or die "Cannot open file $filename: $! $BerkeleyDB::Error\n" ;

    # Add a few key/value pairs to the file
    $db->db_put("apple", "red") ;
    $db->db_put("orange", "orange") ;
    $db->db_put("banana", "yellow") ;
    $db->db_put("tomato", "red") ;

    # Declare the key/value variables before first use (required by strict)
    my ($k, $v) = ("", "") ;

    # Check for existence of a key
    print "Banana Exists\n\n" if $db->db_get("banana", $v) == 0;

    # Delete a key/value pair.
    $db->db_del("apple") ;

    # print the contents of the file
    my $cursor = $db->db_cursor() ;
    while ($cursor->c_get($k, $v, DB_NEXT) == 0)
      { print "$k -> $v\n" }

    undef $cursor ;
    undef $db ;

=head2 Duplicate keys

The code below is a variation on the examples above. This time the hash has
been inverted. The key this time is colour and the value is the fruit name.
The B<DB_DUP> flag has been specified to allow duplicates.

    use strict ;
    use BerkeleyDB ;

    my $filename = "fruit" ;
    unlink $filename ;
    my $db = new BerkeleyDB::Hash
                -Filename => $filename,
                -Flags    => DB_CREATE,
                -Property => DB_DUP
        or die "Cannot open file $filename: $! $BerkeleyDB::Error\n" ;

    # Add a few key/value pairs to the file
    $db->db_put("red", "apple") ;
    $db->db_put("orange", "orange") ;
    $db->db_put("green", "banana") ;
    $db->db_put("yellow", "banana") ;
    $db->db_put("red", "tomato") ;
    $db->db_put("green", "apple") ;

    # print the contents of the file
    my ($k, $v) = ("", "") ;
    my $cursor = $db->db_cursor() ;
    while ($cursor->c_get($k, $v, DB_NEXT) == 0)
      { print "$k -> $v\n" }

    undef $cursor ;
    undef $db ;

Here is the output:

    orange -> orange
    yellow -> banana
    red -> apple
    red -> tomato
    green -> banana
    green -> apple

=head2 Sorting Duplicate Keys

In the previous example, when there were duplicate keys, the values were
returned in the order in which they were stored. The code below is
identical to the previous example except the B<DB_DUPSORT> flag is
specified.

    use strict ;
    use BerkeleyDB ;

    my $filename = "fruit" ;
    unlink $filename ;
    my $db = new BerkeleyDB::Hash
                -Filename => $filename,
                -Flags    => DB_CREATE,
                -Property => DB_DUP | DB_DUPSORT
        or die "Cannot open file $filename: $! $BerkeleyDB::Error\n" ;

    # Add a few key/value pairs to the file
    $db->db_put("red", "apple") ;
    $db->db_put("orange", "orange") ;
    $db->db_put("green", "banana") ;
    $db->db_put("yellow", "banana") ;
    $db->db_put("red", "tomato") ;
    $db->db_put("green", "apple") ;

    # print the contents of the file
    my ($k, $v) = ("", "") ;
    my $cursor = $db->db_cursor() ;
    while ($cursor->c_get($k, $v, DB_NEXT) == 0)
      { print "$k -> $v\n" }

    undef $cursor ;
    undef $db ;

Notice that in the output below the duplicate values are sorted.

    orange -> orange
    yellow -> banana
    red -> apple
    red -> tomato
    green -> apple
    green -> banana

=head2 Custom Sorting Duplicate Keys

Another variation

TODO

=head2 Changing the hash

TODO

=head2 Using db_stat

TODO

=head1 BerkeleyDB::Btree

Equivalent to calling B<db_open> with type B<DB_BTREE> in Berkeley DB 2.x and
calling B<db_create> followed by B<DB-E<gt>open> with type B<DB_BTREE> in
Berkeley DB 3.x or greater.

Two forms of constructor are supported:

    $db = new BerkeleyDB::Btree
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
                [ -BlobThreshold=> $number, ]
                [ -BlobDir       => directory, ]
                # BerkeleyDB::Btree specific
                [ -Minkey        => number,]
                [ -Compare       => code reference,]
                [ -DupCompare    => code reference,]
                [ -Prefix        => code reference,]

and this

    [$db =] tie %hash, 'BerkeleyDB::Btree',
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
                [ -BlobThreshold=> $number, ]
                [ -BlobDir       => directory, ]
                # BerkeleyDB::Btree specific
                [ -Minkey        => number,]
                [ -Compare       => code reference,]
                [ -DupCompare    => code reference,]
                [ -Prefix        => code reference,]

=head2 Options

In addition to the standard set of options (see L<COMMON OPTIONS>)
B<BerkeleyDB::Btree> supports these options:

=over 5

=item -Property

Used to specify extra flags when opening a database. The flags are
specified by bitwise OR'ing together one or more of the following
values:

B<DB_DUP>

When creating a new database, this flag enables the storing of duplicate
keys in the database. If B<DB_DUPSORT> is not specified as well, the
duplicates are stored in the order they are created in the database.

B<DB_DUPSORT>

Enables the sorting of duplicate keys in the database. Ignored if
B<DB_DUP> isn't also specified.

=item -Minkey

TODO

=item -Compare

Allows you to override the default sort order used in the database. See
L<"Changing the sort order"> for an example.

    sub compare
    {
        my ($key1, $key2) = @_ ;
        ...
        # return  0 if $key1 eq $key2
        #        -1 if $key1 lt $key2
        #         1 if $key1 gt $key2
        return $key1 cmp $key2 ;
    }

    tie %h, "BerkeleyDB::Btree",
        -Filename => $filename,
        -Compare  => \&compare,
        ...

=item -Prefix

    sub prefix
    {
        my ($key1, $key2) = @_ ;
        ...
        # return number of bytes of $key2 which are
        # necessary to determine that it is greater than $key1
        return $bytes ;
    }

    tie %h, "BerkeleyDB::Btree",
        -Filename => $filename,
        -Prefix   => \&prefix,
        ...

=item -DupCompare

    sub compare
    {
        my ($key1, $key2) = @_ ;
        ...
        # return  0 if $key1 eq $key2
        #        -1 if $key1 lt $key2
        #         1 if $key1 gt $key2
        return $key1 cmp $key2 ;
    }

    tie %h, "BerkeleyDB::Btree",
        -Filename   => $filename,
        -DupCompare => \&compare,
        ...

=item set_bt_compress

Enables compression of the btree data. The callback interface is not
supported at present. Requires Berkeley DB 4.8 or better.

=back

=head2 Methods

B<BerkeleyDB::Btree> supports the following database methods.
See also L<COMMON DATABASE METHODS>.

All the methods below return 0 to indicate success.

=over 5

=item $status = $db->db_key_range($key, $less, $equal, $greater [, $flags])

Given a key, C<$key>, this method returns the proportion of keys less than
C<$key> in C<$less>, the proportion equal to C<$key> in C<$equal> and the
proportion greater than C<$key> in C<$greater>.

The proportion is returned as a double in the range 0.0 to 1.0.
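
The following is a minimal sketch of calling C<db_key_range>, assuming the
signature shown above and a Berkeley DB version that provides
B<DB-E<gt>key_range>. The file name and the keys are hypothetical. The
returned proportions are estimates made by Berkeley DB, so no particular
output is shown.

```perl
use strict ;
use warnings ;
use BerkeleyDB ;

my $filename = "tree" ;    # hypothetical file name
unlink $filename ;
my $db = new BerkeleyDB::Btree
            -Filename => $filename,
            -Flags    => DB_CREATE
    or die "Cannot open $filename: $! $BerkeleyDB::Error\n" ;

# Store a handful of keys so there is something to estimate against.
$db->db_put($_, 1) for qw( apple banana cherry damson elder ) ;

# The three output variables are filled in by db_key_range.
my ($less, $equal, $greater) = (0, 0, 0) ;
$db->db_key_range("cherry", $less, $equal, $greater) == 0
    or die "db_key_range failed: $BerkeleyDB::Error\n" ;

printf "less=%.2f equal=%.2f greater=%.2f\n", $less, $equal, $greater ;

undef $db ;
```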

=back

=head2 A Simple Btree Example

The code below is a simple example of using a btree database.

    use strict ;
    use BerkeleyDB ;

    my $filename = "tree" ;
    unlink $filename ;
    my %h ;
    tie %h, 'BerkeleyDB::Btree',
                -Filename   => $filename,
                -Flags      => DB_CREATE
      or die "Cannot open $filename: $! $BerkeleyDB::Error\n" ;

    # Add some key/value pairs to the file
    $h{'Wall'} = 'Larry' ;
    $h{'Smith'} = 'John' ;
    $h{'mouse'} = 'mickey' ;
    $h{'duck'}  = 'donald' ;

    # Delete
    delete $h{"duck"} ;

    # Cycle through the keys printing them in order.
    # Note it is not necessary to sort the keys as
    # the btree will have kept them in order automatically.
    foreach (keys %h)
      { print "$_\n" }

    untie %h ;

Here is the output from the code above. The keys have been sorted using
Berkeley DB's default sorting algorithm.

    Smith
    Wall
    mouse

=head2 Changing the sort order

It is possible to supply your own sorting algorithm if the one that Berkeley
DB uses isn't suitable. The code below is identical to the previous example
except for the case-insensitive compare function.

    use strict ;
    use BerkeleyDB ;

    my $filename = "tree" ;
    unlink $filename ;
    my %h ;
    tie %h, 'BerkeleyDB::Btree',
                -Filename   => $filename,
                -Flags      => DB_CREATE,
                -Compare    => sub { lc $_[0] cmp lc $_[1] }
      or die "Cannot open $filename: $! $BerkeleyDB::Error\n" ;

    # Add some key/value pairs to the file
    $h{'Wall'} = 'Larry' ;
    $h{'Smith'} = 'John' ;
    $h{'mouse'} = 'mickey' ;
    $h{'duck'}  = 'donald' ;

    # Delete
    delete $h{"duck"} ;

    # Cycle through the keys printing them in order.
    # Note it is not necessary to sort the keys as
    # the btree will have kept them in order automatically.
    foreach (keys %h)
      { print "$_\n" }

    untie %h ;

Here is the output from the code above.

    mouse
    Smith
    Wall

There are a few points to bear in mind if you want to change the
ordering in a BTREE database:

=over 5

=item 1.

The new compare function must be specified when you create the database.

=item 2.

You cannot change the ordering once the database has been created. Thus
you must use the same compare function every time you access the
database.

=back

=head2 Using db_stat

TODO

=head1 BerkeleyDB::Recno

Equivalent to calling B<db_open> with type B<DB_RECNO> in Berkeley DB 2.x and
calling B<db_create> followed by B<DB-E<gt>open> with type B<DB_RECNO> in
Berkeley DB 3.x or greater.

Two forms of constructor are supported:

    $db = new BerkeleyDB::Recno
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
         # BerkeleyDB::Recno specific
                [ -Delim         => byte,]
                [ -Len           => number,]
                [ -Pad           => byte,]
                [ -Source        => filename,]

and this

    [$db =] tie @arry, 'BerkeleyDB::Recno',
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
         # BerkeleyDB::Recno specific
                [ -Delim         => byte,]
                [ -Len           => number,]
                [ -Pad           => byte,]
                [ -Source        => filename,]

=head2 A
Recno Example

Here is a simple example that uses RECNO (if you are using a version
of Perl earlier than 5.004_57 this example won't work -- see
L<Extra RECNO Methods> for a workaround).

    use strict ;
    use BerkeleyDB ;

    my $filename = "text" ;
    unlink $filename ;

    my @h ;
    tie @h, 'BerkeleyDB::Recno',
                -Filename   => $filename,
                -Flags      => DB_CREATE,
                -Property   => DB_RENUMBER
      or die "Cannot open $filename: $! $BerkeleyDB::Error\n" ;

    # Add a few elements to the file
    $h[0] = "orange" ;
    $h[1] = "blue" ;
    $h[2] = "yellow" ;

    push @h, "green", "black" ;

    my $elements = scalar @h ;
    print "The array contains $elements entries\n" ;

    my $last = pop @h ;
    print "popped $last\n" ;

    unshift @h, "white" ;
    my $first = shift @h ;
    print "shifted $first\n" ;

    # Check for existence of an element
    print "Element 1 Exists with value $h[1]\n" if $h[1] ;

    untie @h ;

Here is the output from the script:

    The array contains 5 entries
    popped black
    shifted white
    Element 1 Exists with value blue

=head1 BerkeleyDB::Queue

Equivalent to calling B<db_create> followed by B<DB-E<gt>open> with
type B<DB_QUEUE> in Berkeley DB 3.x or greater. This database format
isn't available if you use Berkeley DB 2.x.

Two forms of constructor are supported:

    $db = new BerkeleyDB::Queue
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
         # BerkeleyDB::Queue specific
                [ -Len           => number,]
                [ -Pad           => byte,]
                [ -ExtentSize    => number, ]

and this

    [$db =] tie @arry, 'BerkeleyDB::Queue',
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
         # BerkeleyDB::Queue specific
                [ -Len           => number,]
                [ -Pad           => byte,]

=head1 BerkeleyDB::Heap

Equivalent to calling B<db_create> followed by B<DB-E<gt>open> with
type B<DB_HEAP> in Berkeley DB 5.2.x or greater. This database format
isn't available if you use an older version of Berkeley DB.

One form of constructor is supported:

    $db = new BerkeleyDB::Heap
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
                [ -BlobThreshold => $number, ]
                [ -BlobDir       => directory, ]
         # BerkeleyDB::Heap specific
                [ -HeapSize      => number, ]
                [ -HeapSizeGb    => number, ]

=head1 BerkeleyDB::Unknown

This class is used to open an existing database.

Equivalent to calling B<db_open> with type B<DB_UNKNOWN> in Berkeley DB 2.x and
calling B<db_create> followed by B<DB-E<gt>open> with type B<DB_UNKNOWN> in
Berkeley DB 3.x or greater.

The constructor looks like this:

    $db = new BerkeleyDB::Unknown
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],

=head2 An example

=head1 COMMON OPTIONS

All database access class constructors support the common set of
options defined below. All are optional.

=over 5

=item -Filename

The database filename. If no filename is specified, a temporary file will
be created and removed once the program terminates.

=item -Subname

Specifies the name of the sub-database to open.
This option is only valid if you are using Berkeley DB 3.x or greater.

=item -Flags

Specify how the database will be opened/created.
The valid flags are:

B<DB_CREATE>

Create any underlying files, as necessary. If the files do not already
exist and the B<DB_CREATE> flag is not specified, the call will fail.

B<DB_NOMMAP>

Not supported by BerkeleyDB.

B<DB_RDONLY>

Opens the database in read-only mode.

B<DB_THREAD>

Not supported by BerkeleyDB.

B<DB_TRUNCATE>

If the database file already exists, remove all the data before
opening it.

=item -Mode

Determines the file protection when the database is created. Defaults
to 0666.

=item -Cachesize

=item -Lorder

=item -Pagesize

=item -Env

When working under a Berkeley DB environment, this parameter specifies
the B<BerkeleyDB::Env> object to use.

Defaults to no environment.

=item -Encrypt

If present, this parameter will enable encryption of all data before
it is written to the database. This parameter must be given a hash
reference. The format is shown below.

    -Encrypt => { Password => "abc", Flags => DB_ENCRYPT_AES }

Valid values for the Flags are 0 or C<DB_ENCRYPT_AES>.

This option requires Berkeley DB 4.1 or better.

=item -Txn

TODO.

=back

=head1 COMMON DATABASE METHODS

All the database interfaces support the common set of methods defined
below.

All the methods below return 0 to indicate success.

=head2 $env = $db->Env();

Returns the environment object the database is associated with or C<undef>
when no environment was used when opening the database.

=head2 $status = $db->db_get($key, $value [, $flags])

Given a key (C<$key>) this method reads the value associated with it
from the database. If it exists, the value read from the database is
returned in the C<$value> parameter.

The B<$flags> parameter is optional.
If present, it must be set to B<one>
of the following values:

=over 5

=item B<DB_GET_BOTH>

When the B<DB_GET_BOTH> flag is specified, B<db_get> checks for the
existence of B<both> the C<$key> B<and> C<$value> in the database.

=item B<DB_SET_RECNO>

TODO.

=back

In addition, the following value may be set by bitwise OR'ing it into
the B<$flags> parameter:

=over 5

=item B<DB_RMW>

TODO

=back

The variant C<db_pget> allows you to query a secondary database:

    $status = $sdb->db_pget($skey, $pkey, $value);

It uses the key C<$skey> in the secondary database to look up C<$pkey>
and C<$value> from the primary database.

=head2 $status = $db->db_exists($key [, $flags])

This method checks for the existence of the given key (C<$key>), but
does not read the value. If the key is not found, B<db_exists> will
return B<DB_NOTFOUND>. Requires BDB 4.6 or better.

=head2 $status = $db->db_put($key, $value [, $flags])

Stores a key/value pair in the database.

The B<$flags> parameter is optional. If present it must be set to B<one>
of the following values:

=over 5

=item B<DB_APPEND>

This flag is only applicable when accessing a B<BerkeleyDB::Recno>
database.

TODO.

=item B<DB_NOOVERWRITE>

If this flag is specified and C<$key> already exists in the database,
the call to B<db_put> will return B<DB_KEYEXIST>.

=back

=head2 $status = $db->db_del($key [, $flags])

Deletes a key/value pair in the database associated with C<$key>.
If duplicate keys are enabled in the database, B<db_del> will delete
B<all> key/value pairs with key C<$key>.

The B<$flags> parameter is optional and is currently unused.

=head2 $status = $env->stat_print([$flags])

Prints statistical information.

If the C<MsgFile> option is specified the output will be sent to the
file. Otherwise output is sent to standard output.

This method requires Berkeley DB 4.3 or better.

=head2 $status = $db->db_sync()

If any parts of the database are in memory, write them to the database.

=head2 $cursor = $db->db_cursor([$flags])

Creates a cursor object. This is used to access the contents of the
database sequentially. See L<CURSORS> for details of the methods
available when working with cursors.

The B<$flags> parameter is optional. If present it must be set to B<one>
of the following values:

=over 5

=item B<DB_RMW>

TODO.

=back

=head2 ($flag, $old_offset, $old_length) = $db->partial_set($offset, $length) ;

TODO

=head2 ($flag, $old_offset, $old_length) = $db->partial_clear() ;

TODO

=head2 $db->byteswapped()

TODO

=head2 $status = $db->get_blob_threshold($t1) ;

Sets the parameter $t1 to the threshold value (in bytes) that is used to
determine when a data item is stored as a Blob.

=head2 $status = $db->get_blob_dir($dir) ;

Sets the $dir parameter to the directory where blob files are stored.

=head2 $db->type()

Returns the type of the database. The possible return values are B<DB_HASH>
for a B<BerkeleyDB::Hash> database, B<DB_BTREE> for a B<BerkeleyDB::Btree>
database and B<DB_RECNO> for a B<BerkeleyDB::Recno> database. This method
is typically used when a database has been opened with
B<BerkeleyDB::Unknown>.

=head2 $bool = $env->cds_enabled();

Returns true if the Berkeley DB environment C<$env> has been opened in
CDS mode.

=head2 $bool = $db->cds_enabled();

Returns true if the database C<$db> has been opened in CDS mode.

=head2 $lock = $db->cds_lock();

Creates a CDS write lock object C<$lock>.

It is a fatal error to attempt to create a cds_lock if the Berkeley DB
environment has not been opened in CDS mode.

=head2 $lock->cds_unlock();

Removes a CDS lock. The destruction of the CDS lock object automatically
calls this method.

Note that if multiple CDS lock objects are created, the underlying write
lock will not be released until all CDS lock objects are either explicitly
unlocked with this method, or the CDS lock objects have been destroyed.

=head2 $ref = $db->db_stat()

Returns a reference to an associative array containing information about
the database. The keys of the associative array correspond directly to the
names of the fields defined in the Berkeley DB documentation. For example,
in the DB documentation, the field B<bt_version> stores the version of the
Btree database. Assuming you called B<db_stat> on a Btree database the
equivalent field would be accessed as follows:

    $version = $ref->{'bt_version'} ;

If you are using Berkeley DB 3.x or better, this method will work with
all database formats. When DB 2.x is used, it only works with
B<BerkeleyDB::Btree>.

=head2 $status = $db->status()

Returns the status of the last C<$db> method called.

=head2 $status = $db->truncate($count)

Truncates the database and returns the number of records deleted
in C<$count>.

=head2 $status = $db->compact($start, $stop, $c_data, $flags, $end);

Compacts the database C<$db>.

All the parameters are optional - if you only want to make use of some of
them, use C<undef> for those you don't want. Trailing unused parameters can be
omitted.
For example, if you only want to use the C<$c_data> parameter to
set the C<compact_fillpercent>, write your code like this

    my %hash;
    $hash{compact_fillpercent} = 50;
    $db->compact(undef, undef, \%hash);

The parameters operate identically to the C equivalent of this method.
The C<$c_data> needs a bit of explanation - it must be a hash reference.
The values of the following keys can be set before calling C<compact> and
will affect the operation of the compaction.

=over 5

=item * compact_fillpercent

=item * compact_timeout

=back

The following keys, along with associated values, will be created in the
hash reference if the C<compact> operation was successful.

=over 5

=item * compact_deadlock

=item * compact_levels

=item * compact_pages_free

=item * compact_pages_examine

=item * compact_pages_truncated

=back

You need to be running Berkeley DB 4.4 or better if you want to make use of
C<compact>.

=head2 $status = $db->associate($secondary, \&key_callback)

Associate C<$db> with the secondary DB C<$secondary>.

Each new key/value pair inserted into the database is passed to the
callback, which must set its third argument to the secondary key to allow
lookup. If the third argument is set to an array reference, multiple
secondary keys will be associated with the primary database entry.

Data may be retrieved from the secondary database using C<db_pget> to also
obtain the primary key.

Secondary databases are maintained automatically.

=head2 $status = $db->associate_foreign($secondary, callback, $flags)

Associate a foreign key database C<$db> with the secondary DB
C<$secondary>.

The second parameter must be a reference to a sub or C<undef>.

The C<$flags> parameter must be either C<DB_FOREIGN_CASCADE>,
C<DB_FOREIGN_ABORT> or C<DB_FOREIGN_NULLIFY>.

When the flags parameter is C<DB_FOREIGN_NULLIFY> the second parameter is a
reference to a sub of the form

    sub foreign_cb
    {
        my $key = \$_[0];
        my $value = \$_[1];
        my $foreignkey = \$_[2];
        my $changed = \$_[3] ;

        # for ... set $$value and set $$changed to 1

        return 0;
    }

    $foreign_db->associate_foreign($secondary, \&foreign_cb, DB_FOREIGN_NULLIFY);

=head1 CURSORS

A cursor is used whenever you want to access the contents of a database
in sequential order.
A cursor object is created with the C<db_cursor> method.

A cursor object has the following methods available:

=head2 $newcursor = $cursor->c_dup($flags)

Creates a duplicate of C<$cursor>. This method needs Berkeley DB 3.0.x or better.

The C<$flags> parameter is optional and can take the following value:

=over 5

=item DB_POSITION

When present this flag will position the new cursor at the same place as the
existing cursor.

=back

=head2 $status = $cursor->c_get($key, $value, $flags)

Reads a key/value pair from the database, returning the data in C<$key>
and C<$value>. The key/value pair actually read is controlled by the
C<$flags> parameter, which can take B<one> of the following values:

=over 5

=item B<DB_FIRST>

Set the cursor to point to the first key/value pair in the
database. Return the key/value pair in C<$key> and C<$value>.

=item B<DB_LAST>

Set the cursor to point to the last key/value pair in the database. Return
the key/value pair in C<$key> and C<$value>.

=item B<DB_NEXT>

If the cursor is already pointing to a key/value pair, it will be
incremented to point to the next key/value pair and return its contents.

If the cursor isn't initialised, B<DB_NEXT> works just like B<DB_FIRST>.

If the cursor is already positioned at the last key/value pair, B<c_get>
will return B<DB_NOTFOUND>.

=item B<DB_NEXT_DUP>

This flag is only valid when duplicate keys have been enabled in
a database.
If the cursor is already pointing to a key/value pair and the key of
the next key/value pair is identical, the cursor will be incremented to
point to it and its contents returned.

=item B<DB_PREV>

If the cursor is already pointing to a key/value pair, it will be
decremented to point to the previous key/value pair and return its
contents.

If the cursor isn't initialised, B<DB_PREV> works just like B<DB_LAST>.

If the cursor is already positioned at the first key/value pair, B<c_get>
will return B<DB_NOTFOUND>.

=item B<DB_CURRENT>

If the cursor has been set to point to a key/value pair, return its
contents.
If the key/value pair referenced by the cursor has been deleted, B<c_get>
will return B<DB_KEYEMPTY>.

=item B<DB_SET>

Set the cursor to point to the key/value pair referenced by B<$key>
and return the value in B<$value>.

=item B<DB_SET_RANGE>

This flag is a variation on the B<DB_SET> flag. As well as returning
the value, it also returns the key, via B<$key>.
When used with a B<BerkeleyDB::Btree> database the key matched by B<c_get>
will be the smallest key (as determined by the database's sort order)
which is greater than or equal to the key supplied, via B<$key>. This
allows partial key searches.
See ??? for an example of how to use this flag.

=item B<DB_GET_BOTH>

Another variation on B<DB_SET>. This one matches on both the key and
the value supplied.

=item B<DB_SET_RECNO>

TODO.

=item B<DB_GET_RECNO>

TODO.

=back

In addition, the following value may be set by bitwise OR'ing it into
the B<$flags> parameter:

=over 5

=item B<DB_RMW>

TODO.

=back

=head2 $status = $cursor->c_put($key, $value, $flags)

Stores the key/value pair in the database. The position that the data is
stored in the database is controlled by the C<$flags> parameter, which
must take B<one> of the following values:

=over 5

=item B<DB_AFTER>

When used with a Btree or Hash database, a duplicate of the key referenced
by the current cursor position will be created and the contents of
B<$value> will be associated with it - B<$key> is ignored.
The new key/value pair will be stored immediately after the current
cursor position.
Obviously the database has to have been opened with B<DB_DUP>.

When used with a Recno ... TODO

=item B<DB_BEFORE>

When used with a Btree or Hash database, a duplicate of the key referenced
by the current cursor position will be created and the contents of
B<$value> will be associated with it - B<$key> is ignored.
The new key/value pair will be stored immediately before the current
cursor position.
Obviously the database has to have been opened with B<DB_DUP>.

When used with a Recno ... TODO

=item B<DB_CURRENT>

If the cursor has been initialised, replace the value of the key/value
pair stored in the database with the contents of B<$value>.

=item B<DB_KEYFIRST>

Only valid with a Btree or Hash database. This flag is only really
used when duplicates are enabled in the database and sorted duplicates
haven't been specified.
In this case the key/value pair will be inserted as the first entry in
the duplicates for the particular key.

=item B<DB_KEYLAST>

Only valid with a Btree or Hash database.
This flag is only really
used when duplicates are enabled in the database and sorted duplicates
haven't been specified.
In this case the key/value pair will be inserted as the last entry in
the duplicates for the particular key.

=back

=head2 $status = $cursor->c_del([$flags])

This method deletes the key/value pair associated with the current cursor
position. The cursor position will not be changed by this operation, so
any subsequent cursor operation must first initialise the cursor to
point to a valid key/value pair.

If the key/value pair associated with the cursor has already been
deleted, B<c_del> will return B<DB_KEYEMPTY>.

The B<$flags> parameter is not used at present.

=head2 $status = $cursor->c_count($cnt [, $flags])

Stores the number of duplicates at the current cursor position in B<$cnt>.

The B<$flags> parameter is not used at present. This method needs
Berkeley DB 3.1 or better.

=head2 $status = $cursor->status()

Returns the status of the last cursor method as a dual type.

=head2 $status = $cursor->c_pget() ;

See C<db_pget>

=head2 $status = $cursor->c_close()

Closes the cursor B<$cursor>.

=head2 $stream = $cursor->db_stream($flags);

Create a BerkeleyDB::DbStream object to read the blob at the current cursor location.
See L<Blob> for details of the BerkeleyDB::DbStream object.

$flags must be one or more of the following OR'ed together

    DB_STREAM_READ
    DB_STREAM_WRITE
    DB_STREAM_SYNC_WRITE

For full information on the flags refer to the Berkeley DB Reference Guide.

=head2 Cursor Examples

TODO

Iterating from first to last, then in reverse.

examples of each of the flags.

=head1 JOIN

Join support for BerkeleyDB is in progress. Watch this space.

TODO

=head1 TRANSACTIONS

Transactions are created using the C<txn_begin> method on L<BerkeleyDB::Env>:

    my $txn = $env->txn_begin;

If this is a nested transaction, supply the parent transaction as an
argument:

    my $child_txn = $env->txn_begin($parent_txn);

Then in order to work with the transaction, you must set it as the current
transaction on the database handles you want to work with:

    $db->Txn($txn);

Or for multiple handles:

    $txn->Txn(@handles);

BerkeleyDB passes the current transaction to each of the underlying BDB
operations for you. In the C API it must be supplied explicitly as an
argument to every operation.

To commit a transaction call the C<txn_commit> method on it:

    $txn->txn_commit;

and to roll back call C<txn_abort>:

    $txn->txn_abort;

After committing or aborting a child transaction you need to set the active
transaction again using C<Txn>.

=head1 BerkeleyDB::DbStream -- support for BLOB

Blob support is available in Berkeley DB starting with version 6.0. Refer
to the section "Blob Support" in the Berkeley DB Programmer Reference for
details of how Blob support works.

A Blob is accessed via a BerkeleyDB::DbStream object. This is created via a
cursor object.

    # Note - error handling not shown below.

    # Set the key we want
    my $k = "some key";

    # Don't want the value retrieved by the cursor,
    # so use partial_set to make sure no data is retrieved.
    my $v = '';
    $cursor->partial_set(0,0) ;
    $cursor->c_get($k, $v, DB_SET) ;
    $cursor->partial_clear() ;

    # Now create a stream to the blob
    my $stream = $cursor->db_stream(DB_STREAM_WRITE) ;

    # get the size of the blob
    $stream->size(my $s) ;

    # Read the first 1k of data from the blob
    my $data ;
    $stream->read($data, 0, 1024);

A BerkeleyDB::DbStream object has the following methods available:

=head2 $status = $stream->size($SIZE);

Outputs the length of the Blob in the $SIZE parameter.

=head2 $status = $stream->read($data, $offset, $size);

Read from the blob. $offset is the number of bytes from the start of the
blob to read from. $size is the number of bytes to read.

=head2 $status = $stream->write($data, $offset, $flags);

Write $data to the blob, starting at offset $offset.

Example

Below is an example of how to walk through a database when you don't know
beforehand which entries are blobs and which are not.

    while (1)
    {
        my $k = '';
        my $v = '';
        $cursor->partial_set(0,0) ;
        my $status = $cursor->c_get($k, $v, DB_NEXT) ;
        $cursor->partial_clear();

        last if $status != 0 ;

        my $stream = $cursor->db_stream(DB_STREAM_WRITE);

        if (defined $stream)
        {
            # It's a Blob
            $stream->size(my $s) ;
        }
        else
        {
            # Not a Blob
            $cursor->c_get($k, $v, DB_CURRENT) ;
        }
    }

=head1 Berkeley DB Concurrent Data Store (CDS)

The Berkeley DB I<Concurrent Data Store> (CDS) is a lightweight locking
mechanism that is useful in scenarios where transactions are overkill.

=head2 What is CDS?

The Berkeley DB CDS interface is a simple lightweight locking mechanism
that allows safe concurrent access to Berkeley DB databases.
Your
application can have multiple reader and writer processes, but Berkeley DB
will arrange it so that only one process can have a write lock against the
database at a time, i.e. multiple processes can read from a database
concurrently, but all write processes will be serialised.

=head2 Should I use it?

Whilst this simple locking model is perfectly adequate for some
applications, it will be too restrictive for others. Before deciding on
using CDS mode, you need to be sure that it is suitable for the expected
behaviour of your application.

The key features of this model are

=over 5

=item *

All write operations are serialised.

=item *

A write operation will block until all reads have finished.

=back

There are a few attributes of your application that you need to be
aware of before choosing to use CDS.

Firstly, if your application needs either recoverability or transaction
support, then CDS will not be suitable.

Next, what ratio of read operations to write operations will your
application have?

If it is carrying out mostly read operations, and very few writes, then CDS
may be appropriate.

What is the expected throughput of reads/writes in your application?

If your application does 90% writes and 10% reads, but on average you only
have a transaction every 5 seconds, then the fact that all writes are
serialised will not matter, because there will hardly ever be multiple
write processes blocking.

In summary CDS mode may be appropriate for your application if it performs
mostly reads and very few writes or there is a low throughput. Also, if
you do not need to be able to roll back a series of database operations if
an error occurs, then CDS is ok.

If any of these is not the case you will need to use Berkeley DB
transactions.
That is outside the scope of this document.

=head2 Locking Used

Berkeley DB implements CDS mode using two kinds of lock behind the scenes -
namely read locks and write locks. A read lock allows multiple processes to
access the database for reading at the same time. A write lock will only
be granted when there are no read or write locks active; a request for a
write lock blocks until the processes holding those locks release them.

Multiple processes with read locks can all access the database at the same
time as long as no process has a write lock. A process with a write lock
can only access the database if there are no other active read or write
locks.

The majority of the time the Berkeley DB CDS mode will handle all locking
without your application having to do anything. There are a couple of
exceptions you need to be aware of though - these will be discussed in
L<Safely Updating a Record> and L<Implicit Cursors> below.

A Berkeley DB Cursor (created with C<< $db->db_cursor >>) will hold a
lock on the database until it is either explicitly closed or destroyed.
This means the lock has the potential to be long lived.

By default Berkeley DB cursors create a read lock, but it is possible to
create a cursor that holds a write lock, thus

    $cursor = $db->db_cursor(DB_WRITECURSOR);

Whilst either a read or write cursor is active, it will block any other
process that wants to write to the database.

To avoid blocking problems, only keep cursors open as long as they are
needed. The same is true when you use the C<cursor> method or the
C<cds_lock> method.

For full information on CDS see the "Berkeley DB Concurrent Data Store
applications" section in the Berkeley DB Reference Guide.
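
As a rough sketch of the advice about short-lived cursors, the fragment
below opens a write cursor, updates a single record through it and closes
the cursor straight away, so the write lock is held only for the duration
of the update. The scratch home directory, the file name C<counter.db>
and the key C<hits> are assumptions made for illustration, and error
handling is kept to a minimum.

```perl
use strict ;
use warnings ;
use BerkeleyDB ;
use File::Temp qw(tempdir) ;

# Hypothetical scratch directory for the environment files.
my $home = tempdir(CLEANUP => 1) ;

my $env = new BerkeleyDB::Env
              -Home  => $home,
              -Flags => DB_CREATE | DB_INIT_CDB | DB_INIT_MPOOL
    or die "cannot open environment: $BerkeleyDB::Error\n" ;

my $db = new BerkeleyDB::Hash
             -Filename => 'counter.db',
             -Flags    => DB_CREATE,
             -Env      => $env
    or die "cannot open database: $BerkeleyDB::Error\n" ;

$db->db_put("hits", 1) == 0
    or die "db_put failed: $BerkeleyDB::Error\n" ;

# Open the write cursor as late as possible ...
my $cursor = $db->db_cursor(DB_WRITECURSOR) ;
my ($k, $v) = ("hits", "") ;
my $status = $cursor->c_get($k, $v, DB_SET) ;
$status = $cursor->c_put($k, $v + 1, DB_CURRENT) if $status == 0 ;

# ... and close it as soon as the update is done.
$cursor->c_close() ;
undef $cursor ;

my $hits = 0 ;
$db->db_get("hits", $hits) ;
print "hits is now $hits\n" ;

# Close the database before the environment.
undef $db ;
undef $env ;
```

Any work that does not need the lock, such as preparing the new value or
reporting the result, is best done before the cursor is opened or after
it has been closed.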

=head2 Opening a database for CDS

Here is the typical code that is used when opening a database in CDS
mode.

    use BerkeleyDB ;

    my $env = new BerkeleyDB::Env
                  -Home  => "./home" ,
                  -Flags => DB_CREATE | DB_INIT_CDB | DB_INIT_MPOOL
        or die "cannot open environment: $BerkeleyDB::Error\n";

    my $db  = new BerkeleyDB::Hash
                -Filename => 'test1.db',
                -Flags    => DB_CREATE,
                -Env      => $env
        or die "cannot open database: $BerkeleyDB::Error\n";

or this, if you use the tied interface

    tie %hash, "BerkeleyDB::Hash",
                -Filename => 'test2.db',
                -Flags    => DB_CREATE,
                -Env      => $env
        or die "cannot open database: $BerkeleyDB::Error\n";

The first thing to note is that you B<MUST> always use a Berkeley DB
environment if you want to use locking with Berkeley DB.

Remember that, apart from the actual database files you explicitly create
yourself, Berkeley DB will create a few behind the scenes to handle locking
- they usually have names like "__db.001". It is therefore a good idea to
use the C<-Home> option, unless you are happy for all these files to be
written in the current directory.

Next, remember to include the C<DB_CREATE> flag when opening the
environment for the first time. A common mistake is to forget to add this
option and then wonder why the application doesn't work.

Finally, it is vital that all processes that are going to access the
database files use the same Berkeley DB environment.


=head2 Safely Updating a Record

One of the main gotchas when using CDS arises when you want to update a
record in a database, i.e. you want to retrieve a record from a database,
modify it in some way and put it back in the database.

For example, say you are writing a web application and you want to keep a
record of the number of times your site is accessed in a Berkeley DB
database. So your code will have a line of code like this (assume, of
course, that C<%hash> has been tied to a Berkeley DB database):

    $hash{Counter} ++ ;

That may look innocent enough, but there is a race condition lurking in
there. If I rewrite the line of code using the low-level Berkeley DB API,
which is what will actually be executed, the race condition may be more
apparent:

    $db->db_get("Counter", $value);
    ++ $value ;
    $db->db_put("Counter", $value);

Consider what happens behind the scenes when you execute the commands
above. Firstly, the existing value for the key "Counter" is fetched from
the database using C<db_get>. A read lock will be used for this part of the
update. The value is then incremented, and the new value is written back
to the database using C<db_put>. This time a write lock will be used.

Here's the problem - there is nothing to stop two (or more) processes
executing the read part at the same time. Remember multiple processes can
hold a read lock on the database at the same time. So both will fetch the
same value, let's say 7, from the database. Both increment the value to 8
and attempt to write it to the database. Berkeley DB will ensure that only
one of the processes gets a write lock, while the other will be blocked. So
the process that happened to get the write lock will store the value 8 to
the database and release the write lock. Now the other process will be
unblocked, and it too will write the value 8 to the database. The result,
in this example, is that we have missed a hit in the counter.

To deal with this kind of scenario, you need to make the update atomic.
A convenience method, called C<cds_lock>, is supplied with the BerkeleyDB
module for this purpose. Using C<cds_lock>, the counter update code can now
be rewritten thus:

    my $lk = $db->cds_lock() ;
    $hash{Counter} ++ ;
    $lk->cds_unlock;

or this, where scoping is used to limit the lifetime of the lock object

    {
        my $lk = $db->cds_lock() ;
        $hash{Counter} ++ ;
    }

Similarly, C<cds_lock> can be used with the native Berkeley DB API:

    my $lk = $db->cds_lock() ;
    $db->db_get("Counter", $value);
    ++ $value ;
    $db->db_put("Counter", $value);
    $lk->cds_unlock;


The C<cds_lock> method will ensure that the current process has exclusive
access to the database until the lock is released, either explicitly via
C<< $lk->cds_unlock() >> or by the lock object being destroyed.

If you are interested, all that C<cds_lock> does is open a "write" cursor.
This has the useful side-effect of holding a write-lock on the database
until the cursor is destroyed. This is how you create a write-cursor:

    $cursor = $db->db_cursor(DB_WRITECURSOR);

If you have instantiated multiple C<cds_lock> objects for one database
within a single process, that process will hold a write-lock on the
database until I<ALL> C<cds_lock> objects have been destroyed.

As with all write-cursors, you should try to limit the scope of the
C<cds_lock> to as short a time as possible. Remember the complete database
will be locked to other processes whilst the write lock is in place.

=head2 Cannot write with a read cursor while a write cursor is active

This issue is easier to demonstrate with an example, so consider the code
below. The intention of the code is to increment the values of all the
elements in a database by one.

    # Assume $db is a database opened in a CDS environment.

    # Create a write-lock
    my $lock = $db->db_cursor(DB_WRITECURSOR);
    # or
    # my $lock = $db->cds_lock();


    my $cursor = $db->db_cursor();

    # Now loop through the database, and increment
    # each value using c_put.
    while ($cursor->c_get($key, $value, DB_NEXT) == 0)
    {
         $cursor->c_put($key, $value+1, DB_CURRENT) == 0
             or die "$BerkeleyDB::Error\n";
    }


When this code is run, it will fail on the C<c_put> line with this error:

    Write attempted on read-only cursor

The read cursor has automatically disallowed a write operation to prevent a
deadlock.


So the rule is -- you B<CANNOT> carry out a write operation using a
read-only cursor (i.e. you cannot use C<c_put> or C<c_del>) whilst another
write-cursor is already active.

The workaround for this issue is to just use C<db_put> instead of C<c_put>,
like this

    # Assume $db is a database opened in a CDS environment.

    # Create a write-lock
    my $lock = $db->db_cursor(DB_WRITECURSOR);
    # or
    # my $lock = $db->cds_lock();


    my $cursor = $db->db_cursor();

    # Now loop through the database, and increment
    # each value using db_put.
    while ($cursor->c_get($key, $value, DB_NEXT) == 0)
    {
         $db->db_put($key, $value+1) == 0
             or die "$BerkeleyDB::Error\n";
    }



=head2 Implicit Cursors

All Berkeley DB cursors will hold either a read lock or a write lock on the
database for the existence of the cursor. In order to prevent blocking of
other processes you need to make sure that they are not long lived.

There are a number of instances where the Perl interface to Berkeley DB
will create a cursor behind the scenes without you being aware of it. Most
of these are very short-lived and will not affect the running of your
script, but there are a few notable exceptions.

Consider this snippet of code

    while (my ($k, $v) = each %hash)
    {
        # do something
    }


To implement the "each" functionality, a read cursor will be created behind
the scenes to allow you to iterate through the tied hash, C<%hash>. While
that cursor is still active, a read lock will obviously be held against the
database. If your application has any other writing processes, these will
be blocked until the read cursor is closed. That won't happen until the
loop terminates.

To avoid blocking problems, only keep cursors open as long as they are
needed. The same is true when you use the C<db_cursor> method or the
C<cds_lock> method.


The locking behaviour of the C<values> or C<keys> functions, shown below,
is subtly different.

    foreach my $k (keys %hash)
    {
        # do something
    }

    foreach my $v (values %hash)
    {
        # do something
    }


Just as in the C<each> function, a read cursor will be created to iterate
over the database in both of these cases. Where C<keys> and C<values>
differ is the place where the cursor carries out the iteration through the
database. Whilst C<each> carries out a single iteration every time it is
invoked, the C<keys> and C<values> functions will iterate through the
entire database in one go -- the complete database will be read into memory
before the first iteration of the loop.

Apart from the fact that a read lock will be held for the amount of time
required to iterate through the database, the use of C<keys> and C<values>
is B<not> recommended because it will result in the complete database being
read into memory.


=head2 Avoiding Deadlock with multiple databases

If your CDS application uses multiple database files, and you need to write
to more than one of them, you need to be careful you don't create a
deadlock.

For example, say you have two databases, D1 and D2, and two processes, P1
and P2. Assume you want to write a record to each database. If P1 writes
the records to the databases in the order D1, D2 while process P2 writes
the records in the order D2, D1, there is the potential for a deadlock to
occur.

This scenario can be avoided by either always acquiring the write locks in
exactly the same order in your application code, or by using the
C<DB_CDB_ALLDB> flag when opening the environment. This flag will make a
write-lock apply to all the databases in the environment.

Add example here

=head1 DBM Filters

A DBM Filter is a piece of code that is used when you I<always>
want to make the same transformation to all keys and/or values in a DBM
database. All of the database classes (BerkeleyDB::Hash,
BerkeleyDB::Btree and BerkeleyDB::Recno) support DBM Filters.

An example is when you need to encode your data in UTF-8 before writing to
the database and then decode the UTF-8 when reading from the database file.

There are two ways to use a DBM Filter.

=over 5

=item 1.

Using the low-level API defined below.

=item 2.

Using the L<DBM_Filter> module.
This module hides the complexity of the API defined below and comes
with a number of "canned" filters that cover some of the common use-cases.

=back

Use of the L<DBM_Filter> module is recommended.

=head2 DBM Filter Low-level API

There are four methods associated with DBM Filters. All work
identically, and each is used to install (or uninstall) a single DBM
Filter. Each expects a single parameter, namely a reference to a sub.
The only difference between them is the place that the filter is
installed.

To summarise:

=over 5

=item B<filter_store_key>

If a filter has been installed with this method, it will be invoked
every time you write a key to a DBM database.

=item B<filter_store_value>

If a filter has been installed with this method, it will be invoked
every time you write a value to a DBM database.

=item B<filter_fetch_key>

If a filter has been installed with this method, it will be invoked
every time you read a key from a DBM database.

=item B<filter_fetch_value>

If a filter has been installed with this method, it will be invoked
every time you read a value from a DBM database.

=back

You can use any combination of the methods, from none, to all four.

All filter methods return the existing filter, if present, or C<undef>
if not.

To delete a filter, pass C<undef> to the method that installed it.

=head2 The Filter

When each filter is called by Perl, a local copy of C<$_> will contain
the key or value to be filtered. Filtering is achieved by modifying
the contents of C<$_>. The return code from the filter is ignored.

=head2 An Example -- the NULL termination problem.

Consider the following scenario. You have a DBM database that you need
to share with a third-party C application. The C application assumes
that I<all> keys and values are NULL terminated. Unfortunately when
Perl writes to DBM databases it doesn't use NULL termination, so your
Perl application will have to manage NULL termination itself. When you
write to the database you will have to use something like this:

    $hash{"$key\0"} = "$value\0" ;

Similarly the NULL needs to be taken into account when you are considering
the length of existing keys/values.

It would be much better if you could ignore the NULL termination issue
in the main application code and have a mechanism that automatically
added the terminating NULL to all keys and values whenever you write to
the database and have them removed when you read from the database. As I'm
sure you have already guessed, this is a problem that DBM Filters can
fix very easily.

    use strict ;
    use BerkeleyDB ;

    my %hash ;
    my $filename = "filt.db" ;
    unlink $filename ;

    my $db = tie %hash, 'BerkeleyDB::Hash',
                -Filename   => $filename,
                -Flags      => DB_CREATE
      or die "Cannot open $filename: $!\n" ;

    # Install DBM Filters
    $db->filter_fetch_key  ( sub { s/\0$//    } ) ;
    $db->filter_store_key  ( sub { $_ .= "\0" } ) ;
    $db->filter_fetch_value( sub { s/\0$//    } ) ;
    $db->filter_store_value( sub { $_ .= "\0" } ) ;

    $hash{"abc"} = "def" ;
    my $a = $hash{"abc"} ;
    # ...
    undef $db ;
    untie %hash ;

The contents of each of the filters should be self-explanatory. Both
"fetch" filters remove the terminating NULL, and both "store" filters add
a terminating NULL.


=head2 Another Example -- Key is a C int.

Here is another real-life example. By default, whenever Perl writes to
a DBM database it always writes the key and value as strings. So when
you use this:

    $hash{12345} = "something" ;

the key 12345 will get stored in the DBM database as the 5-byte string
"12345". If you actually want the key to be stored in the DBM database
as a C int, you will have to use C<pack> when writing, and C<unpack>
when reading.
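
To see what C<pack> and C<unpack> do in isolation, here is a minimal
sketch that needs no database. The variable names are illustrative only,
and the packed length depends on the size of a C int on your platform, so
no particular byte count is assumed.

    use strict ;
    use warnings ;

    my $key    = 12345 ;
    my $packed = pack("i", $key) ;        # native signed C int, raw bytes
    my $back   = unpack("i", $packed) ;   # back to a Perl number

    printf "string key is %d bytes, packed key is %d bytes\n",
        length("$key"), length($packed) ;
    print "round trip ok\n" if $back == $key ;
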

Here is a DBM Filter that does the packing and unpacking:

    use strict ;
    use BerkeleyDB ;
    my %hash ;
    my $filename = "filt.db" ;
    unlink $filename ;


    my $db = tie %hash, 'BerkeleyDB::Btree',
                -Filename   => $filename,
                -Flags      => DB_CREATE
      or die "Cannot open $filename: $!\n" ;

    $db->filter_fetch_key  ( sub { $_ = unpack("i", $_) } ) ;
    $db->filter_store_key  ( sub { $_ = pack ("i", $_) } ) ;
    $hash{123} = "def" ;
    # ...
    undef $db ;
    untie %hash ;

This time only two filters have been used -- we only need to manipulate
the contents of the key, so it wasn't necessary to install any value
filters.

=head1 Using BerkeleyDB with MLDBM

Both BerkeleyDB::Hash and BerkeleyDB::Btree can be used with the MLDBM
module. The code fragment below shows how to associate MLDBM with
BerkeleyDB::Btree. To use BerkeleyDB::Hash just replace
BerkeleyDB::Btree with BerkeleyDB::Hash.

    use strict ;
    use BerkeleyDB ;
    use MLDBM qw(BerkeleyDB::Btree) ;
    use Data::Dumper;

    my $filename = 'testmldbm' ;
    my %o ;

    unlink $filename ;
    tie %o, 'MLDBM', -Filename => $filename,
                     -Flags    => DB_CREATE
                    or die "Cannot open database '$filename': $!\n";

See the MLDBM documentation for information on how to use the module
and for details of its limitations.

=head1 EXAMPLES

TODO.

=head1 HINTS & TIPS

=head2 Sharing Databases With C Applications

There is no technical reason why a Berkeley DB database cannot be
shared by both a Perl and a C application.

The vast majority of problems that are reported in this area boil down
to the fact that C strings are NULL terminated, whilst Perl strings
are not. See L<An Example -- the NULL termination problem.> in the DBM
FILTERS section for a generic way to work around this problem.


=head2 The untie Gotcha

TODO

=head1 COMMON QUESTIONS

This section attempts to answer some of the more common questions that
I get asked.


=head2 Relationship with DB_File

Before Berkeley DB 2.x was written there was only one Perl module that
interfaced to Berkeley DB. That module is called B<DB_File>. Although
B<DB_File> can be built with Berkeley DB 1.x, 2.x, 3.x or 4.x, it only
provides an interface to the functionality available in Berkeley DB
1.x. That means that it doesn't support transactions, locking or any of
the other new features available in DB 2.x or better.

=head2 How do I store Perl data structures with BerkeleyDB?

See L<Using BerkeleyDB with MLDBM>.

=head1 HISTORY

See the Changes file.

=head1 AVAILABILITY

The most recent version of B<BerkeleyDB> can always be found
on CPAN (see L<perlmod/CPAN> for details), in the directory
F<modules/by-module/BerkeleyDB>.

The official web site for Berkeley DB is F<http://www.oracle.com/technology/products/berkeley-db/db/index.html>.

=head1 COPYRIGHT

Copyright (c) 1997-2018 Paul Marquess. All rights reserved. This program
is free software; you can redistribute it and/or modify it under the
same terms as Perl itself.

Although B<BerkeleyDB> is covered by the Perl license, the library it
makes use of, namely Berkeley DB, is not. Berkeley DB has its own
copyright and its own license. Please take the time to read it.

Here are a few words taken from the Berkeley DB FAQ (at
F<http://www.oracle.com/technology/products/berkeley-db/db/index.html>) regarding the license:

    Do I have to license DB to use it in Perl scripts?

    No. The Berkeley DB license requires that software that uses
    Berkeley DB be freely redistributable. In the case of Perl, that
    software is Perl, and not your scripts.
    Any Perl scripts that you
    write are your property, including scripts that make use of Berkeley
    DB. Neither the Perl license nor the Berkeley DB license
    place any restriction on what you may do with them.

If you are in any doubt about the license situation, contact either the
Berkeley DB authors or the author of BerkeleyDB.
See L<"AUTHOR"> for details.


=head1 AUTHOR

Paul Marquess E<lt>pmqs@cpan.orgE<gt>.


=head1 SEE ALSO

perl(1), DB_File, Berkeley DB.

=cut