Copyright (c) 1990 The Regents of the University of California.
All rights reserved.

%sccs.include.redist.man%

@(#)dbopen.3 5.1 (Berkeley) 08/27/90

DB 3 ""
C 7
NAME
btree_open, flat_open, hash_open - database manipulation routines
SYNOPSIS
#include <db.h>

DB *
btree_open(const char *file, int flags, int mode, const BTREEINFO * private);

DB *
flat_open(const char *file, int flags, int mode, const FLATINFO * private);

DB *
hash_open(const char *file, int flags, int mode, const HASHINFO * private);
DESCRIPTION
Btree_open, flat_open, and hash_open are interfaces, respectively, to database files in btree, flat, and hashed record formats. Access to all file types is based on key/data pairs, where both keys and data are of essentially unlimited size.

Each routine opens file for reading and/or writing. The flags and mode arguments are as specified to the open(2) routine; however, only the O_CREAT, O_EXCL, O_RDONLY, O_RDWR, O_TRUNC and O_WRONLY flags are meaningful. Databases which are temporary, i.e. not intended to be preserved on disk, may be created by setting the file parameter to NULL. The argument private is a pointer to a private, access-method specific structure described below.
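
As a rough illustration of the flag restriction, the following sketch reduces an open(2)-style flag word to the six flags named above. The helper db_meaningful_flags is hypothetical and is not part of the library. Whether the open routines ignore other flags or reject them is not stated here, so the masking itself is an assumption.

```c
#include <assert.h>
#include <fcntl.h>

/*
 * Hypothetical helper (not part of the db library): keep only the
 * open(2) flags that this manual page describes as meaningful.
 */
int
db_meaningful_flags(int flags)
{
	const int mask =
	    O_CREAT | O_EXCL | O_RDONLY | O_RDWR | O_TRUNC | O_WRONLY;

	return (flags & mask);
}
```

For example, passing O_RDWR | O_CREAT | O_APPEND would leave O_RDWR | O_CREAT.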

The open routines return a pointer to a structure representing the database on success and NULL on error. This structure is as follows:

typedef struct {
	void *internal;
	int (*close)(), (*delete)(), (*get)(), (*put)(), (*seq)(), (*sync)();
} DB;

The elements of this structure are as follows:

internal A pointer to an internal structure private to the access method.

close A pointer to a routine to flush any cached information to disk, free any allocated resources, and close the database file, whose function prototype is: close(const DB *db); Since key/data pairs may be cached in memory, failing to close the file with the close routine may result in inconsistent or lost information. The close routine returns 0 on error and 1 on success.

delete A pointer to a routine to remove key/data pairs from the database, whose function prototype is: delete(const DB *db, const VALUE *key); The delete routine returns 0 on error, 1 on success, and -1 if the specified key was not in the file.

get A pointer to a routine which is the interface for keyed retrieval from the database, whose function prototype is: get(const DB *db, const VALUE *key, VALUE *data); The address and length of the data associated with the specified key are returned in the structure referenced by data. The get routine returns 0 on error, 1 on success, and -1 if the key was not in the file.

put A pointer to a routine to store key/data pairs in the database, whose function prototype is: put(const DB *db, const VALUE *key, const VALUE *data, u_long flag); The parameter flag, if set, should be either R_APPEND or R_INSERT, optionally or'ed with R_NOOVERWRITE.

R_APPEND Append the data immediately after the data referenced by key, creating a new record. (This implies that the access method is able to create new keys itself, i.e. the keys are ordered and independent, for example, record numbers. Currently applicable only to the flat file access method.)

R_INSERT Insert the data immediately before the data referenced by key, creating a new record. (This implies that the access method is able to create new keys itself, i.e. the keys are ordered and independent, for example, record numbers. Currently applicable only to the flat file access method.)

R_NOOVERWRITE Enter the new key/data pair only if the key does not already exist.

The put routine returns 0 on error, 1 on success, and -1 if the R_NOOVERWRITE flag is set and the key already exists in the file.

seq A pointer to a routine which is the interface for sequential retrieval from the database, whose function prototype is: seq(const DB *db, VALUE *key, VALUE *data, int flag); The address and length of the key are returned in the structure referenced by key, and the address and length of the data are returned in the structure referenced by data.

The flag value, if set, should be one of the following values:

R_FIRST The first key of the database is returned.

R_LAST The last key of the database is returned.

R_NEXT Retrieve the record immediately after the most recently requested record.

R_PREV Retrieve the record immediately before the most recently requested record.

The first time the seq routine is called, the first record of the database is returned if flag is not set or is set to R_FIRST or R_NEXT.

The seq routine returns 0 on error, 1 on success, -1 if end-of-file is reached, and -2 if the input is a character device and no complete records are available.

sync A pointer to a routine to flush any cached information to disk, whose function prototype is: sync(const DB *db); If the database is in memory only, the sync routine is a no-op. The sync routine returns 0 on error and 1 on success.

Each of the routines takes a pointer to a structure as returned by the open routine, one or more pointers to key/data structures, and, optionally, a flag value.
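
The flag rules and return values described for seq can be modelled as a cursor over record indices. The sketch below is only a model: the names SEQ_* and seq_step and the numeric flag values are invented for illustration, and the real R_* constants come from <db.h>. Two behaviours are assumptions that the text does not settle: an unset flag after the first call is treated like R_NEXT, and a first call with R_PREV starts at the last record.

```c
#include <assert.h>

/* Stand-in flag values; the real R_* constants come from <db.h>. */
enum { SEQ_UNSET = 0, SEQ_FIRST, SEQ_LAST, SEQ_NEXT, SEQ_PREV };

/* Cursor over record indices 0..nrecs-1; pos is -1 before the first call. */
struct seq_cursor {
	int nrecs;
	int pos;
};

/*
 * Move the cursor according to flag.  Returns 1 and stores the selected
 * index in *out on success, -1 at end-of-file, and 0 on error, mirroring
 * the return values described for seq.
 */
int
seq_step(struct seq_cursor *c, int flag, int *out)
{
	int next;

	if (c == 0 || out == 0)
		return (0);
	switch (flag) {
	case SEQ_FIRST:
		next = 0;
		break;
	case SEQ_LAST:
		next = c->nrecs - 1;
		break;
	case SEQ_UNSET:		/* assumption: behaves like SEQ_NEXT */
	case SEQ_NEXT:
		/* On the first call pos is -1, so this selects record 0. */
		next = c->pos + 1;
		break;
	case SEQ_PREV:
		/* Assumption: a first call with SEQ_PREV starts at the end. */
		next = (c->pos < 0) ? c->nrecs - 1 : c->pos - 1;
		break;
	default:
		return (0);
	}
	if (next < 0 || next >= c->nrecs)
		return (-1);	/* end-of-file; the cursor does not move */
	c->pos = next;
	*out = next;
	return (1);
}
```

A full traversal under this model starts with SEQ_FIRST and repeats SEQ_NEXT for as long as the return value is 1.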

Keys (and data) are represented by the following data structure:

typedef struct {
	u_char *data;
	size_t size;
} ENTRY;

The elements of this structure are as follows:

data A pointer to a byte string.

size The length of the byte string.
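
To make the calling convention concrete, here is a toy in-memory store that follows the documented return values for get, put and delete (0 on error, 1 on success, -1 for a missing key or a refused overwrite). Everything in it is hypothetical: toy_entry stands in for the key/data structure above, TOY_NOOVERWRITE stands in for R_NOOVERWRITE, and the routines take a non-const store pointer because this sketch modifies the store directly. It is not the library's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the key/data structure above (not the <db.h> type). */
typedef struct {
	unsigned char *data;
	size_t size;
} toy_entry;

#define TOY_NOOVERWRITE	0x1UL	/* stand-in for R_NOOVERWRITE */
#define TOY_MAX		16

/* Toy store; it keeps the caller's pointers and copies no bytes. */
typedef struct toy_db {
	toy_entry keys[TOY_MAX], vals[TOY_MAX];
	int n;
	int (*get)(const struct toy_db *, const toy_entry *, toy_entry *);
	int (*put)(struct toy_db *, const toy_entry *, const toy_entry *,
	    unsigned long);
	int (*del)(struct toy_db *, const toy_entry *);
} toy_db;

/* Return the slot holding key, or -1 if it is absent. */
static int
toy_find(const toy_db *db, const toy_entry *key)
{
	int i;

	for (i = 0; i < db->n; i++)
		if (db->keys[i].size == key->size &&
		    memcmp(db->keys[i].data, key->data, key->size) == 0)
			return (i);
	return (-1);
}

static int
toy_get(const toy_db *db, const toy_entry *key, toy_entry *data)
{
	int i = toy_find(db, key);

	if (i < 0)
		return (-1);		/* key not in the store */
	*data = db->vals[i];		/* address and length of the data */
	return (1);
}

static int
toy_put(toy_db *db, const toy_entry *key, const toy_entry *data,
    unsigned long flag)
{
	int i = toy_find(db, key);

	if (i >= 0) {
		if (flag & TOY_NOOVERWRITE)
			return (-1);	/* key exists, overwrite refused */
		db->vals[i] = *data;
		return (1);
	}
	if (db->n == TOY_MAX)
		return (0);		/* store is full: error */
	db->keys[db->n] = *key;
	db->vals[db->n] = *data;
	db->n++;
	return (1);
}

static int
toy_del(toy_db *db, const toy_entry *key)
{
	int i = toy_find(db, key);

	if (i < 0)
		return (-1);		/* key not in the store */
	db->n--;
	db->keys[i] = db->keys[db->n];	/* move the last pair into the gap */
	db->vals[i] = db->vals[db->n];
	return (1);
}

/* Describe a C string as a byte string; the NUL is not counted. */
toy_entry
toy_str(const char *s)
{
	toy_entry e;

	e.data = (unsigned char *)s;
	e.size = strlen(s);
	return (e);
}

void
toy_init(toy_db *db)
{
	db->n = 0;
	db->get = toy_get;
	db->put = toy_put;
	db->del = toy_del;
}
```

Callers reach the routines through the structure, for example db.put(&db, &key, &data, TOY_NOOVERWRITE), which is the same shape of call the DB structure implies.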

BTREE
One of the access methods is a btree: a sorted, balanced tree structure with associated key and data pairs.

<Mike fill this in?>

The private data structure provided to btree_open is as follows:

typedef struct {
	u_long flags;
	int cachesize;
	int pagesize;
} BTREEINFO;

The elements of this structure are as follows:

flags The flag value is specified by or'ing the following values:

R_SMALLCACHE A flag informing the routines that they are not expected to be the primary data cache, and to minimize any caching they do.

cachesize

pagesize

HASH
One of the access methods is hashed access and storage. The private data structure provided to hash_open is as follows:

typedef struct {
	u_long flags;
	int bsize;
	int ffactor;
	int nelem;
	u_long (*hash)(const void *, const size_t);
} HASHINFO;

The elements of this structure are as follows:

flags The flag value is specified by or'ing the following values:

R_SMALLCACHE A flag informing the routines that they are not expected to be the primary cache, and to minimize any caching they do.

bsize Bsize defines the hash table bucket size and is, by default, 1024 bytes. For tables with large data items, it may be preferable to increase the page size, and, conversely, applications doing exclusively in-memory hashing may want to use a very small bucket size, for example, 256, to minimize hash chain collisions.

ffactor Ffactor indicates a desired density within the hash table. It is an approximation of the number of keys allowed to accumulate in any one bucket, determining when the hash table grows or shrinks. The default value is 5.

hash Hash is a user-defined hash function. Since no hash function performs equally well on all possible data, the user may find that the built-in hash function does poorly on a particular data set. Any user-specified hash function should take two arguments, a pointer to a byte string and a length, and return an unsigned long to be used as the hash value.

nelem Nelem is an estimate of the final size of the hash table. If not set, the default value is 1. If not set or set too low, hash tables will expand gracefully as keys are entered, although a slight performance degradation may be noticed.

If the hash table already exists, and the O_TRUNC flag is not specified to hash_open, the parameters bsize, ffactor, and nelem are ignored.

If a hash function is specified, hash_open will attempt to determine if the hash function specified is the same as the one with which the database was created, and will fail if it is not.
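
A user-defined hash function only has to match the two-argument shape given for the hash element. The sketch below uses 32-bit FNV-1a purely as an example; it is not the library's built-in function. The structure hashinfo_sketch mirrors the HASHINFO layout shown above but is a stand-in, and hashinfo_for is a hypothetical helper. Whether a zero field selects the documented default is not stated here, so the sketch spells the defaults out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Example user-defined hash: 32-bit FNV-1a over len bytes.
 * Chosen for illustration only; not the library's built-in hash.
 */
unsigned long
fnv1a_hash(const void *bytes, const size_t len)
{
	const unsigned char *p = bytes;
	uint32_t h = 2166136261u;	/* FNV offset basis */
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;		/* FNV prime */
	}
	return ((unsigned long)h);
}

/* Stand-in mirroring the HASHINFO layout shown above. */
struct hashinfo_sketch {
	unsigned long flags;
	int bsize;
	int ffactor;
	int nelem;
	unsigned long (*hash)(const void *, const size_t);
};

/* Hypothetical helper: parameters for a table of about n keys. */
struct hashinfo_sketch
hashinfo_for(int n)
{
	struct hashinfo_sketch info;

	info.flags = 0;
	info.bsize = 1024;	/* documented default bucket size */
	info.ffactor = 5;	/* documented default density */
	info.nelem = n;		/* estimate of the final table size */
	info.hash = fnv1a_hash;
	return (info);
}
```

Because hash_open is described as failing when the supplied function differs from the one the table was created with, an application that adopts its own function would need to pass the same one on every open.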

Backward compatible interfaces to the routines described in dbm(3), hsearch(3), and ndbm(3) are provided as part of the compatibility library, "libcompat.a".

FLAT FILES
One of the access methods is flat files of either variable-length or fixed-length records, the former delimited by a specific byte value. The private data structure provided to flat_open is as follows:

typedef struct {
	u_long flags;
	int cachesize;
	size_t reclen;
	u_char bval;
} FLATINFO;

The elements of this structure are as follows:

flags The flag value is specified by or'ing the following values:

R_FIXEDLEN The records are fixed-length, not byte delimited. The structure element reclen specifies the length of the record, and the structure element bval is used as the pad character.

R_SMALLCACHE A flag informing the routines that they are not expected to be the primary cache, and to minimize any caching they do.

cachesize The amount of memory to be used as a data cache.

reclen The length of a fixed-length record.

bval The delimiting byte to be used to mark the end of a record for variable-length records, and the pad character for fixed-length records.
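
The two roles of bval can be illustrated with a pair of small helpers. Both are hypothetical and exist only for this sketch: pad_record lays out one fixed-length record padded with bval, and vlen_record_length finds the end of one variable-length record delimited by bval. How the library treats input longer than reclen is not described here, so refusing it is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: write src into dst as one fixed-length record of
 * reclen bytes, filling the unused tail with the pad byte bval.
 * Returns 1 on success and 0 if src does not fit (an assumption).
 */
int
pad_record(unsigned char *dst, size_t reclen,
    const unsigned char *src, size_t srclen, unsigned char bval)
{
	if (srclen > reclen)
		return (0);
	memcpy(dst, src, srclen);
	memset(dst + srclen, bval, reclen - srclen);
	return (1);
}

/*
 * Hypothetical helper: length of the variable-length record starting at
 * buf, i.e. the number of bytes before the first delimiter bval, or len
 * if no delimiter occurs within the buffer.
 */
size_t
vlen_record_length(const unsigned char *buf, size_t len, unsigned char bval)
{
	const unsigned char *end = memchr(buf, bval, len);

	return (end != NULL ? (size_t)(end - buf) : len);
}
```

With reclen 4 and a space as bval, the two bytes "ab" would be stored as "ab" followed by two spaces; with a newline as bval, the buffer "one\ntwo\n" begins with a record of length 3.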

Variable-length and fixed-length data files require key structures to reference a byte followed by three unsigned longs. The byte is a flag value which indicates the validity of the other fields, and the numbers are used as a record number, a byte offset and a record length, respectively. These access methods do no validity checking as to the correctness of any of these values, nor are they constrained to use the values provided. If any of the record number, byte offset or record length is not specified by the calling routine, and the record retrieval is successful, the correct values are copied into the caller's key structure. The flag value is specified by or'ing the following values:

R_LENGTH The record length is valid.

R_OFFSET The byte offset is valid.

R_RECNO The record number is valid.
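
One possible reading of "a byte followed by three unsigned longs" is sketched below. The structure flat_key_sketch, its field names, the K_* values and both helpers are assumptions made for illustration; the real constants come from <db.h>, and the library's actual layout (including any padding) is not given in this page.

```c
#include <assert.h>

/* Stand-in values; the real R_* constants come from <db.h>. */
#define K_LENGTH	0x1
#define K_OFFSET	0x2
#define K_RECNO		0x4

/* Assumed layout: a validity byte followed by three unsigned longs. */
struct flat_key_sketch {
	unsigned char valid;	/* or of the K_* bits */
	unsigned long recno;	/* record number */
	unsigned long offset;	/* byte offset */
	unsigned long length;	/* record length */
};

/* Build a key that names a record by number only. */
struct flat_key_sketch
key_by_recno(unsigned long recno)
{
	struct flat_key_sketch k = { 0, 0, 0, 0 };

	k.valid = K_RECNO;
	k.recno = recno;
	return (k);
}

/*
 * Report which fields the caller left unspecified.  Per the text above,
 * these are the ones that get filled in after a successful retrieval.
 */
int
key_missing_fields(const struct flat_key_sketch *k)
{
	return ((K_LENGTH | K_OFFSET | K_RECNO) & ~k->valid);
}
```

A key built with key_by_recno marks only the record number as valid, leaving the offset and length to be supplied by the access method.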

ERRORS
The open routines may fail and set errno for any of the errors specified for the library routines open(2) and malloc(3) or the following:

[EINVAL] A parameter has been specified (hash function, pad byte, etc.) that is incompatible with the current file specification, or there is a mismatch between the version number of the file and the software.

The get routines may fail and set errno for any of the errors specified for the library routine malloc(3).

The close routines may fail and set errno for any of the errors specified for the library routines close(2), free(3), or fsync(2).

The sync routines may fail and set errno for any of the errors specified for the library routine fsync(2).