GNU dbm is a library of routines that manages data files containing key/data pairs. It provides
storage, retrieval, and deletion by key, and a non-sorted traversal of all keys. A
process is allowed to use multiple data files at the same time.
Opening a database
A process that opens a gdbm file is designated as a "reader" or a "writer". Only one writer may open a
gdbm file, whereas many readers may open it simultaneously. Readers and writers cannot open the gdbm file
at the same time. The procedure for opening a gdbm file is:
GDBM_FILE gdbm_open(const char *name, int block_size, int flags, int mode, void (*fatal_func)(const char *));
Name is the name of the file (the complete name; gdbm does not append any characters to this name).
Block_size is the size of a single transfer from disk to memory. If the value is less than 512, the file
system block size is used instead. The size is adjusted so that the block can hold an exact number of
directory entries, so the effective block size can be slightly greater than requested. This
adjustment is disabled if the GDBM_BSEXACT flag is used.
The flags parameter is a bitmask, composed of the access mode and zero or more modifier flags. The
access mode bit designates the process as a reader or writer and must be one of the following:
GDBM_READER
reader
GDBM_WRITER
writer
GDBM_WRCREAT
writer - if database does not exist create new one
GDBM_NEWDB
writer - create new database regardless if one exists
Additional flags (modifiers) can be combined with these values by bitwise OR. Not all of them are
meaningful with all access modes.
Flags that are valid for any value of access mode are:
GDBM_CLOEXEC
Set the close-on-exec flag on the database file descriptor.
GDBM_NOLOCK
Prevents the library from performing any locking on the database file.
GDBM_NOMMAP
Instructs gdbm_open to disable the use of mmap(2).
GDBM_PREREAD
When mapping a GDBM file to memory, read its contents immediately, instead of when needed (prefault
reading). This can be advantageous if you open a read-only database and are going to do a lot of
look-ups on it. In this case the entire database will be read at once and searches will operate on an
in-memory copy. In contrast, GDBM_PREREAD should not be used if you open a database (even in
read-only mode) only to retrieve a couple of keys.
Finally, never use GDBM_PREREAD when opening a database for updates, especially for inserts: this
will degrade performance.
This flag has no effect if GDBM_NOMMAP is given, or if the operating system does not support
prefault reading. It is known to work on Linux and FreeBSD kernels.
GDBM_XVERIFY
Enable additional consistency checks. With this flag, any corruption of the database is
discovered when opening it, instead of when a corrupted structure is read during normal operation.
However, on large databases, it can slow down the opening process.
The following additional flags are valid when the database is opened for writing (GDBM_WRITER,
GDBM_WRCREAT, or GDBM_NEWDB):
GDBM_SYNC
Causes all database operations to be synchronized to the disk.
NOTE: this option entails severe performance degradation and does not necessarily ensure that the
resulting database state is consistent, therefore we discourage its use. For a discussion of how
to ensure database consistency with minimal performance overhead, see CRASH TOLERANCE below.
GDBM_FAST
A reverse of GDBM_SYNC: synchronize writes only when needed. This is the default. This flag is
provided only for compatibility with previous versions of GDBM.
The following flags can be used together with GDBM_NEWDB. They also take effect when used with
GDBM_WRCREAT, if the requested database file doesn't exist:
GDBM_BSEXACT
If this flag is set and the requested block_size value cannot be used, gdbm_open will refuse to
create the database. In this case it will set the gdbm_errno variable to GDBM_BLOCK_SIZE_ERROR
and return NULL.
Without this flag, gdbm_open will silently adjust the block_size to a usable value, as described
above.
GDBM_NUMSYNC
Create new database in extended database format, a format best suited for effective crash
recovery. For a detailed discussion, see the CRASH RECOVERY chapter below.
Mode is the file mode (see chmod(2) and open(2)). It is used if the file is created.
Fatal_func is a function to be called when gdbm encounters a fatal error. This parameter is
deprecated and must always be NULL.
The return value is the pointer needed by all other routines to access that gdbm file. If the return is
the NULL pointer, gdbm_open was not successful. In this case, the reason for the failure can be found in
the gdbm_errno variable. If the following call returns true (non-zero value):
gdbm_check_syserr(gdbm_errno)
the system errno variable must be examined in order to obtain more detail about the failure.
GDBM_FILE gdbm_fd_open(int FD, const char *name, int block_size, int flags, int mode, void (*fatal_func)(const char *));
This is an alternative entry point to gdbm_open. FD is a valid file descriptor obtained as a result of a
call to open(2) or creat(2). The function opens (or creates) a DBM database this descriptor refers to.
The descriptor is not dup'ed, and will be closed when the returned GDBM_FILE is closed. Use dup(2) if
that is not desirable.
In case of error, the function behaves like gdbm_open and does not close FD. This can be altered by the
following value passed in flags:
GDBM_CLOERROR
Close FD before exiting on error.
The rest of the arguments are the same as for gdbm_open.
Calling convention
All GDBM functions take as their first parameter the database handle (GDBM_FILE), returned from gdbm_open
or gdbm_fd_open.
Any value stored in the GDBM database is described by datum, an aggregate type defined as:
    typedef struct
    {
      char *dptr;
      int   dsize;
    } datum;
The dptr field points to the actual data. Its type is char* for historical reasons. Actually it should
have been typed void*. Programmers are free to store data of arbitrary complexity, both scalar and
aggregate, in this field.
The dsize field contains the number of bytes stored in dptr.
The datum type is used to describe both keys and content (values) in the database. Values of this type
can be passed as arguments or returned from GDBM function calls.
GDBM functions that return datum indicate failure by setting its dptr field to NULL.
Functions returning an integer value indicate success by returning 0 and failure by returning a non-zero
value (the only exception to this rule is gdbm_exists, for which the return value is reversed).
If the returned value indicates failure, the gdbm_errno variable contains an integer value indicating
what went wrong. A similar value is associated with the dbf handle and can be accessed using the
gdbm_last_errno function. Immediately after return from a function, both values are exactly equal.
Subsequent GDBM calls with another dbf as argument may alter the value of the global gdbm_errno, but the
value returned by gdbm_last_errno will always indicate the most recent code of an error that occurred for
that particular database. Programmers are encouraged to use such per-database error codes.
Sometimes the actual reason of the failure can be clarified by examining the system errno value. To make
sure its value is meaningful for a given GDBM error code, use the gdbm_check_syserr function. The
function takes error code as argument and returns 1 if the errno is meaningful for that error, or 0 if it
is irrelevant.
Similarly to gdbm_errno, the latest errno value associated with a particular database can be obtained
using the gdbm_last_syserr function.
The gdbm_clear_error function clears the error indicator (both GDBM and system error codes) associated
with a database handle.
Some critical errors leave the database in a structurally inconsistent state. If that happens, all
subsequent GDBM calls accessing that database will fail with the GDBM error code of GDBM_NEED_RECOVERY (a
special function gdbm_needs_recovery is also provided, which returns true if the database handle given as
its argument is structurally inconsistent). To return such databases to a consistent state, use the
gdbm_recover function (see below).
The GDBM_NEED_RECOVERY error cannot be cleared using gdbm_clear_error.
Error functions
This section describes the error handling functions outlined above.
gdbm_error gdbm_last_errno(GDBM_FILE dbf)
Returns the error code of the most recent failure encountered when operating on dbf.
int gdbm_last_syserr(GDBM_FILE dbf)
Returns the value of the system errno variable associated with the most recent failure that
occurred on dbf.
Notice that not all gdbm_error codes have a relevant system error code. Use the following
function to determine whether a given code has one.
int gdbm_check_syserr(gdbm_error err)
Returns 1 if the system errno value should be checked to get more information on the error described
by the GDBM code err.
void gdbm_clear_error(GDBM_FILE dbf)
Clears the error state for the database dbf. This function is called implicitly upon entry to any
GDBM function that operates on GDBM_FILE.
The GDBM_NEED_RECOVERY error cannot be cleared.
int gdbm_needs_recovery(GDBM_FILE dbf)
Returns 1 if the database file dbf is in an inconsistent state and needs recovery.
const char *gdbm_strerror(gdbm_error err)
Returns a textual description of the error code err.
const char *gdbm_db_strerror(GDBM_FILE dbf)
Returns a textual description of the recent error in database dbf. This description includes the
system errno value, if relevant.
Closing the database
It is important that every database file opened is also closed. This is needed to update the
reader/writer count on the file. This is done by:
int gdbm_close(GDBM_FILE dbf);
Database lookups
int gdbm_exists(GDBM_FILE dbf, datum key);
If the key is found within the database, the return value will be true (1). If nothing
appropriate is found, false (0) is returned and gdbm_errno set to GDBM_NO_ERROR.
On error, returns 0 and sets gdbm_errno.
datum gdbm_fetch(GDBM_FILE dbf, datum key);
Dbf is the pointer returned by gdbm_open. Key is the key data.
If the dptr element of the return value is NULL, the gdbm_errno variable should be examined. The
value of GDBM_ITEM_NOT_FOUND means no data was found for that key. Any other value means an error
occurred.
Otherwise the dptr element of the return value points to the found data, and dsize holds its length.
The storage space for the dptr element is allocated using malloc(3). GDBM does not automatically free
this data. It is the programmer's responsibility to free this storage when it is no longer needed.
Iterating over the database
The following two routines allow for iterating over all items in the database. Such iteration is not key
sequential, but it is guaranteed to visit every key in the database exactly once. (The order has to do
with the hash values.)
datum gdbm_firstkey(GDBM_FILE dbf);
Returns the first key in the database.
datum gdbm_nextkey(GDBM_FILE dbf, datum key);
Given a key, returns the database key that follows it. End of iteration is marked by returning
a datum with the dptr field set to NULL and setting the gdbm_errno value to GDBM_ITEM_NOT_FOUND.
After a successful return from either function, dptr points to data allocated by malloc(3). It is the
caller's responsibility to free the data when no longer needed.
A typical iteration loop looks like:
    datum key, nextkey, content;

    key = gdbm_firstkey (dbf);
    while (key.dptr)
      {
        content = gdbm_fetch (dbf, key);
        /* Do something with key and/or content */
        free (content.dptr);    /* fetched data is malloc'ed too */
        nextkey = gdbm_nextkey (dbf, key);
        free (key.dptr);
        key = nextkey;
      }
These functions are intended to visit the database in read-only algorithms. Avoid any database
modifications within the iteration loop. File visiting is based on a hash table. The gdbm_delete and,
in most cases, gdbm_store functions rearrange the hash table to make sure that any collisions in the
table do not leave some item "un-findable". Thus, a call to either of these functions changes the order
in which the keys are visited. Therefore, these functions should not be used when iterating over all the
keys in the database. For example, the following loop is wrong: if it is executed, some keys may not
be visited at all, or may be visited twice:
    key = gdbm_firstkey (dbf);
    while (key.dptr)
      {
        nextkey = gdbm_nextkey (dbf, key);
        if (some condition)
          gdbm_delete (dbf, key);
        free (key.dptr);
        key = nextkey;
      }
Updating the database
int gdbm_store(GDBM_FILE dbf, datum key, datum content, int flag);
Dbf is the pointer returned by gdbm_open. Key is the key data. Content is the data to be
associated with the key. Flag can have one of the following values:
GDBM_INSERT
Insert only, generate an error if key exists;
GDBM_REPLACE
Replace contents if key exists.
The function returns 0 on success and -1 on failure. As a special case, if the key already exists
in the database and the flag is GDBM_INSERT, the function does not modify the database; it sets
gdbm_errno to GDBM_CANNOT_REPLACE and returns 1.
int gdbm_delete(GDBM_FILE dbf, datum key);
Looks up and deletes the given key from the database dbf.
The return value is 0 if there was a successful delete or -1 on error. In the latter case, the
gdbm_errno value GDBM_ITEM_NOT_FOUND indicates that the key is not present in the database. Other
gdbm_errno values indicate failure.
Recovering structural consistency
If a function leaves the database in a structurally inconsistent state, it can be recovered using the
gdbm_recover function.
int gdbm_recover(GDBM_FILE dbf, gdbm_recovery *rcvr, int flags)
Check the database file dbf and fix any inconsistencies found. The rcvr argument can be used both
to control the recovery and to return additional statistics about the process, as indicated by
flags. For a detailed discussion of these arguments and their usage, see the GDBM Manual, chapter
Recovering structural consistency.
You can pass NULL as rcvr and 0 as flags, if no such control is needed.
By default, this function first checks the database for inconsistencies and attempts recovery only
if some were found. The special flags bit GDBM_RCVR_FORCE instructs gdbm_recover to skip this
check and to perform database recovery unconditionally.
Export and import
GDBM database files can be exported (dumped) to so-called flat files or imported (loaded) from them. A
flat file contains exactly the same data as the original database, but it cannot be used for searches or
updates. Its purpose is to keep the data from the database for restoring it when the need arises. As
such, flat files are used for backup purposes, and for sending databases over the wire.
As of GDBM version 1.21, there are two flat file formats. The ASCII file format encodes all data in
Base64 and stores not only key/data pairs, but also the original database file metadata, such as file
name, mode and ownership. Files in this format can be sent without additional encapsulation over
transmission channels that normally allow only ASCII data, such as SMTP. Due to additional
metadata they allow for restoring an exact copy of the database, including file ownership and privileges,
which is especially important if the database in question contained some security-related data. This is
the preferred format.
Another flat file format is the binary format. It stores only key/data pairs and does not keep
information about the database file itself. It cannot be used to copy databases between different
architectures. The binary format was introduced in GDBM version 1.9.1 and is retained mainly for
backward compatibility.
The following functions are used to export or import GDBM database files.
int gdbm_dump(GDBM_FILE dbf, const char *filename, int format, int open_flag, int mode)
Dumps the database file dbf to the file filename in requested format. Allowed values for format
are: GDBM_DUMP_FMT_ASCII, to create an ASCII dump file, and GDBM_DUMP_FMT_BINARY, to create a
binary dump.
The value of open_flag tells gdbm_dump what to do if filename already exists. If it is
GDBM_NEWDB, the function will create a new output file, replacing it if it already exists. If its
value is GDBM_WRCREAT, the file will be created if it does not exist. If it does exist, gdbm_dump
will return error.
The file mode to use when creating the output file is defined by the mode parameter. Its meaning
is the same as for open(2).
int gdbm_load(GDBM_FILE *pdbf, const char *filename, int flag, int meta_mask, unsigned long *errline)
Loads data from the dump file filename into the database pointed to by pdbf. If pdbf points to a
NULL handle, the function will try to create a new database. On success, the new GDBM_FILE object
will be stored in the memory location pointed to by pdbf. If the dump file carries no information
about the original database file name, the function will set gdbm_errno to GDBM_NO_DBNAME and
return -1, indicating failure.
Otherwise, if pdbf points to an already open GDBM_FILE, the function will load data from filename
into that database.
The flag parameter controls the function behavior if a key from the dump file already exists in
the database. See the gdbm_store function for its possible values.
The meta_mask parameter can be used to disable restoring certain bits of file's meta-data from the
information in the input dump file. It is a binary OR of zero or more of the following:
GDBM_META_MASK_MODE
Do not restore file mode.
GDBM_META_MASK_OWNER
Do not restore file owner.
Other functions
int gdbm_reorganize(GDBM_FILE dbf);
If you have had a lot of deletions and would like to shrink the space used by the GDBM file, this
routine will reorganize the database.
int gdbm_sync(GDBM_FILE dbf);
Synchronizes the changes in dbf with its disk file.
It will not return until the disk file state is synchronized with the in-memory state of the
database.
int gdbm_setopt(GDBM_FILE dbf, int option, void *value, int size);
Query or change some parameter of an already opened database. The option argument defines what
parameter to set or retrieve. If the set operation is requested, value points to the new value.
Its actual data type depends on option. If the get operation is requested, value points to a
memory region where to store the return value. In both cases, size contains the actual size of
the memory pointed to by value.
Possible values of option are:
GDBM_SETCACHESIZE
GDBM_CACHESIZE
Set the size of the internal bucket cache. The value should point to a size_t holding the
desired cache size, or the constant GDBM_CACHE_AUTO, to select the best cache size
automatically.
By default, a newly open database is configured to adapt the cache size to the number of index
buckets in the database file. This provides for the best performance.
Use this option if you wish to limit the memory usage at the expense of performance. If you
choose to do so, please bear in mind that the cache becomes effective when its size is greater than
2/3 of the number of index buckets in the database. The best performance results are
achieved when the cache size equals the number of buckets.
GDBM_GETCACHESIZE
Return the size of the internal bucket cache. The value should point to a size_t variable,
where the size will be stored.
GDBM_GETFLAGS
Return the flags describing current state of the database. The value should point to an int
variable where to store the flags. On success, its value will be similar to the flags used
when opening the database, except that it will reflect the current state (which may have been
altered by other calls to gdbm_setopt).
GDBM_FASTMODE
Enable or disable the fast writes mode, similar to the GDBM_FAST option to gdbm_open.
This option is retained for compatibility with previous versions of GDBM.
GDBM_SETSYNCMODE
GDBM_SYNCMODE
Turn on or off immediate disk synchronization after updates. The value should point to an
integer: 1 to turn synchronization on, and 0 to turn it off.
NOTE: setting this option entails severe performance degradation and does not necessarily
ensure that the resulting database state is consistent, therefore we discourage its use. For
a discussion of how to ensure database consistency with minimal performance overhead, see
CRASH TOLERANCE below.
GDBM_GETSYNCMODE
Return the current synchronization status. The value should point to an int where the status
will be stored.
GDBM_SETCENTFREE
GDBM_CENTFREE
Enable or disable central free block pool. The default is off, which is how previous versions
of GDBM handled free blocks. If set, this option causes all subsequent free blocks to be
placed in the global pool, allowing (in theory) more file space to be reused more quickly.
The value should point to an integer: TRUE to turn central block pool on, and FALSE to turn it
off.
The GDBM_CENTFREE alias is provided for compatibility with earlier versions.
GDBM_SETCOALESCEBLKS
GDBM_COALESCEBLKS
Set free block merging to either on or off. The default is off, which is how previous
versions of GDBM handled free blocks. If set, this option causes adjacent free blocks to be
merged. This can become a CPU expensive process with time, though, especially if used in
conjunction with GDBM_CENTFREE. The value should point to an integer: TRUE to turn free block
merging on, and FALSE to turn it off.
GDBM_GETCOALESCEBLKS
Return the current status of free block merging. The value should point to an int where the
status will be stored.
GDBM_SETMAXMAPSIZE
Sets the maximum size of a memory mapped region. The value should point to a value of type
size_t, unsigned long or unsigned. The actual value is rounded to the nearest page boundary
(the page size is obtained from sysconf(_SC_PAGESIZE)).
GDBM_GETMAXMAPSIZE
Return the maximum size of a memory mapped region. The value should point to a value of type
size_t where to return the data.
GDBM_SETMMAP
Enable or disable memory mapping mode. The value should point to an integer: TRUE to enable
memory mapping or FALSE to disable it.
GDBM_GETMMAP
Check whether memory mapping is enabled. The value should point to an integer where to return
the status.
GDBM_GETDBNAME
Return the name of the database disk file. The value should point to a variable of type
char**. A pointer to the newly allocated copy of the file name will be placed there. The
caller is responsible for freeing this memory when no longer needed.
GDBM_GETBLOCKSIZE
Return the block size in bytes. The value should point to int.
int gdbm_fdesc(GDBM_FILE dbf);
Returns the file descriptor of the database dbf.