Q-Gdbm - GNU DBM interface for the Q programming language
====== = === === ========= === === = =========== ========

For this module to work, you must have the GNU dbm library (libgdbm) on your
system. If you have a Linux system then you most likely have this library;
otherwise you can get it from http://www.gnu.org or one of its mirrors. This
module is also supported on Windows.

The gdbm library provides some simple database functions, suitable for storing
indexed data in a file. The interface provided by this module is a
straightforward wrapper for (most of) the C functions provided by the library;
see gdbm(3) for more information.

Database files are represented using an external type `GdbmFile'. Both keys
and data are represented using clib's byte strings (see clib.q), thus
arbitrary binary data can be stored in a database.

In the following we give a brief description of the most important operations.

VERSION INFORMATION AND ERROR CODES
======= =========== === ===== =====

The version of the host gdbm library can be retrieved with the `gdbm_version'
function:

==> gdbm_version
"This is GDBM version 1.8.0, as of May 19, 1999."

When any of the other gdbm operations fails you can usually retrieve an error
code with the `gdbm_errno' function. It is also possible to set the value of
this variable with `gdbm_seterrno' and to determine a readable message for an
error code with `gdbm_strerror'.

==> gdbm_strerror gdbm_errno

OPENING AND CLOSING DATABASES
======= === ======= =========

Before you can work with a database you must open it with the `gdbm_open'
function, which takes the name, block size, read/write flags and a file
creation mode as its arguments; the latter is only used when a new database
file is created. The function returns a `GdbmFile' object which is used in
subsequent operations on the database.
For instance, here is how we can open a database for both reading and writing,
creating it if it does not yet exist:

==> def DBF = gdbm_open "testdb" 512 GDBM_WRCREAT 0666

The supported flag values are GDBM_READER (read-only access), GDBM_WRITER
(read/write access), GDBM_WRCREAT (read/write, create if necessary), and
GDBM_NEWDB (read/write, always create a new database). These can be combined
(i.e., or'ed bitwise) with the following values: GDBM_FAST (obsolete, since
gdbm now always opens databases in "fast" mode), GDBM_SYNC ("slow" mode,
immediately commit changes to the disk file), and GDBM_NOLOCK (don't perform
locking on the database).

Unless the GDBM_NOLOCK flag is specified, gdbm uses file locking on the
database to guarantee exclusive access to writers. The database can also be
opened simultaneously by any number of readers, provided that no writer is
currently accessing the database. When using the GDBM_NOLOCK flag the
application is responsible for performing its own locking. To this end, the
`gdbm_fdesc' function can be used to determine the file descriptor associated
with a database file:

==> gdbm_fdesc DBF

A database file is closed automatically when the corresponding `GdbmFile'
object is garbage-collected. You can also close it explicitly by invoking the
`gdbm_close' function:

==> gdbm_close DBF

After closing the database file in this manner, all subsequent operations on
the `GdbmFile' object will fail.

STORING AND RETRIEVING DATA
======= === ========== ====

A new item is stored in the database using the `gdbm_store' function, which
takes as arguments the `GdbmFile' object, a key, a value, and a flag
indicating whether an existing item with the same key is to be replaced. Both
the key and the associated value must be encoded as clib byte strings.

For instance, we can store a binary integer value under the string key "foo"
as follows. (The GDBM_REPLACE flag indicates that any existing item under the
"foo" key is to be replaced.
When using the GDBM_INSERT flag instead, only a new key can be inserted.)

==> gdbm_store DBF (bytestr "foo") (bytestr 99) GDBM_REPLACE

You can also store an arbitrary (printable) Q expression by first converting
it to a string.

==> gdbm_store DBF (bytestr "bar") (bytestr (str [1,2,3])) GDBM_REPLACE

The `gdbm_fetch' function is used to retrieve the value associated with a key
in the database:

==> gdbm_fetch DBF (bytestr "foo")
<<ByteStr>>

Note that the result is again encoded as a byte string which must be converted
back to the desired data type:

==> bint _
99

==> val (bstr (gdbm_fetch DBF (bytestr "bar")))
[1,2,3]

You can check beforehand whether a key exists in the database with the
`gdbm_exists' function:

==> gdbm_exists DBF (bytestr "foo")
true

You can also traverse all keys in the database with the `gdbm_firstkey' and
`gdbm_nextkey' functions. E.g.:

==> map bstr (while isbytestr (gdbm_nextkey DBF) (gdbm_firstkey DBF))
["foo","bar"]

The keys will be listed in an apparently random order (i.e., not sorted by the
keys).

To delete a key from the database we employ the `gdbm_delete' function:

==> gdbm_delete DBF (bytestr "foo")
()

==> gdbm_exists DBF (bytestr "foo")
false

MAINTENANCE FUNCTIONS
=========== =========

By default, gdbm databases are opened in "fast" mode which means that data
will be buffered and hence might not be written to the disk file immediately.
The `gdbm_sync' function can be used to forcibly commit all changes to the
disk file:

==> gdbm_sync DBF

For efficiency, a gdbm database never shrinks when deleting items; instead,
unused entries will be reused when new items are inserted. Hence much space
may be wasted in the database file after deleting a large number of items. You
can use the `gdbm_reorganize' function to shrink the database in such cases.
This will rebuild the database from scratch and hence is a costly operation
which should be used sparingly.

==> gdbm_reorganize DBF

EXAMPLES
========

A sample script illustrating most of the functions discussed above can be
found in the `testdb.q' file. Moreover, the `gdbm_dict.q' script shows how a
dictionary-like interface can be implemented on top of the gdbm module.

Enjoy! :)

May 6 2003
Albert Graef
ag@muwiinfa.geschichte.uni-mainz.de, Dr.Graef@t-online.de
http://www.musikwissenschaft.uni-mainz.de/~ag