VSDB is an experimental database based on atomic updates of constant databases. Unlike other lightweight databases, VSDB supports full transactional semantics when reading and writing to the database. A VSDB database consists of several hash tables (with plans for other data structures). Transaction conflicts are detected and rolled back and the system maintains full ACID semantics. VSDB may be used across distributed filesystems such as NFS without problems.
The interesting thing about VSDB is that it does this with no locking whatsoever, not even file locks. This makes the system very robust against errors and portable across filesystems, and even across different computers accessing the database via NFS or other shared filesystems. Any application accessing the database may die uncleanly or become disconnected from the file store at any point, and the database cannot become corrupted. Moreover, a client may always trust the return code of a transaction commit to know definitively whether an update succeeded or failed. VSDB will even work over NFS and other distributed filesystems that support its atomicity requirements (which are easier to meet than what UNIX generally provides), without requiring problematic locking daemons or services.
The price paid for this ability is more expensive updates of the database. To offset this, because VSDB builds constant databases, they are usually smaller than mutable ones, containing only the raw data and minimal hash information. Constant databases also allow optimizations that are not possible in mutable databases, which can be exploited for faster reads. This set of trade-offs is appropriate in several areas where other embedded databases such as Berkeley DB and GDBM have traditionally been used, but where the added robustness and true guaranteed ACID transactional updates, or the maximal speed when reading the database, are needed.
file format
    | "JWM\001" | generation | number of entries | raw record/key data | hash entries |
    |---- 4 ----|---- 4 -----|-------- 4 --------|   ...variable...    |--- 20*num ---|

Each hash entry looks like this and is 20 bytes long.

    struct vsdb_hashentry {
        uint32_t hash;     // hash of key
        uint32_t offset;   // offset (in bytes) of key/data pair in file
        uint32_t keysize;
        uint32_t datasize;
        uint32_t overflow; // index in hash table for hash collisions
    };

To look up a key, hash it and take hash % num_entries as an index into the hash table. Check if that entry matches. If it doesn't, then 'overflow' may contain a link to an overflow spot in the same table. Keep following the overflow links until you run out of them or find the entry wanted.
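The lookup procedure can be sketched in C roughly as follows. This is only an illustration under several assumptions that the description above does not settle: the hash function (FNV-1a here is a placeholder, not necessarily what VSDB uses), the value that marks the end of an overflow chain (`VSDB_NO_OVERFLOW` is a hypothetical name and value), the key bytes being stored before the data bytes at `offset`, and the table already being in memory in host byte order. All `sketch_` names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct vsdb_hashentry {
    uint32_t hash;     /* hash of key */
    uint32_t offset;   /* offset (in bytes) of key/data pair in file */
    uint32_t keysize;
    uint32_t datasize;
    uint32_t overflow; /* index in hash table for hash collisions */
};

/* The format description says each entry is 20 bytes long. */
static_assert(sizeof(struct vsdb_hashentry) == 20, "hash entry must be 20 bytes");

/* Assumption: this value means "no further overflow link". */
#define VSDB_NO_OVERFLOW UINT32_MAX

/* Placeholder hash (32-bit FNV-1a); the real VSDB hash may differ. */
static uint32_t sketch_hash(const void *key, size_t len)
{
    const uint8_t *p = key;
    uint32_t h = UINT32_C(2166136261);
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= UINT32_C(16777619);
    }
    return h;
}

/* Returns the matching entry, or NULL if the key is not present.
 * `file` points at the start of the database image; the key is assumed
 * to sit at `offset`, immediately followed by the data. */
static const struct vsdb_hashentry *
sketch_lookup(const uint8_t *file, const struct vsdb_hashentry *table,
              uint32_t num_entries, const void *key, uint32_t keysize)
{
    if (num_entries == 0)
        return NULL;

    uint32_t h = sketch_hash(key, keysize);
    uint32_t idx = h % num_entries;

    /* Bound the walk by the table size so a damaged chain cannot loop forever. */
    for (uint32_t steps = 0; steps < num_entries; steps++) {
        const struct vsdb_hashentry *e = &table[idx];
        if (e->hash == h && e->keysize == keysize &&
            memcmp(file + e->offset, key, keysize) == 0)
            return e;
        if (e->overflow == VSDB_NO_OVERFLOW || e->overflow >= num_entries)
            return NULL;
        idx = e->overflow;
    }
    return NULL;
}
```

Comparing the stored hash and key size before the key bytes keeps most mismatches cheap, since the raw record data is only touched when both already agree.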
__[unique ID].tmp
- These are potential new databases created by processes wishing to update the database.
dbd.[number]/
- This is the one and only subdirectory in which the database files themselves live. The number should be the current database's generation number. Although its name changes, there is only a single subdirectory in the database root.
dbd.[number]/db.[number]
- These are the actual database files. The current state of the database is always in the highest numbered (based on a sliding window) file.
db.cache
- This is a hard link to what is thought to be the most recent version of the database. It is used to increase performance, so it is okay for it to be out of date occasionally.
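As an illustration of picking the "highest numbered (based on a sliding window)" file, here is a small C sketch. The text does not define the sliding window, so this assumes one plausible reading: generation numbers are 32-bit counters (matching the 4-byte generation field in the file header) that may wrap around, and are compared modulo 2^32 so that a number slightly past the wrap still counts as newer. The function names are hypothetical and the directory listing is passed in as plain strings rather than read from disk.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: "newer" means ahead by less than half the 32-bit range. */
static int sketch_gen_newer(uint32_t a, uint32_t b)
{
    return a != b && (uint32_t)(a - b) < UINT32_C(0x80000000);
}

/* Parses a name of the form "db.<decimal number>".
 * Returns 1 and stores the number on success, 0 otherwise
 * (so "db.cache" and unrelated files are skipped). */
static int sketch_parse_db_name(const char *name, uint32_t *gen)
{
    if (strncmp(name, "db.", 3) != 0 || !isdigit((unsigned char)name[3]))
        return 0;

    char *end;
    errno = 0;
    unsigned long value = strtoul(name + 3, &end, 10);
    if (errno != 0 || *end != '\0' || value > UINT32_MAX)
        return 0;

    *gen = (uint32_t)value;
    return 1;
}

/* Scans directory entry names and reports the newest generation found.
 * Returns 1 if at least one database file name was seen, 0 otherwise. */
static int sketch_pick_newest(const char *const *names, size_t count,
                              uint32_t *newest)
{
    assert(names != NULL && newest != NULL);

    int found = 0;
    for (size_t i = 0; i < count; i++) {
        uint32_t gen;
        if (!sketch_parse_db_name(names[i], &gen))
            continue;
        if (!found || sketch_gen_newer(gen, *newest)) {
            *newest = gen;
            found = 1;
        }
    }
    return found;
}
```

Under this assumption the comparison is only meaningful while all live files lie within half the number range of each other, which holds if old generations are cleaned up regularly.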
dbd.[num]/db.[num + 1]
where num is the generation number of the database you are updating.

See the API documentation for the current interface. This is very alpha software at the moment, beware.

The Changelog has information on recent changes.