Architecture - BangDB, Embedded

BangDB has been designed from scratch with the following as its main design goals:

  • Performance - fast key value store, highly concurrent
  • Robust - crash resistant, fault tolerant, auto recovery
  • Flavors - should have various configurable aspects
  • Pluggable - standard API, can be plugged in and unplugged easily
  • No Admin - easy install and uninstall, self-monitored
  • Economy - runs on commodity hardware

The general high-level architecture is described below.

BangDB provides simple, standard APIs for clients to access the database. Please see the API section for details. When enabled, BangDB creates a buffer pool of the size specified by the user and then creates several data structures to manage the buffer: a hash table of buffer headers, an LRU list, a dirty page list, and a free header list. It also creates workers that handle the various housekeeping tasks for the buffer pool and flush dirty pages to disk asynchronously.
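The four structures named above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not BangDB's implementation: the `BufferPool` class, its method names, and the dictionary standing in for the disk are all hypothetical, and flushing here is synchronous rather than done by background workers.

```python
from collections import OrderedDict, deque


class BufferPool:
    """Toy buffer pool (hypothetical, for illustration only)."""

    def __init__(self, num_frames):
        self.headers = {}                      # hash table: page_id -> buffer header
        self.lru = OrderedDict()               # LRU list: least recently used first
        self.dirty = set()                     # dirty page list: pages not yet on disk
        self.free = deque(range(num_frames))   # free header list: unused frame slots
        self.disk = {}                         # stand-in for the data file

    def _evict_one(self):
        # Take the least recently used page, write it back if dirty,
        # and return its frame to the free list.
        page_id, _ = self.lru.popitem(last=False)
        header = self.headers.pop(page_id)
        if page_id in self.dirty:
            self.disk[page_id] = header["data"]
            self.dirty.discard(page_id)
        self.free.append(header["frame"])

    def get(self, page_id):
        if page_id in self.headers:
            self.lru.move_to_end(page_id)      # mark as most recently used
            return self.headers[page_id]
        if not self.free:
            self._evict_one()
        frame = self.free.popleft()
        header = {"frame": frame, "data": self.disk.get(page_id, b"")}
        self.headers[page_id] = header
        self.lru[page_id] = None
        return header

    def write(self, page_id, data):
        header = self.get(page_id)
        header["data"] = data
        self.dirty.add(page_id)

    def flush_dirty(self):
        # Write every dirty page back and report how many were flushed.
        flushed = 0
        for page_id in list(self.dirty):
            self.disk[page_id] = self.headers[page_id]["data"]
            flushed += 1
        self.dirty.clear()
        return flushed
```

With a pool of two frames, writing a third page forces the least recently used page out, and because that page is dirty it is written back before its frame is reused.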

At regular intervals, BangDB decides, based on memory pressure and current requirements, how much of the buffer should be freed, how many pages should be flushed, and so on. The LRU list helps decide which headers should be flushed before others.
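One way to picture such a periodic decision is a pair of watermarks. The sketch below is an assumption made for illustration: the function names, the 90%/70% thresholds, and the rule for how many pages to flush are hypothetical and are not taken from BangDB.

```python
def plan_housekeeping(total_frames, used_frames, dirty_pages,
                      high_water_pct=90, low_water_pct=70):
    """Return (frames_to_free, pages_to_flush) for one housekeeping tick.

    Hypothetical policy: do nothing below the high watermark; otherwise
    free enough frames to get back down to the low watermark.
    """
    if used_frames * 100 < total_frames * high_water_pct:
        return 0, 0
    target_used = total_frames * low_water_pct // 100
    to_free = used_frames - target_used
    to_flush = min(dirty_pages, to_free)
    return to_free, to_flush


def pick_victims(lru_order, dirty, count):
    """Split the `count` least recently used pages into clean and dirty.

    `lru_order` lists page ids from least to most recently used. Clean
    pages can be dropped immediately; dirty ones must be flushed first.
    """
    victims = lru_order[:count]
    clean = [p for p in victims if p not in dirty]
    must_flush = [p for p in victims if p in dirty]
    return clean, must_flush
```

For example, `plan_housekeeping(100, 95, 10)` asks for 25 frames to be freed and 10 pages to be flushed, while a pool at 50% usage is left alone.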

Landscape for BangDB - where it stands

Logically, BangDB consists mainly of the following components, shown in the simple view below:

With BangDB, the user can create multiple tables and define a different configuration for each one. For example, one table can be created with a hash-based index, log off, and a key size of 10 bytes, whereas another table can be created with a Btree-based index, log on, and a key size of 16 bytes.
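The per-table configuration in this example can be modelled as a small record attached to each table. The sketch below is not the BangDB API: `TableConfig`, `IndexType`, `Database`, and the table names are hypothetical names chosen only to mirror the two tables described above.

```python
from dataclasses import dataclass
from enum import Enum


class IndexType(Enum):
    HASH = "hash"
    BTREE = "btree"


@dataclass(frozen=True)
class TableConfig:
    index_type: IndexType
    log_enabled: bool
    key_size: int  # bytes


class Database:
    """Hypothetical registry that keeps one configuration per table."""

    def __init__(self):
        self.tables = {}

    def create_table(self, name, config):
        if name in self.tables:
            raise ValueError(f"table {name!r} already exists")
        self.tables[name] = config
        return config


db = Database()
# Hash index, log off, 10-byte keys.
db.create_table("sessions", TableConfig(IndexType.HASH, False, 10))
# Btree index, log on, 16-byte keys.
db.create_table("orders", TableConfig(IndexType.BTREE, True, 16))
```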

The connection object is used to interact with a table. Users can't interact with a table directly; they have to get a connection to it. This provides not only the flexibility of using different configurations for different tables but also the ease of creating multiple connections to the same table with very little overhead, and of passing a connection object around to different threads, since a connection is completely safe in a concurrent environment.
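The idea of a cheap, thread-safe handle can be sketched as follows. The `Table` and `Connection` classes here are hypothetical, and the single coarse lock is an assumption made to keep the example short; it says nothing about how BangDB achieves concurrency internally.

```python
import threading


class Table:
    """Hypothetical table: owns the data and the lock that guards it."""

    def __init__(self, name):
        self.name = name
        self._data = {}
        self._lock = threading.Lock()


class Connection:
    """Lightweight handle: holds only a reference to the table."""

    def __init__(self, table):
        self._table = table

    def put(self, key, value):
        with self._table._lock:
            self._table._data[key] = value

    def get(self, key):
        with self._table._lock:
            return self._table._data.get(key)


def worker(conn, start):
    # Each thread writes its own range of keys through the shared handle.
    for i in range(start, start + 100):
        conn.put(i, i * i)


table = Table("t")
conn = Connection(table)
threads = [threading.Thread(target=worker, args=(conn, n * 100)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

A second `Connection(table)` created afterwards sees the same data, since each connection is only a thin wrapper around the table.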

Another object, not shown in the picture, is the transaction. Transactions are implemented using optimistic concurrency control (OCC); please see here for more details.
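For readers unfamiliar with OCC, the general technique is: read without locking, remember the version of everything read, and validate those versions at commit time. The sketch below shows that generic pattern in a single-threaded setting; the `Store` and `Transaction` classes are hypothetical and do not describe BangDB's transaction API. A real implementation would also need to make the validate-and-write step atomic.

```python
class Store:
    def __init__(self):
        self.data = {}  # key -> (value, version)


class Transaction:
    """Generic optimistic transaction (illustrative sketch only)."""

    def __init__(self, store):
        self.store = store
        self.reads = {}   # key -> version observed on first read
        self.writes = {}  # key -> buffered new value

    def get(self, key):
        if key in self.writes:
            return self.writes[key]
        value, version = self.store.data.get(key, (None, 0))
        self.reads.setdefault(key, version)
        return value

    def put(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Validation phase: fail if anything we read has changed since.
        for key, seen in self.reads.items():
            current = self.store.data.get(key, (None, 0))[1]
            if current != seen:
                return False
        # Write phase: apply buffered writes and bump versions.
        for key, value in self.writes.items():
            version = self.store.data.get(key, (None, 0))[1]
            self.store.data[key] = (value, version + 1)
        return True
```

If two transactions both read the same key and then write it, the first to commit succeeds and the second fails validation and has to be retried.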

The log is implemented as a write-ahead log, and the user can switch it on or off as required. For transactions, however, the log has to be enabled; otherwise the db will quit with an appropriate message. With the log enabled, the db can also recover from a crash and bring itself back to a consistent state automatically. Please see here for more information on the log.
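The write-ahead principle is that a change is made durable in the log before it is applied, so that replaying the log after a crash rebuilds the state. The sketch below illustrates that principle only; the `LoggedStore` class and the JSON-lines record format are assumptions for the example and are unrelated to BangDB's actual log format.

```python
import json
import os


class LoggedStore:
    """Key value store with a minimal write-ahead log (illustrative sketch)."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()  # recovery: rebuild state from the log
        self._log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break  # torn record from a crash mid-write: stop here
                self.data[record["k"]] = record["v"]

    def put(self, key, value):
        # 1. Append the change to the log and force it to disk.
        self._log.write(json.dumps({"k": key, "v": value}) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        # 2. Only then apply it to the in-memory state.
        self.data[key] = value

    def close(self):
        self._log.close()
```

Opening a new `LoggedStore` on the same path after a crash replays the records in order, so the latest value for each key wins.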

BangDB implements its own buffer pool. Having its own buffer pool allows the db to manage pages efficiently and with local context, rather than leaving this job to the OS, which cannot take db-specific factors into account when deciding how to read and write pages to disk. BangDB implements adaptive sequential page prefetching and flushing to reduce disk seeks and improve IO performance. Please see here for more information on the buffer pool.
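Sequential prefetching in general works by detecting a run of consecutive page reads and fetching the next pages ahead of time, growing the read-ahead window while the run continues. The sketch below shows one simple way to do that; the `SequentialPrefetcher` class, the trigger of two consecutive reads, and the doubling window are assumptions for illustration and are not BangDB's algorithm.

```python
class SequentialPrefetcher:
    """Detects sequential reads and suggests pages to read ahead (sketch)."""

    def __init__(self, trigger=2, max_window=8):
        self.trigger = trigger        # consecutive reads needed before prefetching
        self.max_window = max_window  # upper bound on pages fetched ahead
        self.last_page = None
        self.run_length = 0
        self.window = 1

    def on_read(self, page_id):
        """Record a page read and return the page ids worth prefetching."""
        if self.last_page is not None and page_id == self.last_page + 1:
            self.run_length += 1
        else:
            # Pattern broken: start over with the smallest window.
            self.run_length = 0
            self.window = 1
        self.last_page = page_id
        if self.run_length < self.trigger:
            return []
        pages = list(range(page_id + 1, page_id + 1 + self.window))
        # Adapt: the longer the run lasts, the further ahead we read.
        self.window = min(self.window * 2, self.max_window)
        return pages
```

A random read resets the window, so the prefetcher stays quiet on workloads that are not sequential.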

This covers only the basic high-level design and architecture of the BangDB engine. Please let us know if you require more information on any of the items discussed here.

The design of BangDB Server, Data Fabric, etc. will be covered separately.