Advanced Configuration and Explanation - BangDB Embedded

The advanced configuration parameters described here are important for making BangDB run as efficiently as possible for a given use case and machine configuration. For the complete list, see the BangDB configuration page; the most common configurations are listed separately.

Here is the list of advanced configuration parameters for BangDB Embedded. Change them with care, but do consider tuning the values based on the suggestions given below:

  1. Page size:

    [PAGE_SIZE_BANGDB] The default page size used by the db is 8192 bytes, but the user can change it to any size that fits the use case. A page size that is too large or too small has its own pros and cons, so striking a balance is important for optimal performance. Factors that help determine the page size include (a) average key size, (b) average data size, and (c) single-threaded vs. multi-threaded access, among others.
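As a rough way to reason about factors (a) and (b), the sketch below estimates how many key/value records fit in one page for a few candidate page sizes. The `page_header` and `record_overhead` figures are made-up placeholders rather than BangDB internals; only the 8192-byte default comes from the text above.

```python
def records_per_page(page_size, avg_key_size, avg_data_size,
                     page_header=64, record_overhead=16):
    """Estimate how many records fit in one page.

    page_header and record_overhead are illustrative assumptions,
    not values taken from BangDB.
    """
    usable = page_size - page_header
    per_record = avg_key_size + avg_data_size + record_overhead
    if usable <= 0 or per_record <= 0:
        raise ValueError("page_size must exceed page_header and records must have positive size")
    return usable // per_record


if __name__ == "__main__":
    # Compare a few candidate page sizes for 24-byte keys and 200-byte values.
    for page_size in (4096, 8192, 16384):
        n = records_per_page(page_size, avg_key_size=24, avg_data_size=200)
        print(f"page_size={page_size:>6} -> about {n} records per page")
```

Very few records per page suggests the page is too small for the data, while a very large count means each page read or write moves a lot of data that a single lookup may not need.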
  2. Log flush frequency:

    [LOG_FLUSH_FREQ] If the log is ON, this sets how frequently the log is flushed to disk as a background activity. The value is in microseconds; the default is 50000 (50 ms). This is quite aggressive: a new log record stays in memory for at most 50 ms before it is flushed to disk. Such frequent flushing has an impact on performance. Increasing the value (that is, flushing less often) means fewer writes to disk, so performance might improve in some cases, depending on how frequently the db is modified. Because log writes are always sequential, seek time is avoided and the only cost is the amount of data written, which means that writing an optimum amount of data per flush yields the best performance. The user is advised to weigh these machine and use case factors before changing the value of this configuration.
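The trade-off can be made concrete with a little arithmetic. The sketch below is a back-of-the-envelope model, not BangDB code: it assumes a steady log write rate (the hypothetical `write_rate_bytes_per_sec`) and derives, for a given flush interval, how many flushes happen per second, how much data each flush writes on average, and how long a log record can sit in memory.

```python
def flush_stats(flush_freq_us, write_rate_bytes_per_sec):
    """Return (flushes_per_sec, bytes_per_flush, max_window_ms).

    flush_freq_us is the flush interval in microseconds, the unit the
    text gives for LOG_FLUSH_FREQ. The steady write rate is an assumption.
    """
    if flush_freq_us <= 0:
        raise ValueError("flush interval must be positive")
    flushes_per_sec = 1_000_000 / flush_freq_us
    bytes_per_flush = write_rate_bytes_per_sec / flushes_per_sec
    max_window_ms = flush_freq_us / 1000
    return flushes_per_sec, bytes_per_flush, max_window_ms


if __name__ == "__main__":
    # Default interval versus two longer ones, at an assumed 2 MB/s of log data.
    for freq_us in (50_000, 200_000, 1_000_000):
        per_sec, per_flush, window = flush_stats(freq_us, 2_000_000)
        print(f"{freq_us:>9} us: {per_sec:5.1f} flushes/s, "
              f"{per_flush:>9.0f} bytes/flush, up to {window:.0f} ms in memory")
```

A longer interval lowers the number of disk writes but lengthens the window in which unflushed log records exist only in memory.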
  3. Checkpoint:

    [CHKPNT_ENABLED, CHKPNT_FREQ] These two variables denote whether checkpointing is enabled and, if so, how frequently it runs. Note that checkpointing can be OFF even though the WAL (write-ahead log) is ON. Checkpointing helps recover data from the log files quickly, but it adds the overhead of frequently writing the checkpoint log (in addition to the usual log) and flushing it to disk. Hence, if the time to recover from the log is not important for a use case, checkpointing can be switched OFF for better performance.
  4. Buffer flush reclaim frequency:

    [BUF_FLUSH_RECLAIM_FREQ] This value was set after a lot of experimentation, and the db seems to work well with the default. However, one can experiment with it to find the sweet spot for a different use case. The parameter denotes how frequently the db checks dirty pages and free pages for flush and reclaim respectively. The db has a buffer pool for each table but a single free buffer list; this is to optimize and balance buffer access and free page availability across all tables. The db tries to maintain the following:
    1. a certain number of free pages in the free list
    2. the number of dirty pages in the pool for each table
    3. the number of mapped but up-to-date pages in the pool for each table
    The above three parameters, along with the page access pattern, define the prefetching of pages (sync and async). They also define the page flush pattern and mechanism. The db strives to flush at least a certain number of sequential pages together where possible, and it also tries to follow the access pattern and read a batch of pages ahead for future access. The frequency value plays an important role here, and setting it to different values affects db performance. It is recommended to leave the value as it is if unsure, or to try different numbers and pick the best one.
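To illustrate why flushing sequential pages together matters, here is a minimal sketch that groups dirty page ids into runs of consecutive ids, so each run could be written in one sequential pass. It is not BangDB's actual flush algorithm; the function name and the `min_run` threshold are hypothetical.

```python
def sequential_runs(dirty_page_ids, min_run=1):
    """Group page ids into runs of consecutive ids.

    Returns a list of (first_id, last_id) tuples, inclusive, keeping
    only runs that contain at least min_run pages.
    """
    runs = []
    for pid in sorted(set(dirty_page_ids)):
        if runs and pid == runs[-1][1] + 1:
            runs[-1][1] = pid          # extend the current run
        else:
            runs.append([pid, pid])    # start a new run
    return [(first, last) for first, last in runs if last - first + 1 >= min_run]


if __name__ == "__main__":
    dirty = [7, 3, 4, 5, 10, 11]
    print(sequential_runs(dirty))              # every run, including single pages
    print(sequential_runs(dirty, min_run=2))   # only runs worth a sequential write
```

A background flusher that wakes up less often tends to find longer runs, at the cost of more dirty pages held in memory between wake-ups.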
  5. Minimum scan for up-to-date page:

    [MIN_UPDATED_SCAN] When the db is in a crunch and cannot find any free page in the free list, it typically resorts to emergency handling. If the db waits for a certain period, the various background workers might obviate the need for emergency handling in the near future. Since emergency handling is costly, it is always better to exhaust other ways of finding a free page first. To do this, the db uses this config value to scan the LRU list for a non-dirty page, which it can use as a free page because no flush is required.
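The idea behind the last parameter can be sketched as a bounded scan over an LRU list. The code below is only an illustration: it assumes the config value caps how many pages are examined (the text does not state this explicitly), and it models the LRU list as an `OrderedDict` mapping a page id to a dirty flag, with the least recently used page first.

```python
from collections import OrderedDict


def find_clean_page(lru, max_scan):
    """Scan at most max_scan pages from the least recently used end.

    lru maps page_id -> is_dirty, ordered from least to most recently
    used. Returns the first non-dirty page id found, or None if the
    scan limit is reached first.
    """
    for scanned, (page_id, is_dirty) in enumerate(lru.items()):
        if scanned >= max_scan:
            break
        if not is_dirty:
            return page_id  # clean page: reusable without a flush
    return None


if __name__ == "__main__":
    lru = OrderedDict([(1, True), (2, True), (3, False), (4, False)])
    print(find_clean_page(lru, max_scan=2))  # limit too small to reach a clean page
    print(find_clean_page(lru, max_scan=3))  # reaches page 3, which is clean
```

A larger limit makes it more likely that a clean page is found before falling back to emergency handling, but each failed scan costs more.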

There are more advanced config values, and we will cover them soon.