Buffer Pool - BangDB

BangDB implements its own buffer pool, where it keeps the pages of index and data files. The buffer pool is essentially a hash table with linked nodes. It also contains several other lists, such as the LRU list, the dirty page list, and the prefetch list. A separate free list tracks the free pages. A buffer pool is created for each table, whereas the free list is shared per database.
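To make the structure concrete, here is a minimal sketch of a per-table pool built from a hash table, an LRU ordering, a dirty set, and a shared free list. It is not BangDB's implementation: the class name `BufferPool`, the `read_page` and `write_page` callbacks, and the use of `bytearray` frames are all assumptions made for illustration, and the sketch ignores locking and the prefetch list.

```python
from collections import OrderedDict, deque


class BufferPool:
    """Toy per-table pool: hash table + LRU order + dirty set + shared free list."""

    def __init__(self, free_list, read_page, write_page):
        self.free_list = free_list    # shared (per-database) deque of free frames
        self.read_page = read_page    # hypothetical callback: page_id -> bytes
        self.write_page = write_page  # hypothetical callback: (page_id, bytes) -> None
        self.pages = OrderedDict()    # page_id -> frame; insertion order acts as the LRU list
        self.dirty = set()            # ids of pages modified since their last flush

    def get(self, page_id):
        """Return the frame for page_id, loading it from storage on a miss."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # hit: now the most recently used page
            return self.pages[page_id]
        frame = self._take_free_frame()
        frame[:] = self.read_page(page_id)   # miss: fill the frame from storage
        self.pages[page_id] = frame
        return frame

    def mark_dirty(self, page_id):
        self.dirty.add(page_id)

    def _take_free_frame(self):
        if self.free_list:
            return self.free_list.popleft()
        if not self.pages:
            raise RuntimeError("no free frame and nothing to evict")
        # Free list is empty: evict the least recently used page of this table.
        victim_id, frame = self.pages.popitem(last=False)
        if victim_id in self.dirty:
            self.write_page(victim_id, bytes(frame))  # write back before reuse
            self.dirty.discard(victim_id)
        return frame
```

In this sketch several `BufferPool` objects could be handed the same `free_list`, which mirrors the "pool per table, free list per database" arrangement described above.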

Having its own buffer pool allows the db to treat different kinds of pages differently. This gives the db an edge over relying on the OS page cache alone, which cannot apply local contextual information to its page flush or fetch decisions. BangDB can use this information to drive adaptive sequential page prefetch and flush algorithms that are better suited to its workload.
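As an illustration of what "adaptive sequential prefetch" can mean, the sketch below grows a read-ahead window while accesses stay sequential and resets it when they do not. This is a generic read-ahead heuristic, not BangDB's documented algorithm; the class name `SequentialPrefetcher`, the doubling rule, and the window limits are assumptions.

```python
class SequentialPrefetcher:
    """Grow the read-ahead window on sequential access, reset it otherwise."""

    def __init__(self, min_window=1, max_window=32):
        self.min_window = min_window  # assumed smallest non-zero window
        self.max_window = max_window  # assumed cap on pages fetched ahead
        self.window = 0               # 0 means: do not prefetch yet
        self.last_page = None

    def on_access(self, page_id):
        """Return the page ids worth prefetching after this access."""
        if self.last_page is not None and page_id == self.last_page + 1:
            # The sequential run continues: double the window, within limits.
            self.window = min(self.max_window, max(self.min_window, self.window * 2))
        else:
            # Random access: stop prefetching until a new run is detected.
            self.window = 0
        self.last_page = page_id
        return list(range(page_id + 1, page_id + 1 + self.window))
```

A real pool would also skip pages that are already cached or already queued; the sketch leaves that out to keep the window logic visible.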

A few basic components of the overall page, IO, and memory management layers are listed below:

Buffer Pool for

  • Demand Paging
  • Page Fault
  • Swapping

Custom Memory Manager, Slab allocator

IO layer

  • Buffer Management
  • Elevator (merge and sort)
  • Vector IO
  • Page prefetch - Spatial Locality
  • Page flush - Temporal Locality

Page Cache

  • Concurrent Hash table
  • LRU lists, Dirty lists, Free lists, Prefetch lists

Background jobs

  • Dirty page flush
  • Page reclaim
  • Various scanning
  • Cache warming

The buffer pool consists of a header hash table, an LRU list, a dirty page list, and a free list. This is what the buffer pool looks like:

The buffer pool maintains a few background workers for the following:

  • Measuring the amount of dirty pages in the pool and planning their flush accordingly
  • Measuring the overall number of free pages and taking appropriate decisions
  • Adaptive page prefetching - computing when and how many pages to fetch
  • Adaptive page flushing - computing when and how many pages to flush
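One way a worker could decide "when and how many to flush" is to scale the batch size with the dirty ratio of the pool. The sketch below is a hypothetical policy, not BangDB's: the function name `pages_to_flush`, the `low` and `high` watermarks, and `max_batch` are all assumed values chosen for illustration.

```python
def pages_to_flush(dirty, total, low=0.10, high=0.50, max_batch=256):
    """Return how many dirty pages to flush in this round (hypothetical policy)."""
    if total <= 0:
        return 0
    ratio = dirty / total
    if ratio <= low:
        return 0  # few dirty pages: let them absorb more writes before flushing
    # Scale linearly from 0 at the low watermark to max_batch at the high one.
    pressure = min(1.0, (ratio - low) / (high - low))
    return min(dirty, max(1, round(pressure * max_batch)))
```

Below the low watermark nothing is flushed, so frequently updated pages are not written repeatedly; above the high watermark the worker flushes at its maximum batch size.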

The background workers ensure that a page is found in the pool, when needed, with as high a probability as possible. They also ensure that a free page is available in the free list whenever one is needed. As a result, threads performing operations do not have to block and wait for either a particular page or a free page. In the dire situation where no free page is found, the db initiates emergency handling to bring the health of the pool back to normalcy.
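The free-page guarantee can be pictured as a reclaim pass that tops up the free list before foreground threads run out. The sketch below is an assumption about how such a pass might look, not BangDB's code: `reclaim`, `target_free`, and the rule of skipping dirty pages (leaving them to the flush worker) are illustrative choices.

```python
from collections import OrderedDict, deque


def reclaim(pages, dirty, free_list, target_free):
    """Move clean LRU frames to the free list until it holds target_free frames.

    pages: OrderedDict of page_id -> frame, oldest (least recently used) first.
    dirty: set of page ids that must be flushed before their frames are reused.
    """
    reclaimed = 0
    for page_id in list(pages):  # iterate over a copy, oldest first
        if len(free_list) >= target_free:
            break
        if page_id in dirty:
            continue             # leave dirty pages for the flush worker
        free_list.append(pages.pop(page_id))
        reclaimed += 1
    return reclaimed
```

If a pass like this cannot reach its target because too many pages are dirty, that is the kind of situation in which an emergency path would have to flush and evict synchronously.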

The pool takes numerous other measures to ensure that the performance of the db remains high, the amount of file IO remains low, IO is carried out over as many sequential pages as possible, and high throughput is maintained even in stressed scenarios.
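The idea of issuing IO over as many sequential pages as possible can be sketched as an elevator-style step that sorts pending page numbers and merges neighbours into contiguous runs, each of which could then be served by one vectored read or write. The function name `coalesce` and the `(start_page, page_count)` run format are assumptions for illustration only.

```python
def coalesce(page_ids):
    """Sort pending page ids and merge adjacent ones into (start, count) runs."""
    runs = []
    for pid in sorted(set(page_ids)):  # sort and drop duplicate requests
        if runs and pid == runs[-1][0] + runs[-1][1]:
            runs[-1][1] += 1           # pid extends the current run
        else:
            runs.append([pid, 1])      # gap: start a new run
    return [tuple(run) for run in runs]
```

For example, the pending pages 7, 3, 4, 9, 8, 3 collapse into two runs, `(3, 2)` and `(7, 3)`, so five pages need two IO requests instead of five.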

With the buffer pool, BangDB can work on far more data than the available RAM, with greater efficiency and performance. For example, on a commodity machine with 8GB of RAM, one can handle several hundred GBs of data with good performance. The amount of RAM places no limit on the amount of data the db can handle. Replacing the hard disk with an SSD improves the performance of the db considerably.