Core database design

The BangDB core database is designed and implemented to meet the following high-level goals:
1. DB should have full control over data
2. DB should implement its own buffer pool and cache
3. DB should leverage the machine’s resources fully
4. Reads and writes should both be fully concurrent
5. DB should have full control over IO and manage it well
6. Data should be durable and recoverable in case of a system crash
7. DB should support transactions
8. Both sorted and hashed key arrangements should be supported
9. DB should manage memory efficiently
10. Performance of DB should be very high
11. DB should run on anything from small devices to big servers
12. DB should support both persistent and in-memory modes
13. DB should retain performance even when data overflows the working memory size
14. DB should manage itself and do housekeeping as required, without an admin
BangDB implements its own buffer pool, where it maintains the pages for index and data files. The buffer pool is essentially a hash table with linked nodes. It also contains several other lists, such as the LRU list, the dirty page list, and the prefetch list.
A separate free list maintains the free pages. The buffer pool can be private, created per table, or shared (the default), whereas the free list is per database.
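The structure described above can be sketched in a few lines of Python. This is only an illustration, not BangDB's implementation: the class name `BufferPool`, the `PAGE_SIZE` constant, and the `flushed` list (which stands in for real disk writes) are all hypothetical. An `OrderedDict` is used because it is itself a hash table whose entries are linked, so its order can double as the LRU list.

```python
from collections import OrderedDict, deque

PAGE_SIZE = 4096  # hypothetical page size, chosen only for this sketch


class BufferPool:
    """Toy buffer pool: a hash table of pages plus LRU, dirty, and free lists."""

    def __init__(self, capacity):
        # page_id -> frame; insertion order doubles as the LRU list
        self.pages = OrderedDict()
        # ids of pages modified since they were last written back
        self.dirty = set()
        # frames not currently holding any page
        self.free = deque(bytearray(PAGE_SIZE) for _ in range(capacity))
        # stand-in for disk IO: records which page ids were written back
        self.flushed = []

    def get(self, page_id):
        """Return the frame for page_id, evicting the LRU page if needed."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as most recently used
            return self.pages[page_id]
        if not self.free:
            self._evict()
        frame = self.free.popleft()
        # A real pool would read the page contents from disk here.
        self.pages[page_id] = frame
        return frame

    def write(self, page_id, data):
        """Copy data into the page and put it on the dirty list."""
        frame = self.get(page_id)
        frame[:len(data)] = data
        self.dirty.add(page_id)

    def _evict(self):
        """Drop the least recently used page, flushing it first if dirty."""
        victim, frame = self.pages.popitem(last=False)
        if victim in self.dirty:
            self.flushed.append(victim)  # a real pool would write to disk
            self.dirty.discard(victim)
        frame[:] = bytes(len(frame))  # zero the frame before reuse
        self.free.append(frame)
```

With a capacity of two, touching page 1 after loading page 2 makes page 2 the eviction victim when a third page arrives. Only dirty pages are recorded in `flushed`, which mirrors why a pool keeps a separate dirty page list.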
Having its own buffer pool allows the db to treat different pages differently. This gives the db an edge over simply relying on the OS page cache, which cannot apply local contextual information to its page flush or fetch algorithms.
BangDB, by contrast, can use this information to drive adaptive sequential page prefetch and flush algorithms that are better suited to its access patterns.
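To make "adaptive sequential prefetch" concrete, here is a minimal sketch of a generic read-ahead heuristic. It is not BangDB's algorithm, whose details the text does not give; the class `SequentialPrefetcher` and the `max_window` parameter are hypothetical. The idea is that each consecutive sequential access grows the prefetch window, and any non-sequential access resets it.

```python
class SequentialPrefetcher:
    """Grow a read-ahead window while accesses stay sequential."""

    def __init__(self, max_window=8):
        self.last_page = None   # id of the previously accessed page
        self.window = 0         # number of pages to prefetch right now
        self.max_window = max_window

    def on_access(self, page_id):
        """Record an access and return the page ids worth prefetching."""
        if self.last_page is not None and page_id == self.last_page + 1:
            # Sequential hit: start at 1, then double up to the cap.
            self.window = min(self.max_window, max(1, self.window * 2))
        else:
            # Random access: stop prefetching until a new run is seen.
            self.window = 0
        self.last_page = page_id
        return [page_id + i for i in range(1, self.window + 1)]
```

A database-owned pool can feed a heuristic like this with information the OS does not have, for example whether the page belongs to an index or a data file.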
A few basic components of the overall page, IO, and memory management are discussed below.
Buffer Pool for
BangDB implements a write-ahead log (WAL). All changes to data and index files are written only after the corresponding log record is written. Logging is controlled by a configuration parameter, and the user can set it on or off as required. Once set for a table, however, the log stays in that state (on or off) for the table's entire life.
The write-ahead log is always sequential, and multiple sequential log files are generated over the life of the db. BangDB tweaks the basic design and algorithm of write-ahead logging to reduce the amount of data logged, improving performance without sacrificing the core guarantees and benefits of write-ahead logging. It uses a modified ARIES algorithm to implement the WAL; the modification mainly optimises the log-size/data-size ratio.
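The core rule, log first and apply second, can be shown with a very small sketch. This is a simplified redo-only log, not ARIES and not BangDB's modified variant; the class `TinyWAL`, the JSON record format, and the single log file are assumptions made for illustration.

```python
import json
import os
import tempfile


class TinyWAL:
    """Key-value store that appends to a sequential log before applying."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}          # stands in for the data and index pages
        self._replay()          # recover state from an existing log
        self.log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        """Redo every complete record found in the log, oldest first."""
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                if not line.endswith("\n"):
                    break  # torn final record from a crash: ignore it
                rec = json.loads(line)
                self.data[rec["key"]] = rec["value"]

    def put(self, key, value):
        # 1. Append the change to the log and force it to stable storage.
        self.log.write(json.dumps({"key": key, "value": value}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        # 2. Only then apply the change to the in-memory state.
        self.data[key] = value

    def close(self):
        self.log.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "wal.log")
    db = TinyWAL(path)
    db.put("a", 1)
    db.close()
    print(TinyWAL(path).data)
```

Because every write is an append, the log IO stays sequential. A record here carries the whole value; the text suggests BangDB's modification reduces how much is logged per change, which this sketch does not attempt.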
Following are the benefits of write ahead log
BangDB is designed to run in either persistent or in-memory mode. By default it runs in persistent mode, and the log may be switched on or off through configuration; the log is on by default. There are scenarios where in-memory mode is the better fit, for example a cache.
In the majority of cases, however, persistent mode is the most suitable.
When the db runs in in-memory mode, the log is switched off, transactions cannot be enabled, and the db works within the allocated memory budget.
When it runs in persistent mode with the log on, it can go beyond the working memory size and still remain performant. BangDB implements an IO layer and mechanisms to reduce disk IO, especially random IO, to maximise throughput.
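The mode rules above can be summarised as a small check. The names below (`TableConfig`, `effective_config`, and the three fields) are hypothetical and are not BangDB's actual configuration parameters. Treating a transaction request in in-memory mode as an error is also an assumption; the text only says transactions cannot be on.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TableConfig:
    persistent: bool = True     # persistent mode is the default per the text
    log: bool = True            # the log is on by default per the text
    transaction: bool = False   # default assumed for this sketch


def effective_config(cfg):
    """Apply the mode rules and return the configuration actually in force."""
    if cfg.persistent:
        return cfg  # log may be on or off, as configured
    if cfg.transaction:
        raise ValueError("transactions are not available in in-memory mode")
    # In-memory mode: the log is switched off regardless of the setting.
    return replace(cfg, log=False)
```

For example, `effective_config(TableConfig(persistent=False))` returns a configuration with `log=False`, while the default `TableConfig()` is returned unchanged.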
BangDB is designed for very high performance. Reads and writes are both fully concurrent, so it uses the available resources on the machine to their full capacity. It also implements an IO layer that keeps track of several data structures and uses them to decide, or predict, the best next steps for minimising IO.
Here is simple Billion IOPS data on a commodity hardware with less 4 core and 8GB RAM