Architecture - BangDB Server

Introduction

BangDB Server is a standalone process that accepts client connections, processes the requests and sends the responses back to the respective clients. The requests and responses are simple TCP messages with a custom header and body format. This document covers the architecture specific to the server, not the BangDB internals, which are covered in a separate document.

Staged Event Driven

BangDB server follows a form of staged event-driven architecture (SEDA), which is well suited to highly concurrent network servers. This design supports massive concurrency and scales well with load, with low overhead and without committing extra resources upfront. SEDA combines thread-based and event-based programming models to manage concurrency.

Stages

The server runs multiple stages separated by queues, and each stage is responsible for one kind of work. Both the stages and the number of stages are configurable. For example, one stage might listen for incoming requests, another read the message, a third process the received message and a fourth send the response to the client. Because each stage works only from its own queue, the server stays well conditioned and scales well even as load and connection counts grow under heavy stress.
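
To make the idea of stages separated by queues concrete, here is a minimal sketch in C++. It is not BangDB's implementation: the names Event, Stage, enqueue, stop and run_pipeline are assumptions made for illustration. Each stage owns a queue and a small pool of worker threads, and a stage passes work downstream by enqueueing onto the next stage.

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical event type; not BangDB's message format.
struct Event {
    int client_fd;        // connection the event belongs to
    std::string payload;  // bytes read from or destined for the client
};

// Hypothetical stage: one input queue plus a fixed pool of worker threads.
class Stage {
public:
    using Handler = std::function<void(Event&)>;

    Stage(std::string name, unsigned workers, Handler handler)
        : name_(std::move(name)), handler_(std::move(handler)) {
        for (unsigned i = 0; i < workers; ++i)
            threads_.emplace_back([this] { work(); });
    }
    ~Stage() { stop(); }

    const std::string& name() const { return name_; }

    // Called by the upstream stage to hand over an event.
    void enqueue(Event ev) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            queue_.push(std::move(ev));
        }
        cv_.notify_one();
    }

    // Let the workers drain the queue, then join them. Safe to call twice.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& t : threads_)
            if (t.joinable()) t.join();
    }

private:
    void work() {
        for (;;) {
            std::unique_lock<std::mutex> lock(mu_);
            cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
            if (queue_.empty()) return;  // stopping and fully drained
            Event ev = std::move(queue_.front());
            queue_.pop();
            lock.unlock();
            handler_(ev);  // run the handler outside the lock
        }
    }

    std::string name_;
    Handler handler_;
    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<Event> queue_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};

// Pushes n events through a two-stage pipeline and returns how many finished.
int run_pipeline(int n) {
    std::atomic<int> done{0};
    Stage process("process", 2, [&](Event& ev) { ev.payload += "!"; ++done; });
    Stage read("read", 1, [&](Event& ev) { process.enqueue(std::move(ev)); });
    for (int i = 0; i < n; ++i) read.enqueue({i, "msg"});
    read.stop();     // drain upstream first so every event is forwarded
    process.stop();  // then drain downstream
    return done.load();
}
```

Because the only coupling between the two stages is the enqueue call, each stage's thread count can be tuned independently, which is the property the configurable stage layout relies on.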

Finite State Machines

The event-driven approach models every individual task flowing through the system as a finite state machine (FSM). A task is therefore equivalent to the set of its states within the system:

Task = {S0, S1, ..., Sn}

BangDB handles the FSM with multiple stages within the system, so a particular stage can contain, or map to, a subset of the states of the task. That is:

Stage1 = {S0, S1}; Stage2 = {S2, S3, S4}; ...

This avoids the major challenge a typical event-driven system faces in handling the FSM: instead of one monolithic piece responsible for all states, we have multiple loosely coupled stages, each concerned with only a few states. Each stage can take its own time to move a task from one state to the next without affecting, or relying on, the state transitions of other stages.
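
The mapping of states to stages can be sketched as follows. This is an illustration only, using the example split above (Stage1 owns S0 and S1, Stage2 owns S2, S3 and S4); the names State, StageId, owner_of, next_state and run_in_stage are assumptions, not BangDB identifiers.

```cpp
// Hypothetical task states S0..S4 for a single request.
enum class State { S0, S1, S2, S3, S4 };

// Hypothetical stage identifiers.
enum class StageId { Stage1, Stage2 };

// Which stage owns a given state: Stage1 = {S0, S1}, Stage2 = {S2, S3, S4}.
inline StageId owner_of(State s) {
    return (s == State::S0 || s == State::S1) ? StageId::Stage1
                                              : StageId::Stage2;
}

// Linear transition function; S4 is the final state.
inline State next_state(State s) {
    switch (s) {
        case State::S0: return State::S1;
        case State::S1: return State::S2;
        case State::S2: return State::S3;
        case State::S3: return State::S4;
        case State::S4: return State::S4;
    }
    return State::S4;
}

// Advance a task inside one stage until it reaches a state owned by another
// stage (the hand-over point) or the final state. Returns that state.
inline State run_in_stage(StageId stage, State s) {
    while (s != State::S4 && owner_of(s) == stage) s = next_state(s);
    return s;
}
```

Each stage only needs to know its own subset of states; the state returned by run_in_stage is the point at which the task would be placed on the next stage's queue.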

Quick view of a stage

Massive Concurrent Connections

Clients connect to the server, send requests in an ad-hoc manner and wait for the responses. Depending on their configuration, clients can use persistent or transient connections. The server design is flexible and scalable enough to handle more than 10K concurrent clients without adding much overhead, so the server addresses the C10K problem, which makes it a powerful and scalable network data server.

The server uses epoll for I/O multiplexing and handles requests asynchronously. It registers descriptors with the epoll instance in edge-triggered mode for better performance in a client-server scenario. Internally it also uses iqlectEQ, a messaging platform written from scratch for dealing with messages and allowing users to write custom handlers for processing requests. More on iqlectEQ in a separate doc. This configuration, together with the staged event-driven approach, yields a well-conditioned server that scales well and degrades gracefully under high load.
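
For readers unfamiliar with edge-triggered epoll, the following Linux-only sketch shows the usual pattern: the descriptor is made non-blocking, registered with EPOLLIN | EPOLLET, and then read until EAGAIN each time it is reported ready. The helper names (set_nonblocking, watch_edge_triggered, drain, wait_and_read) are hypothetical and are not taken from the BangDB source.

```cpp
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <string>
#include <sys/epoll.h>
#include <unistd.h>

// Put a descriptor into non-blocking mode; edge-triggered use depends on it.
bool set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Register fd with the epoll instance for edge-triggered read readiness.
bool watch_edge_triggered(int epfd, int fd) {
    epoll_event ev{};
    ev.events = EPOLLIN | EPOLLET;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0;
}

// With EPOLLET a readiness change is reported once, so the reader must keep
// reading until the call would block, otherwise data can be left unread.
std::string drain(int fd) {
    std::string out;
    char buf[4096];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            out.append(buf, static_cast<std::size_t>(n));
            continue;
        }
        if (n == -1 && errno == EINTR) continue;  // interrupted, retry
        break;  // 0 = peer closed, EAGAIN = drained, anything else = error
    }
    return out;
}

// Wait for one readiness event and return everything readable on that fd.
// Returns an empty string on timeout or error.
std::string wait_and_read(int epfd, int timeout_ms) {
    epoll_event ev{};
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    return n == 1 ? drain(ev.data.fd) : std::string();
}
```

In a staged design, a polling stage would typically run the epoll_wait loop and hand ready descriptors to a downstream stage rather than reading inline as wait_and_read does here.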

Server initialization code

Introduction

Typical code to create a server instance looks like the following. This is already implemented in the server, and some of its behavior can be changed through bangdb.config, so users do not need to write or worry about it. It is shown purely for the inquisitive reader:

	//create stages, one for polling and other for read/db ops/write
	poll_stage = new stage((char*)"poll_stage", 1);
	all_stage = new stage((char*)"all_stage", num_cpu);

	//slab allocation and eventmanager
	sa = new slaballoc();
	sa->init();
	em = new eventmanager(sa);
	em->init();

	//create handlers
	pev = new poll_eventhandler(em);
	rowev = new read_ops_write_eventhandler(em);

	//register handlers with the stages
	poll_stage->register_handler(pev);
	all_stage->register_handler(rowev);

	//configure the stages as per design
	poll_stage->register_stage((char*)"all_stage", all_stage);
	all_stage->register_stage((char*)"poll_stage", poll_stage);

	//run the stages now
	poll_stage->run_stage();
	all_stage->run_stage();
	

We first create the stages, then our own slab memory allocator and event manager. Next we create the handlers and register them with their respective stages. We then link the stages to each other according to our design. Finally we let the server run.

As you can see, the whole design is flexible, configurable and pluggable. We can extend the capacity or usability of the server by writing our own custom handlers. The idea is to break the whole service into multiple stages and reuse as much as possible to build a complex event-driven application or service.
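
As a rough illustration of what a pluggable handler could look like, the sketch below defines simplified stand-ins for the stage and handler classes. BangDB's real classes are not shown in this document, so everything here is an assumption: the eventhandler interface, the handle and dispatch methods, the forward_handler and upper_handler classes, and the synchronous dispatch (a real stage would enqueue events for its worker threads).

```cpp
#include <cctype>
#include <map>
#include <string>
#include <utility>

// Hypothetical handler interface. handle() returns the name of the stage
// that should see the message next, or "" when processing is finished.
struct eventhandler {
    virtual ~eventhandler() = default;
    virtual std::string handle(std::string& msg) = 0;
};

// Hypothetical, simplified stage: no threads or queues, dispatch is inline.
class stage {
public:
    explicit stage(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }

    void register_handler(eventhandler* h) { handler_ = h; }
    void register_stage(const std::string& name, stage* s) { next_[name] = s; }

    void dispatch(std::string& msg) {
        if (!handler_) return;
        const std::string target = handler_->handle(msg);
        auto it = next_.find(target);
        if (it != next_.end()) it->second->dispatch(msg);
    }

private:
    std::string name_;
    eventhandler* handler_ = nullptr;
    std::map<std::string, stage*> next_;  // linked stages, looked up by name
};

// Custom handler for the first stage: forwards everything downstream.
struct forward_handler : eventhandler {
    std::string handle(std::string&) override { return "all_stage"; }
};

// Custom handler for the second stage: upper-cases the message, then stops.
struct upper_handler : eventhandler {
    std::string handle(std::string& msg) override {
        for (char& c : msg)
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        return "";
    }
};

// Wires two stages together in the same order as the initialization code.
std::string run_two_stage(std::string msg) {
    stage poll_stage("poll_stage"), all_stage("all_stage");
    forward_handler pev;
    upper_handler rowev;
    poll_stage.register_handler(&pev);
    all_stage.register_handler(&rowev);
    poll_stage.register_stage("all_stage", &all_stage);
    all_stage.register_stage("poll_stage", &poll_stage);
    poll_stage.dispatch(msg);
    return msg;
}
```

Swapping upper_handler for a different class changes what the second stage does without touching the stage wiring, which is the kind of reuse the pluggable design aims for.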

Schematic Diagram

The following diagrams show what a two-stage and a four-stage SEDA-based server look like. The first has two stages, poll and all (read/ops/write), while the second has four stages, poll, read, ops and write (the second stage of the first design split into three separate stages). We have observed that there is an optimal number of stages for a given scenario. For BangDB server the user can choose either design, but option 1 (the two-stage design) performs better.

Two stage design





Four stage design


As is clear from the above, the server's stages can currently be configured only through the config file. In future, when we release the messaging framework iqlectEQ, users will have far more control over setting up the right architecture and will also be able to write custom handlers for their own needs.