BangDB replication is set up with a few configs, provided either in bangdb.config or as command line arguments. Based on these configs, BangDB runs in replicated mode as configured. A replicated cluster uses the following two server modes:

  1. Master
  2. Slave

Only one Master node can exist in a cluster, and the maximum number of slaves is configurable (default is 2).

Configurations using bangdb.config file

Edit the bangdb.config file and set the following params as per your environment.

For Master

SERVER_TYPE = 0
ENABLE_REPLICATION = 1
SERVER_ID = 0.0.0.0
SERVER_PUBLIC_IP = <public ip of the server (this server which is master)>
SERV_PORT = 10101 // change it based on the port num you use for master
MASTER_SERVER_ID = <public ip of master>
MASTER_SERV_PORT = <port of master>

For Slave

SERVER_TYPE = 1
ENABLE_REPLICATION = 1
SERVER_ID = 0.0.0.0
SERVER_PUBLIC_IP = <public ip of the server (this server, which is slave)>
SERV_PORT = 10101 // change it based on the port num you use for slave
MASTER_SERVER_ID = <public ip of master>
MASTER_SERV_PORT = <port of master>   

That's it. Now start the Master first and then the slaves.
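The Master and Slave blocks above differ only in SERVER_TYPE and SERVER_PUBLIC_IP, so they are easy to generate. The following is a minimal sketch, not part of BangDB: the function name `render_replication_config` is hypothetical, and it only emits the parameter names and values shown above.

```python
def render_replication_config(role, public_ip, master_ip, master_port, port=10101):
    """Return the replication params for bangdb.config as text.

    role is "master" or "slave", mapped to SERVER_TYPE 0 or 1 as in the
    blocks above. This helper is illustrative and not shipped with BangDB.
    """
    server_type = {"master": 0, "slave": 1}[role]  # KeyError for any other role
    params = [
        ("SERVER_TYPE", server_type),
        ("ENABLE_REPLICATION", 1),
        ("SERVER_ID", "0.0.0.0"),
        ("SERVER_PUBLIC_IP", public_ip),
        ("SERV_PORT", port),
        ("MASTER_SERVER_ID", master_ip),
        ("MASTER_SERV_PORT", master_port),
    ]
    return "\n".join(f"{key} = {value}" for key, value in params)


if __name__ == "__main__":
    # Example: a slave at 192.168.1.105 following a master at 192.168.1.107
    print(render_replication_config("slave", "192.168.1.105", "192.168.1.107", 10101))
```

The output can be pasted into bangdb.config; any other params already in the file are left for you to manage.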

Configurations using command line argument

For Master

./bangdb-server-2.0 -r yes -i master -p <public_ip>

For Slave

./bangdb-server-2.0 -r yes -i slave -p <public_ip> -m <master_public_ip>:<master_port>
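If the servers are launched from a deployment script, the two command lines above can be assembled programmatically. This is a small sketch under the assumption that only the flags shown in this document (-r, -i, -p, -m) are needed; `build_server_command` is a hypothetical helper name.

```python
import shlex


def build_server_command(role, public_ip, master=None, binary="./bangdb-server-2.0"):
    """Build the argv list for starting a replicated server.

    role: "master" or "slave".
    master: (ip, port) tuple of the master, required for a slave.
    Only flags that appear in this document are used.
    """
    if role not in ("master", "slave"):
        raise ValueError(f"unknown role: {role!r}")
    argv = [binary, "-r", "yes", "-i", role, "-p", public_ip]
    if role == "slave":
        if master is None:
            raise ValueError("a slave needs the master's (ip, port)")
        ip, port = master
        argv += ["-m", f"{ip}:{port}"]
    return argv


if __name__ == "__main__":
    argv = build_server_command("slave", "192.168.1.105", master=("192.168.1.107", 10101))
    print(shlex.join(argv))  # shlex.join needs Python 3.8+
```

The list form can be handed directly to `subprocess.Popen`, which avoids shell quoting issues.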

Failover

We can set up failover in the following two ways:

  1. Auto Failover

    This can be enabled by setting the following config param to 1 in the bangdb.config file.

    AUTO_SWITCH_OVER = 1
    // Or by adding "-a yes" to the command line argument

    The Auto switch has two options:

    1. Conservative

      When slaves see that the master is no longer available, all of them turn themselves into masters. It is then up to the user to pick one of these masters and switch all the others to slaves of it.

      This approach is very conservative and tries to maximize the chances of finding another master as soon as possible. However, it involves some manual work to turn the other masters into slaves of the new master.

    2. Optimistic

      When slaves see that the master is no longer available, the slave with SLAVE_ID = 1 becomes the new master. All other slaves then switch themselves to this new master automatically. If the remaining slaves find the new master unreachable (either because the new master has itself become unavailable or because they are unable to connect to it), then one of them turns itself into a new master.

      When the original master comes back, it checks whether there is another master already in the network, and if so, it becomes a slave of the already running master. The orchestration therefore ensures that a master is always available in the cluster, and machines can join the cluster at any time, relying on the cluster to ensure that they join properly and automatically.

      In short: when the slaves find no master in the network, one of them becomes the master. If the old master re-joins, it checks whether a master is already available, and if so, it becomes a slave of the new master. This can keep going without any human intervention. The following example shows how to enable this scenario.

      Let's say we have one master and two slaves.

      Then for slave 1 (192.168.1.107, 10101), set the following in bangdb.config:

      SLAVE_ID = 1
      SLAVE_1_IP = 192.168.1.107
      SLAVE_1_PORT = 10101

      For slave 2 (192.168.1.105, 10101)

      SLAVE_ID = 2 
      SLAVE_1_IP = 192.168.1.107 
      SLAVE_1_PORT = 10101

      That's it! Now run the master and slaves, then take the master down. The following will happen:

      1. Slaves 1 and 2 will notice that the master is not available
      2. Both slaves will retry PING_THRESHOLD times, with PING_FREQ between attempts
      3. Slave 1 (with SLAVE_ID=1) will become the new master after some time
      4. Slave 2 (with SLAVE_ID=2) will try to contact the new master (it also checks the server type to ensure this really is the new master)
      5. Slave 2 may try a few times (PING_THRESHOLD) to contact the new master
      6. If Slave 2 succeeds in contacting the new master and establishing that Slave 1 is the new master, it will switch itself to a slave of the new master
      7. If Slave 2 does not succeed in contacting the new master, it will turn itself into a new master
      8. If the original master then comes back, it will check whether there is another master in the network; if so (in this case, the slave with SLAVE_ID=1 is the new master), it will become a slave of this new master
  2. Custom failover [manual or through script]

    Often we need more complex logic to decide when a slave should become a master, and developers or devops wish to write their own script to manage that logic. They then need a command to switch a slave to a master. In this case, we can use the commands available in the CLI (as described below): the script determines the right time to switch, then uses the command to execute it.
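Whether the switch is automatic or scripted, it helps to see the optimistic auto-failover steps listed above as a small decision function. The sketch below is a model of the documented behavior, not BangDB's implementation. The names `Action`, `optimistic_action` and `returning_master_action` are hypothetical, and the case where a returning master finds no other master is an assumption, since the text does not describe it.

```python
from enum import Enum


class Action(Enum):
    STAY_SLAVE = "stay a slave of the current master"
    BECOME_MASTER = "become the new master"
    FOLLOW_SLAVE_1 = "become a slave of the slave with SLAVE_ID=1"
    STAY_MASTER = "stay master"
    FOLLOW_RUNNING_MASTER = "become a slave of the running master"


def optimistic_action(slave_id, master_reachable, slave1_is_master):
    """What a slave does once its PING_THRESHOLD attempts are exhausted.

    master_reachable: whether the current master answered any ping.
    slave1_is_master: whether the slave with SLAVE_ID=1 could be contacted
                      and reported itself as a master.
    """
    if master_reachable:
        return Action.STAY_SLAVE
    if slave_id == 1:
        return Action.BECOME_MASTER        # step 3
    if slave1_is_master:
        return Action.FOLLOW_SLAVE_1       # step 6
    return Action.BECOME_MASTER            # step 7


def returning_master_action(other_master_found):
    """What the original master does when it comes back (step 8)."""
    if other_master_found:
        return Action.FOLLOW_RUNNING_MASTER
    return Action.STAY_MASTER  # assumption: not spelled out in the text
```

For the two-slave example, `optimistic_action(1, False, False)` yields `BECOME_MASTER` and `optimistic_action(2, False, True)` yields `FOLLOW_SLAVE_1`.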

Server Information (Master and Slave)

  1. Using Commands

    BangDB installs bdbc_s (bdbc for non-ssl) in the /usr/bin folder, so it can be used from anywhere.

    Ping the local server

    bdbc_s -p ping

    Ping remote server

    bdbc_s  -s <ip>:<port> -p ping

    We can run all the commands available in the CLI (discussed below) from the command line as well.

    bdbc_s -c "any commands …"

  2. Using REST API

    Check the server type

    GET https://ip:port/server/type

    Returns

    {"type":"master","slaves":2}

    Or

    {"type":"slave"}

    Check if the server is master

    GET https://ip:port/server/is_master

    Returns

    HTTP error: 400

    {"msg":"request seems to be malformed, may check if requested resource is proper","errcode":400}

    Or

    HTTP status: 200

    {"is_master":1}
  3. Using CLI

    We can use the CLI to check information about the servers. The CLI can also be used to perform other actions required from a sysadmin perspective. Note that all the CLI commands are also available through the bdbc_s command with the -c option.

    Run the cli, connect it to either master or slave

    ./bangdb-cli-2.0 -s <server_ip>:<port> 
    // <server_ip>:<port> is the server you wish to connect to

    To see help for repl:

    help repl
    ++++++++++++++++++++++++++++++++++++++++++ replication ++++++++++++++++++++++++++
    server repl state change command
    ________________________________
    register master where server = ip:port
    register slave where server = ip:port and master = ip:port
    show servertype
    show servertype where server = ip:port

    etc… please visit www.bangdb.com/developer for more info.

    Check server type

    show servertype
    server [ 192.168.1.107 : 10101 ] is master with num of slaves = 1

    Check server type for another server (not the one to which cli is connected, for ex: some slave)

    show servertype where server = 192.168.1.105:10101
    server [ 192.168.1.105 : 10101 ] is slave

    Now switch the slave to a master. Note that this does not affect the current Master; it only turns the slave into another master. In this case, 192.168.1.105:10101 is the slave, so we will switch it to master.

    register master where server = 192.168.1.105:10101
    successful in switching server [ 192.168.1.105 : 10101 ] to a [ master ]

    Now check the server type of the former slave again (see above)

    show servertype where server = 192.168.1.105:10101
    server [ 192.168.1.105 : 10101 ] is master with num of slaves = 0

    If we check the servertype of the other master (192.168.1.107:10101), we will see that it's still a master.

    Now, switch the other master (192.168.1.107:10101) to a slave of the new master (192.168.1.105:10101)

    register slave where server = 192.168.1.107:10101 and master = 192.168.1.105:10101
    successful in switching server [ 192.168.1.107 : 10101 ] to a [ slave ]

    If you check the logs, you will see that the new slave is now syncing with the master.

    Check the server type of the new slave

    show servertype
    server [ 192.168.1.107 : 10101 ] is slave

    We have completely flipped the master and slave using the CLI commands above.
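A script that drives these commands usually needs to read the reply of `show servertype`. The sketch below parses the reply lines exactly as they appear in the examples above; it assumes that output format is stable, which this document does not guarantee, and `parse_servertype` is a hypothetical helper name.

```python
import re

# Matches e.g. "server [ 192.168.1.107 : 10101 ] is master with num of slaves = 1"
# and      "server [ 192.168.1.105 : 10101 ] is slave"
_SERVERTYPE_RE = re.compile(
    r"server \[ (?P<ip>\S+) : (?P<port>\d+) \] is (?P<type>master|slave)"
    r"(?: with num of slaves = (?P<slaves>\d+))?"
)


def parse_servertype(line):
    """Parse one `show servertype` reply line into a dict, or None if it does not match."""
    match = _SERVERTYPE_RE.search(line)
    if match is None:
        return None
    slaves = match.group("slaves")
    return {
        "ip": match.group("ip"),
        "port": int(match.group("port")),
        "type": match.group("type"),
        # The slave count is only reported for a master
        "slaves": int(slaves) if slaves is not None else None,
    }
```

Returning `None` for unrecognized lines lets the caller treat unexpected output as "unknown" instead of guessing a role.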

    We can use these commands to build our own mechanism for switching servers, depending on the custom workflow and scenarios. A typical approach is a script that checks the health of the servers and, when it sees that the master is no longer available, switches one of the slaves to master and switches the other slaves to slaves of this new master.
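A minimal sketch of such a script's core logic follows. It only produces the CLI commands to run and leaves the health check to the caller. `plan_failover` and the `is_alive` callback are hypothetical names, the address 192.168.1.106 in the example is made up, and picking the first reachable slave as the new master is a design assumption rather than a BangDB rule.

```python
def plan_failover(master, slaves, is_alive):
    """Return the CLI commands needed to replace an unavailable master.

    master, slaves: "ip:port" strings.
    is_alive: callable taking "ip:port" and returning True if the server responds
              (for example a wrapper around `bdbc_s -s <ip>:<port> -p ping`).
    Returns an empty list when no action is needed or possible.
    """
    if is_alive(master):
        return []
    candidates = [server for server in slaves if is_alive(server)]
    if not candidates:
        return []  # nothing reachable to promote
    new_master = candidates[0]
    commands = [f"register master where server = {new_master}"]
    for other in candidates[1:]:
        commands.append(
            f"register slave where server = {other} and master = {new_master}"
        )
    return commands


if __name__ == "__main__":
    alive = {"192.168.1.105:10101", "192.168.1.106:10101"}  # the master is down
    for command in plan_failover(
        "192.168.1.107:10101",
        ["192.168.1.105:10101", "192.168.1.106:10101"],
        lambda server: server in alive,
    ):
        print(command)
```

Each returned string can be run from the CLI or passed to `bdbc_s -c`. In practice `is_alive` should only report a server as down after several consecutive failed pings, so that a single lost reply does not trigger a switch.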

Important Points

  • Start master and then slaves
  • Let slave 1 complete the sync up and then attach another slave
  • Two slaves are good enough for most cases; however, you may add more [MAX_SLAVES controls this]
  • PING_THRESHOLD and PING_FREQ should be kept the same for all slaves and the master [not mandatory but better]
  • Set REPLICA_READ_WRITE=0 [by default it's 0] - this ensures writes happen only on the Master
  • Use cmd [bdbc_s -p ping] to check the health of the server. The server responds only when the DB is up and ready to accept new requests. This is much lighter and faster than a REST GET API [https://ip:port/db or other such APIs]. The overhead is almost 10 times less with cmd
  • Try to keep PING_THRESHOLD = 7 to 10 or more and PING_FREQ = 7 to 10 sec [10 sec is good]
  • BangDB uses a UDP based ping, so it's very lightweight. Since it's a connectionless call, PING_THRESHOLD is needed. Note that BangDB concludes that the master is not available only when it fails to get a response from the master PING_THRESHOLD times consecutively
  • For auto failover with the “optimistic” approach, SLAVE_ID should start with 1, then 2, etc. The slave with SLAVE_ID=1 is the one that becomes master first, and the other slaves (with SLAVE_ID>1) become slaves of the new master (SLAVE_ID=1)
  • SLAVE_1_IP and SLAVE_1_PORT must be provided for all the slaves. SLAVE_1_IP and SLAVE_1_PORT are the ip and port of the slave with SLAVE_ID=1, and so on.
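To get a feel for the PING_THRESHOLD and PING_FREQ recommendations above, the sketch below estimates how long a slave takes to declare the master unavailable and models the "consecutive failures" rule. It is an approximation under the assumption that pings are sent PING_FREQ seconds apart and that any reply resets the count. The names `detection_window_seconds` and `MissCounter` are hypothetical, and per-ping timeouts are ignored.

```python
def detection_window_seconds(ping_threshold, ping_freq_sec):
    """Rough time before a master is declared unavailable."""
    return ping_threshold * ping_freq_sec


class MissCounter:
    """Counts consecutive missed pings; any reply resets the count."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.misses = 0

    def record(self, got_reply):
        """Record one ping result; return True once the master counts as unavailable."""
        self.misses = 0 if got_reply else self.misses + 1
        return self.misses >= self.threshold


if __name__ == "__main__":
    # With the suggested values, detection takes on the order of 100 seconds
    print(detection_window_seconds(10, 10))
```

Lower values shorten the detection window but make a false failover more likely when a few UDP pings are lost.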