bangdb_table – Embedded C++ – BangDB = NoSQL + AI + Stream

bangdb_table

Embedded, C++

There are only a few APIs here to deal with. For simplicity, the APIs are named to help understand what they do.

Here is the convention:

NORMAL_TABLE and PRIMITIVE_TABLE
Key/val operations; val is opaque data, text, a fixed native type, etc. Index is not supported. put(), get(), del(), scan() are the ops.
WIDE_TABLE
Document data; index is supported. put_doc(), scan_doc(), get(), del() are the ops. put_text() & scan_text() are used when we wish to store text with reverse indexing for the entire text. The text is a normal sentence and not necessarily json (mostly not json; for json use put_doc() and scan_doc()).
LARGE_TABLE
Large data, files etc. put_file(), put_large_data(), get_file(), get_large_data(), and a few more apis. Index can't be created; the primary key has to be COMPOSITE type.
Following are the API details:
int closeTable(bangdb_close_type tblCloseType = DEFAULT, char *newname = NULL, bool force_close = false);
This closes the table and returns 0 for success and -1 for error. bangdb_close_type is as defined above. If we wish to give the table a new name and then close it, we can do so, and the db will simply rename it. The table maintains an open reference count, hence it will not close while there are open references to it; it closes when the open reference count reaches 0. However, if we wish to override this behaviour, we must pass force_close = true.
int addIndex(const char *idxName, table_env *tenv);
This is the generic api for adding index for a table. The table_env describes the index type etc. It returns 0 for success and -1 for error.
// always creates index with the following properties: BTREE, INMEM_PERSIST, QUASI_LEXICOGRAPH, SORT_ASCENDING, log = off
int addIndex_str(const char *idxName, int idx_size, bool allowDuplicates);
This is a special helper function for index creation. When we wish to add an index for a string type, we use this method. idx_size is the size in bytes allowed for the index keys. allowDuplicates sets whether duplicate values are allowed. It returns -1 for error and 0 for success.
int addIndex_num(const char *idxName, bool allowDuplicates);
This is a special helper function for index creation. When we wish to add an index for a number type, we use this method. allowDuplicates sets whether duplicate values are allowed. It returns -1 for error and 0 for success.
int dropIndex(const char *idxName);
This will drop the index and clear the related data from the file system and database. It returns -1 for error and 0 for success.
bool hasIndex(const char *idxName);
This returns true if the given index is defined for the table, false otherwise.
table_env *getIdx_table_env_ref(const char *indexName, bool copy);
This returns a reference to the table_env of the index with the given index name. If you just need a reference, set copy = false; in this case never delete the returned reference. If you want a copy, you must delete it and free the memory after use. For error it returns NULL.
table_env *get_table_env_ref(bool copy);
This returns the table_env reference for the table. If copy is true then it returns a copy, which the user must delete after use; else it simply returns the reference, which the user should never delete. For error it returns NULL.
int dumpData();
This dumps the data for the table, forcing all data for the table to be written to the filesystem. It returns -1 for error and 0 for success.
const char *getName();
This returns the name of the table; it’s just a reference and the user should never delete it. It returns NULL for error.
const char *getTableDir();
This returns the full table path on the file system, or NULL for error. This is just a reference to the table dir, therefore the user should never delete it.
bangdb_index_type getIndexType();
This returns the index type for the table. The index type is for the primary key of the table. Here are the options for the index type:

HASH, Not Supported
EXTHASH, Supported, for hash keys
BTREE, Supported, for sorted order
HEAP, Deprecated
INVALID_INDEX_TYPE, Invalid type
const char *getStats(bool verbose = true);
This will return a json string with table stats. The verbose flag dictates the level of detail in the response. For errors, it returns NULL.
// for files - supported only for LARGE_TABLE
long put_file(FDT *key, const char *file_path, insert_options iop);
This is only supported for the Large Table type (see bangdb_table_type). We can upload a small or very large file using this api. The key is typically a file id (string only) and file_path is the actual location of the file on the server. As of now it takes a local file path, but a future version may take a network path or url as well. insert_options defines how we wish to put the file or data; the options are:

INSERT_UNIQUE, if non-existing then insert else return
UPDATE_EXISTING, if existing then update else return
INSERT_UPDATE, insert if non-existing else update
DELETE_EXISTING, delete if existing
UPDATE_EXISTING_INPLACE, only for inplace update
INSERT_UPDATE_INPLACE, only for inplace update

The last two options should be used with caution; we will discuss them in more detail later.
This returns 0 for success and -1 for error.
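The first four insert_options amount to a decision table against key existence. As a sketch only, here are those semantics expressed against a std::map standing in for a table — the enum names are from the list above, but the numeric values, the apply() helper, and the map are illustrative stand-ins, not BangDB code (the inplace variants are omitted):

```cpp
#include <map>
#include <string>

// Stand-in enum mirroring the documented insert_options names; the numeric
// values here are illustrative assumptions, not BangDB's actual definitions.
enum insert_options { INSERT_UNIQUE, UPDATE_EXISTING, INSERT_UPDATE, DELETE_EXISTING };

// Applies the documented semantics against a std::map standing in for a table.
// Returns 0 on success, -1 when the option's precondition fails.
int apply(std::map<std::string, std::string> &tbl,
          const std::string &key, const std::string &val, insert_options iop) {
    bool exists = tbl.count(key) != 0;
    switch (iop) {
        case INSERT_UNIQUE:   if (exists)  return -1; tbl[key] = val; return 0;
        case UPDATE_EXISTING: if (!exists) return -1; tbl[key] = val; return 0;
        case INSERT_UPDATE:   tbl[key] = val; return 0;
        case DELETE_EXISTING: if (!exists) return -1; tbl.erase(key); return 0;
    }
    return -1;
}
```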
long get_file(FDT *key, const char *fname, const char *fpath);
This is only supported for the Large Table type (see bangdb_table_type). We can get the file from the server identified by the key, name it fname, and store it at fpath on the local system.
This returns 0 for success and -1 for error.
long put_large_data(FDT *key, FDT *val, insert_options iop);
This is only supported for the Large Table type (see bangdb_table_type). We can use this api to put large binary data (not a file) identified with a key (string only). iop describes the insert options as explained above.
It returns 0 for success and -1 for error
long get_large_data(FDT *key, char **buf, long *len);
This is only supported for Large Table type (see bangdb_table_type). We can use this api to get large data from the table identified with key. The data will be stored in buf and length of the data in len variable.
For success it returns 0 else -1 for error
char *list_large_data_keys(char *skey = NULL, int list_size_mb = MAX_RESULTSET_SIZE);
This returns the list of large-data keys. list_size_mb restricts the list size; the default is MAX_RESULTSET_SIZE. The return value is a json string, and it contains the last key, which should be used for subsequent recursive calls.
int count_slice_large_data(FDT *key);
This is only supported for the Large Table type (see bangdb_table_type). BangDB stores large data in chunks or slices, and this api will help us count the slices for the given data (file or binary data) identified by the key.
It returns the number of slices for success, else -1 for error.
long count_large_data();
This is only supported for Large Table type (see bangdb_table_type). It returns count of large data in the db, else -1 for error
int del_large_data(FDT *key);
This is only supported for Large Table type (see bangdb_table_type). It deletes the large data identified with the key and returns 0 on success and -1 for error.
// for opaque data
long put(FDT *key, FDT *val, insert_options flag = INSERT_UNIQUE, bangdb_txn *txn = NULL);
This is used for the Normal Table type (see bangdb_table_type). It puts key and val into the table. If this put operation is within a transaction boundary then pass the transaction reference as well.
It returns 0 for success and -1 for error
long put(FDT *key, DATA_VAR *val, insert_options flag = INSERT_UNIQUE, bangdb_txn *txn = NULL);
This is used for the Normal Table type (see bangdb_table_type). It puts key and val into the table. If this put operation is within a transaction boundary then pass the transaction reference as well.
It is very similar to the previous put, except that it takes a DATA_VAR for val, where the user can define a few other things and also use a pre-allocated buffer for val. This is useful when we want to avoid too many memory allocations and deallocations on the heap.
It returns 0 for success and -1 for error
resultset *scan(resultset *prev_rs, FDT *pk_skey, FDT *pk_ekey, scan_filter *sf = NULL, DATA_VAR *dv = NULL, bangdb_txn *txn = NULL);
This is used for the Normal Table type (see bangdb_table_type). It scans the data between pk_skey and pk_ekey, the two primary keys. Either or both of these primary keys could be NULL. The scan_filter describes how to scan.
Please note that the prev_rs argument should be NULL for the first call; for subsequent calls it should be the resultset returned by the previous call. This is to ensure that recursive scans work without any issues.
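The prev_rs convention above can be sketched with a stand-in pager in place of a real table. The page type and scan_page() below are hypothetical names for illustration only — the real resultset is opaque and the real call is tbl->scan(prev_rs, ...):

```cpp
#include <algorithm>
#include <vector>

// Stand-in for a resultset page: holds rows plus a cursor so the next call
// can resume where the previous one stopped (illustrative shape only).
struct page {
    std::vector<int> rows;
    size_t next = 0;   // resume position for the subsequent call
    bool more = false; // whether another page remains
};

// Stand-in scan: the first call takes prev == nullptr; each subsequent call
// takes the page returned previously, mirroring the prev_rs convention.
page *scan_page(const std::vector<int> &data, page *prev, size_t page_size) {
    size_t start = prev ? prev->next : 0;
    delete prev; // the previous page has been consumed
    if (start >= data.size()) return nullptr;
    page *p = new page;
    size_t end = std::min(start + page_size, data.size());
    p->rows.assign(data.begin() + start, data.begin() + end);
    p->next = end;
    p->more = end < data.size();
    return p;
}
```

The driving loop is the part that carries over to the real API: pass nullptr first, then keep feeding back the previous result until the scan returns nothing.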
Here is the definition of scan_filter;

scan_operator skey_op; // default GTE
scan_operator ekey_op; // default LTE
scan_limit_by limitby; // default LIMIT_RESULT_SIZE
short only_key = 0; // if we wish to retrieve only key and no value
short reserved = 0; // see notes below;
int limit; // default 2MB (MAX_RESULTSET_SIZE) for LIMIT_RESULT_SIZE
int skip_count; // this is set by the db during scan, don’t touch
void *arg; // any extra arg, interpreted by the callee

reserved
The reserved field has different meanings for different values:
0 - default value, don't do anything [ no interpretation ]
1 to 9 - select the key for the secondary index at this position in the array [ in the order the indexes were defined ]; note this value starts from 1, in code it is from 0 (i-1)
10 - select only the first one from the secondary index [ among duplicates ] for EQ
11 - select only the last one from the secondary index [ among duplicates ] for EQ
12 - interpret arg as a secidx_pkinfo object ptr
13 - use linear scan and don't go through secondary indexes
14 - use partial matches as well; useful for scan_text / reverse-index scan, when we would like to select partial matches too
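Putting the fields together, a scan_filter can be populated as below. The struct mirrors the documented layout and defaults; the enum names and numeric values here are illustrative stand-ins, since the real definitions live in BangDB's headers:

```cpp
// Stand-in enums; the real values come from BangDB's headers.
enum scan_operator { GT, GTE, LT, LTE, EQ };
enum scan_limit_by { LIMIT_RESULT_SIZE, LIMIT_RESULT_ROW };
const int MAX_RESULTSET_SIZE = 2 * 1024 * 1024; // 2MB, per the docs

// Mirrors the documented scan_filter fields and defaults.
struct scan_filter {
    scan_operator skey_op = GTE;               // default GTE
    scan_operator ekey_op = LTE;               // default LTE
    scan_limit_by limitby = LIMIT_RESULT_SIZE; // default
    short only_key = 0;                        // 1 = retrieve keys only
    short reserved = 0;                        // see the reserved table above
    int limit = MAX_RESULTSET_SIZE;            // default 2MB
    int skip_count = 0;                        // set by the db during scan
    void *arg = nullptr;                       // extra arg for the callee
};
```

For example, setting sf.reserved = 13 before a scan asks for a linear scan rather than going through the secondary indexes.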
// for text data, supported for only WIDE_TABLE
// reverse indexes the data (str)
// FDT *key, if null then timestamp
long put_text(const char *str, int len, FDT *k = NULL, insert_options flag = INSERT_UNIQUE);
This api is for the wide table only. It is used to put text which will be fully reverse indexed. The user may provide a key, or else the timestamp will be used as the key.
It returns 0 for success and -1 for error.
resultset *scan_text(const char *wlist[], int nfilters, bool intersect = false);
This is to search using a list of keys/tokens/words. wlist is the list of all tokens to search for, nfilters is the number of tokens in the list, and the intersect boolean tells whether to combine them with OR (false) or AND (true).
It returns resultset for success or NULL for error.
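The OR/AND semantics of intersect can be sketched against plain strings. matches() below is a stand-in for illustration, not the real reverse-index lookup:

```cpp
#include <string>

// Stand-in for the reverse-index match: returns true if the text matches the
// token list with OR (intersect = false) or AND (intersect = true) semantics,
// mirroring the documented behaviour of scan_text().
bool matches(const std::string &text, const char *wlist[], int nfilters, bool intersect) {
    for (int i = 0; i < nfilters; i++) {
        bool found = text.find(wlist[i]) != std::string::npos;
        if (intersect && !found) return false; // AND: every token must appear
        if (!intersect && found) return true;  // OR: any one token suffices
    }
    return intersect; // AND: all matched; OR: none matched
}
```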
long put_doc(const char *doc, FDT *pk = NULL, const char *rev_idx_fields_json = NULL, insert_options flag = INSERT_UNIQUE);
This api is for the wide table only; it puts the json document pointed to by doc. pk is the primary key, if any; rev_idx_fields_json describes the set of fields that should be reverse indexed, if any.

rev_idx_fields_json = {\"_rev_idx_all\":0, \"_rev_idx_key_list\":[\"name\", \"city\"]}

Secondary indexes are defined using the addIndex api as described previously. put_doc updates all indexes accordingly.
If pk is NULL then BangDB uses the timestamp as the key; if rev_idx_fields_json is NULL then it doesn’t do reverse indexing; and the default insert_options is INSERT_UNIQUE.
Upon success, it returns 0 else -1.
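The rev_idx_fields_json argument is just a json string, so it can be assembled inline. make_rev_idx_fields() is a hypothetical helper (the field names name and city come from the example above):

```cpp
#include <string>
#include <vector>

// Builds the reverse-index descriptor shown in the docs, e.g.
// {"_rev_idx_all":0, "_rev_idx_key_list":["name", "city"]}
std::string make_rev_idx_fields(const std::vector<std::string> &fields, bool index_all) {
    std::string out = "{\"_rev_idx_all\":";
    out += index_all ? "1" : "0";
    out += ", \"_rev_idx_key_list\":[";
    for (size_t i = 0; i < fields.size(); i++) {
        if (i) out += ", ";
        out += "\"" + fields[i] + "\"";
    }
    out += "]}";
    return out;
}
```

The resulting string would then be passed as the rev_idx_fields_json argument of put_doc().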
resultset *scan_doc(resultset *prev_rs, FDT *pk_skey = NULL, FDT *pk_ekey = NULL, const char *idx_filter_json = NULL, scan_filter *sf = NULL);
This is used for the wide table only. It’s used for scanning the table with a query.
The query can combine the primary index, secondary indexes and reverse indexes (through idx_filter_json).
idx_filter_json can be written directly as a json query, or the dataQuery type can be used to create the query in a simple manner. Here is how a query looks:

"{\"query\":[{\"key\":\"city.name\",\"cmp_op\":4,\"val\":\"paris\"},{\"joinop\":0},{\"match_words\":\"sachin, rahul\",\"joinop\":1,\"field\":\"name.first\"}]}"

This query combines the secondary index “city.name” and the reversed index “name.first”. joinop = 0 means AND; therefore, fetch all the documents where the name of the city is paris and the first name contains sachin or rahul.

Or
{\"query\":[{\"key\":\"name\",\"cmp_op\":4,\"val\":\"sachin\"},{\"joinop\":0},{\"key\":\"age\",\"cmp_op\":0,\"val\":40}]}

Here the query is: find all documents where name is “sachin” and age is greater than 40. Both name and age are secondary indexes. We don’t use a reversed index here.
Further, we can query the following:

{\"query\":[{\"key\":\"price\",\"cmp_op\":3,\"val\":\"$quote\"}], \"qtype\":2}

Here the query says: find all documents where price is less than the quote in the same doc.
Please see the query section for a detailed discussion on this.
Upon success it returns resultset else NULL for error
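Since idx_filter_json is a plain json string, a query like the second example above can be assembled directly. build_name_age_query() is a hypothetical helper; the cmp_op and joinop numbers are copied from the documented examples, whose enum meanings are inferred there rather than defined here:

```cpp
#include <sstream>
#include <string>

// Assembles the documented example query: name == "sachin" AND age > 40.
// cmp_op 4 / cmp_op 0 / joinop 0 are taken verbatim from the docs' examples.
std::string build_name_age_query(const std::string &name, int age) {
    std::ostringstream q;
    q << "{\"query\":["
      << "{\"key\":\"name\",\"cmp_op\":4,\"val\":\"" << name << "\"},"
      << "{\"joinop\":0},"
      << "{\"key\":\"age\",\"cmp_op\":0,\"val\":" << age << "}"
      << "]}";
    return q.str();
}
```

The returned string would be passed as the idx_filter_json argument of scan_doc().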
int get(FDT *key, FDT **val, bangdb_txn *txn = NULL);
This could be used for any table except large table. Given a key, it will return value in val attribute. This returns 0 for success and -1 for error
int get(FDT *key, DATA_VAR *val, bangdb_txn *txn = NULL);
This can be used for any table except the large table. Given a key, it returns the value in the val attribute. Note that val is a DATA_VAR here, which can be used to avoid creating (and later deleting) too many objects on the heap. This returns 0 for success and -1 for error.
long del(FDT *key, bangdb_txn *txn = NULL);
This could be used for all table types. It deletes the data defined by key. It returns 0 for success else -1 for error.
long count(FDT *pk_skey, FDT *pk_ekey, const char *idx_filter_json = NULL, scan_filter *sf = NULL);
We can count the number of documents or rows using this method with the supplied query filter. It can take the primary index, secondary indexes and reversed index all together, or as needed. It returns the count if successful, else -1 for error.
long exp_count(FDT *skey, FDT *ekey);
This api returns the expected count between two keys. Please note this is not an exact count but a rough estimate. If there are a large number of keys in the table and we wish to know a rough estimate of the count, then this function can be very efficient and fast, with very little overhead. Returns the count if successful, else -1 for error.
long count();
This is a convenience function, a special case of the previous count() api, which counts all the documents or rows in the table. It works for normal and wide tables.
void print_stats();
This api prints stats of the table.
void setAutoCommit(bool flag);
This is used if we wish to enable auto commit for single operation.
long get_next_lsn();
This is to get the lsn (log sequence number) for the wal (write-ahead log). We will discuss this later in the WAL section.
bool isSameAs(bangdb_table *tbl);
Returns true if this table is same as the given table, else returns false
bangdb_table_type get_table_type();
This returns table type, more covered in bangdb_table_type subsequently
static void add_double_to_fdt(double d, FDT *f);
Helper function to create an FDT from a double.
static void add_long_to_fdt(long d, FDT *f);
Helper function to create an FDT from a long.
static void add_string_to_fdt(const char *d, FDT *f);
Helper function to create an FDT from a char pointer.