Table Type Concepts

BangDB supports the following types of tables:

  • Normal Table
  • Primitive Table
  • Wide Table

Normal Table

The normal table is for simple keys and values, where the value is always treated as opaque data. Secondary indexes are not supported on a normal table. By default the key is also opaque; however, the user can override this behaviour and use long as the key type.

BangDB creates only a primary index for a normal table. The layout of the index depends on the index type defined by the user in table_env or in the bangdb.config file (BANGDB_INDEX_TYPE). The allowed index types for a normal table are Btree and Hash.

A normal table can be created or opened using the gettable() API of the database, which returns a pointer to the table on success. To work on a normal table, the user needs to get a connection to it. The API for getting a connection to a normal table is getconnection().

A normal table is thus suitable when the key is of long type or fixed-size string type, the value is of variable-size opaque type, and the primary index on the key is sufficient with no further overhead required. The value is always retrieved through the given key, through a range of keys, or by scanning the whole db.
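The access pattern described above can be illustrated with a small stand-alone model. This is not the BangDB API: NormalTableModel, put, get and scan_range are hypothetical names chosen for this sketch, and a std::map stands in for a Btree primary index over long keys with opaque byte-string values.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a normal table (not the BangDB API):
// long keys, opaque values, and one ordered primary index.
class NormalTableModel {
public:
    // Store or overwrite the opaque value for a key.
    void put(long key, std::string value) { index_[key] = std::move(value); }

    // Point lookup through the primary index; returns false if the key is absent.
    bool get(long key, std::string &out) const {
        auto it = index_.find(key);
        if (it == index_.end()) return false;
        out = it->second;
        return true;
    }

    // Range scan over keys in [lo, hi], returned in key order.
    std::vector<std::pair<long, std::string>> scan_range(long lo, long hi) const {
        std::vector<std::pair<long, std::string>> out;
        for (auto it = index_.lower_bound(lo);
             it != index_.end() && it->first <= hi; ++it)
            out.emplace_back(it->first, it->second);
        return out;
    }

private:
    std::map<long, std::string> index_;  // plays the role of the Btree index
};
```

In this model the value is never inspected, only stored and returned, which mirrors the "opaque value, primary index only" contract of the normal table.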

Primitive Table

The primitive table is also a simple key and value store, except that both key and value are of fixed size. The value is always of int or long type, whereas the key can be of int, long or fixed-size string (char*) type. This is a special use case, but it occurs very often when doing analysis or storing native data types. Note that the user may still use a normal table to store such keys and values, but the primitive table is a much better option in terms of both resource usage and performance.

A primitive table can be created or opened using the getPrimitiveTable() API of the database, which returns a pointer to the table object. Note that a primitive table should always be created using the getPrimitiveTable() API and not by defining bangdb_table_type in the table_env. In fact, the user should never set this property on table_env. The getPrimitiveTable() API requires the user to specify bangdb_primitive_data_type, which states the primitive data type the table is being created for.

Note that getPrimitiveTable() returns a pointer to a plain table object, as there is no separate type defined for the primitive table. However, the user needs to get a connection to the primitive table using the getPrimConnection() API of the table, which returns a pointer to a primConnection object.
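As an illustration of the kind of workload a primitive table targets, the following stand-alone sketch models a store with long keys and long values used as counters. It does not use the BangDB API: PrimTableModel and its methods are hypothetical names, and std::map is only a stand-in for the real index.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for a primitive table (not the BangDB API):
// both key and value are fixed-size native types.
class PrimTableModel {
public:
    // Store or overwrite the value for a key.
    void put(long key, long value) { index_[key] = value; }

    // Point lookup; returns false if the key is absent.
    bool get(long key, long &out) const {
        auto it = index_.find(key);
        if (it == index_.end()) return false;
        out = it->second;
        return true;
    }

    // Add delta to the stored value and return the new value.
    // A missing key starts from 0, which suits counter-style analysis.
    long increment(long key, long delta) { return index_[key] += delta; }

private:
    std::map<long, long> index_;
};
```

Because every record has the same size, a store like this needs no per-record length information, which is one plausible reason a dedicated primitive table can be cheaper than a normal table holding the same data.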

Wide Table

The wide table is a key-value store that also allows multiple indexes to be created on the data. The data can be structured or unstructured, and indexes can be created in both cases. As of BangDB 1.5, an unlimited number of indexes can be created for json (structured) data, but only a single secondary index can be created for opaque data. This limitation for opaque data is expected to be removed in an upcoming minor release. An upcoming release is also expected to add column family support to the wide table.

The key type can again be fixed-size string/opaque or long, as defined in table_env (the default is fixed-size string/opaque). Indexes on data fields can be created for string type or for int/long type.

A wide table can be created using the getWideTable() API of the database, which returns a pointer to a wideTable object. To get a connection to the table, call getconnection() of wideTable, which returns a pointer to a wideConnection object.

The wideTable is similar to the table object, except that it has APIs for creating and dropping indexes. The API for creating an index is addIndex() and the API for dropping one is dropIndex(). Two helper APIs for adding indexes, addIndex_str() and addIndex_num(), are also provided and cover many of the common cases.

The wideConnection is very similar to the connection object (of the normal table), except that it has APIs for put and scan using an index. For clarity, the wideConnection provides all the put and scan APIs for document (structured) data with "doc" suffixed to the actual name, for example put_doc() or scan_doc(). The user should use these APIs when dealing with documents or json data.
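The idea of putting documents and then scanning through a secondary index can be sketched with a small stand-alone model. Everything here is an assumption made for illustration: WideTableModel, add_index, put_document and scan_by_index are hypothetical names with invented signatures, documents are flattened to field/value string pairs, and the model is insert-only. It is not the wideTable or wideConnection API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of a wide table (not the BangDB API): a primary map
// from key to a flat document, plus any number of secondary indexes.
class WideTableModel {
public:
    using Doc = std::map<std::string, std::string>;  // field name -> value

    // Register an index on a field and index the documents already stored.
    void add_index(const std::string &field) {
        auto &idx = secondary_[field];
        idx.clear();
        for (const auto &kv : primary_) {
            auto f = kv.second.find(field);
            if (f != kv.second.end()) idx.emplace(f->second, kv.first);
        }
    }

    // Insert a document and update every registered index that covers
    // one of its fields. Returns false if the key already exists.
    bool put_document(const std::string &key, const Doc &doc) {
        if (!primary_.emplace(key, doc).second) return false;
        for (auto &entry : secondary_) {
            auto f = doc.find(entry.first);
            if (f != doc.end()) entry.second.emplace(f->second, key);
        }
        return true;
    }

    // Return the primary keys of documents whose indexed field equals value.
    std::vector<std::string> scan_by_index(const std::string &field,
                                           const std::string &value) const {
        std::vector<std::string> keys;
        auto idx = secondary_.find(field);
        if (idx == secondary_.end()) return keys;  // no index on this field
        auto range = idx->second.equal_range(value);
        for (auto it = range.first; it != range.second; ++it)
            keys.push_back(it->second);
        return keys;
    }

private:
    std::map<std::string, Doc> primary_;
    // field name -> (field value -> primary key)
    std::map<std::string, std::multimap<std::string, std::string>> secondary_;
};
```

The point of the sketch is that the caller only declares which fields are indexed; keeping each index in step with the stored documents happens inside the put path.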

The wide table is useful when more than a single primary key index is required. With json data the user can define any number of indexes, and they may be nested as well. Once the indexes are defined, the user may keep adding data to the table and the db in turn ensures that the index entries are created. To retrieve data, the user may call scan() with appropriate index values and get a result set back. Note that the user may call scan on different indexes and get multiple result sets, which can then be combined to produce the final result. The supported operations are add, append and intersect: add merges two result sets without adding the duplicate (common) entries, append simply concatenates the two result sets without regard to duplicates, and intersect keeps only the entries common to both.
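The semantics of the three result set operations can be made concrete with a short stand-alone sketch. It assumes a result set can be modelled as a list of keys; ResultSet, rs_add, rs_append and rs_intersect are hypothetical names for this illustration and are not the BangDB result set API.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical model: a result set is a list of matching keys.
using ResultSet = std::vector<std::string>;

// add: all of a, followed by the entries of b not already present.
ResultSet rs_add(const ResultSet &a, const ResultSet &b) {
    ResultSet out = a;
    std::unordered_set<std::string> seen(a.begin(), a.end());
    for (const auto &k : b)
        if (seen.insert(k).second) out.push_back(k);
    return out;
}

// append: plain concatenation, duplicates are kept.
ResultSet rs_append(const ResultSet &a, const ResultSet &b) {
    ResultSet out = a;
    out.insert(out.end(), b.begin(), b.end());
    return out;
}

// intersect: only the entries of a that also occur in b, in a's order.
ResultSet rs_intersect(const ResultSet &a, const ResultSet &b) {
    std::unordered_set<std::string> in_b(b.begin(), b.end());
    ResultSet out;
    for (const auto &k : a)
        if (in_b.count(k)) out.push_back(k);
    return out;
}
```

For example, with a = {k1, k2, k3} and b = {k2, k4}, add yields {k1, k2, k3, k4}, append yields {k1, k2, k3, k2, k4}, and intersect yields {k2}.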

To summarize, the following table may be helpful:

table type       table object  connection object  key type                        value type                             secondary index
normal table     table         connection         fixed (string or long type)     variable (string or opaque)            no
primitive table  table         primConnection     fixed (string, int, long type)  fixed (int or long type)               no
wide table       wideTable     wideConnection     fixed (string or long type)     variable (structured or unstructured)  yes