Filter for scan and data retrieval in BangDB
For scanning data in BangDB, we may use primary-key based scan, secondary-key based scan, text-key (reverse index) based scan, or all of these together. This makes data scan a very robust and flexible process. To help users define these queries, BangDB provides the dataQuery type. Using this type is not required; we could simply write the query in JSON form and operate with that.
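To make the "write the query in JSON form" option concrete, here is a small sketch of composing a single-predicate query as plain JSON. The field names (`key`, `cmp_op`, `val`) and the overall shape are illustrative assumptions, not the exact schema BangDB expects, which may vary by version:

```python
import json

# Hypothetical helper mirroring what a dataQuery-style builder might produce.
# The JSON shape here is an assumption for illustration only.
def build_query(attr, op, value):
    """Return a single-predicate query as a JSON string."""
    return json.dumps({"query": [{"key": attr, "cmp_op": op, "val": value}]})

print(build_query("name", "EQ", "sachin"))
```

The point is simply that a query is data: it can be built, logged, and sent as JSON whether or not a helper type is used.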
Scan always returns a resultset, which is an iterable list of key/value pairs that supports certain operations as well. This list is defined by the type resultset.
Scan may also return NULL if an error is encountered, hence the user must handle NULL as well.
Since a table or stream may contain a large amount of data, the scan will not return all of it at once; instead it keeps returning data as required, i.e. as repeatedly called by the user.
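The "keep returning as called" behavior can be pictured as a simple paged iteration. The names here (`scan_page`, `scan_all`) are hypothetical stand-ins, not the actual resultset API:

```python
# Simulated paged scan: the store returns at most `limit` rows per call,
# and the caller keeps asking until nothing is left. Names are hypothetical;
# the real resultset iteration API differs.
def scan_page(rows, start, limit):
    """Return one page of (key, val) pairs beginning at `start`."""
    return rows[start:start + limit]

def scan_all(rows, limit=2):
    """Drain the table by repeatedly fetching pages, like repeated scan calls."""
    out, pos = [], 0
    while True:
        page = scan_page(rows, pos, limit)
        if not page:          # empty result ends the loop
            break
        out.extend(page)
        pos += len(page)
    return out

table = [("k1", "v1"), ("k2", "v2"), ("k3", "v3")]
print(scan_all(table))  # all rows, fetched two at a time
```

The caller never assumes one call returns everything; it loops until the scan is exhausted.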
The user may also set limits and certain other conditions for filtering. These affect the way data is retrieved and also the amount of data retrieved, both in terms of number of rows and size of the data. This is defined by ScanFilter;
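Limiting "by number of rows or by size of the data" can be sketched as below. The parameter names (`limit_rows`, `limit_bytes`) are illustrative assumptions, not the actual ScanFilter fields:

```python
# Sketch of how a ScanFilter-style limit could bound a result set either by
# number of rows or by total size in bytes. Field names are illustrative,
# not BangDB's actual ScanFilter definition.
def apply_filter(rows, limit_rows=None, limit_bytes=None):
    out, size = [], 0
    for key, val in rows:
        if limit_rows is not None and len(out) >= limit_rows:
            break
        size += len(key) + len(val)
        if limit_bytes is not None and size > limit_bytes:
            break
        out.append((key, val))
    return out

rows = [("k1", "x" * 10), ("k2", "x" * 10), ("k3", "x" * 10)]
print(apply_filter(rows, limit_rows=2))    # stops after two rows
print(apply_filter(rows, limit_bytes=25))  # stops once ~25 bytes are reached
```

Either bound ends the scan early, which is what keeps large tables from being pulled over in one shot.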
Let's look at the typical scan API in BangDB. It has the following signatures;
For WIDE_TABLE or non-primary-key based scan, a detailed discussion follows below;
Non primary key based scan
Apart from primary keys, we can use secondary and text (reverse) keys to query data. Creating indexes on these secondary keys boosts performance, but an index is not required for querying data using these non-primary keys. However, it's highly recommended to strategically create these secondary and reverse indexes for high performance and efficient queries.
Now, let's see what these secondary keys are. Let's consider a sample event or doc/data;
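The original sample doc is not reproduced here; a hypothetical document consistent with the fields used in the queries below (this is an assumed example, not the doc from the source) could look like:

```json
{
  "name": "sachin",
  "org": "bangdb",
  "address": {
    "home":   { "city": "bangalore" },
    "office": { "city": "bangalore" }
  },
  "about": "quality of thought and execution"
}
```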
query1 = using "name"; e.g. where "name" = "sachin", etc.
query2 = using "address.home.city" = "bangalore"
query3 = using text match, like "quality, thought" [ note: we use the reverse index here, searching with a list of tokens ]
and so on...

Further, we may wish to organize a key in a composite manner to suit the use case. In this doc we have the primary key as long, string, or composite, and we can then query using the primary key in interesting ways. While long, opaque, and string primary keys are straightforward, the composite key is quite interesting and useful in many scenarios.

Let's say we wish to have a composite primary key with the following arrangement: city:name or city:org:name, etc. Now we have quite a lot of flexibility to query in different manners;

query4 = find all docs where city could be any city but name is "sachin"; here we may use *:sachin
Or where name has "sac" as its initial characters: *:sac$% [ $% means match everything before these chars but after ':' ]
Or any city and any name as long as org is "bangdb": *:bangdb:* [ using city:org:name as the key arrangement ]

query5 = find all docs where home.city is equal to office.city: home.city = $office.city
This allows users to scan with data present in the doc itself [ helpful in streams ]
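The composite-key patterns above can be mimicked with a small matcher, where `*` matches any one segment and a segment ending in `$%` matches by prefix. This is a behavioral sketch of the pattern semantics described in the text, not BangDB's actual implementation:

```python
def match_composite(pattern, key, sep=":"):
    """Match a composite key like 'bangalore:sachin' against patterns such as
    '*:sachin' (any city), '*:sac$%' (name starts with 'sac'), or '*:bangdb:*'
    (any city, any name, org 'bangdb' with a city:org:name arrangement).
    '*' matches any whole segment; a segment ending in '$%' is a prefix match.
    Behavioral sketch only, not BangDB's matcher."""
    pparts, kparts = pattern.split(sep), key.split(sep)
    if len(pparts) != len(kparts):
        return False
    for p, k in zip(pparts, kparts):
        if p == "*":
            continue
        if p.endswith("$%"):
            if not k.startswith(p[:-2]):
                return False
        elif p != k:
            return False
    return True

print(match_composite("*:sachin", "bangalore:sachin"))     # True
print(match_composite("*:sac$%", "mumbai:sachin"))         # True
print(match_composite("*:bangdb:*", "pune:bangdb:anita"))  # True
print(match_composite("*:sachin", "pune:bangdb:anita"))    # False
```

Arranging the key as city:org:name (or any other order) decides which of these wildcard queries are cheap, so the arrangement should follow the dominant access pattern.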