Scan Operation and Range Query

In the previous section we reviewed the BangDB objects and how to work with them. Using those objects, we will now perform some operations. This section reviews the *scan* operation.

NOTE: Scans over documents and scans using indexes are not covered here; those will be covered in

Scan

The overview of scan is given below

API (C++) - without transaction

resultset* scan(const char *skey, const char *ekey, scan_filter *sf = NULL);

resultset* scan(FDT *skey, FDT *ekey, scan_filter *sf = NULL);

resultset* scan(LONG_T skey, LONG_T ekey, scan_filter *sf = NULL);

resultset* scan(int skey, int ekey, scan_filter *sf = NULL);

resultset* scan(const char *skey, const char *ekey, scan_filter *sf, DATA_VAR *dv);

resultset* scan(FDT *skey, FDT *ekey, scan_filter *sf, DATA_VAR *dv);

resultset* scan(LONG_T skey, LONG_T ekey, scan_filter *sf, DATA_VAR *dv);

API (C++) - with transaction

resultset* scan(const char *skey, const char *ekey, bangdb_txn *txn_handle, scan_filter *sf = NULL);

resultset* scan(FDT *skey, FDT *ekey, bangdb_txn *txn_handle, scan_filter *sf = NULL);

resultset* scan(LONG_T skey, LONG_T ekey, bangdb_txn *txn_handle, scan_filter *sf = NULL);

resultset* scan(const char *skey, const char *ekey, bangdb_txn *txn_handle, scan_filter *sf, DATA_VAR *dv);

resultset* scan(FDT *skey, FDT *ekey, bangdb_txn *txn_handle, scan_filter *sf, DATA_VAR *dv);

resultset* scan(LONG_T skey, LONG_T ekey, bangdb_txn *txn_handle, scan_filter *sf, DATA_VAR *dv);

FDT is the data type used for defining keys and values. Its structure is as follows:

	struct FDT
	{
		void *data;
		DATALEN_T len;	//DATALEN_T is unsigned int
		void free();
		void copy(FDT *v);
		~FDT();
	};
	

resultset is the scrollable, iterable cursor returned by the scan method. It is discussed in more detail separately.

scan_filter lets you add operators to the query and put limits on the returned resultset. Here is what scan_filter looks like:

	scan_operator skey_op;	//default GTE
	scan_operator ekey_op;	//default LTE
	scan_limit_by limitby;	//default LIMIT_RESULT_SIZE
	int limit;			//default 2MB (MAX_RESULTSET_SIZE)
	int skip_count;	//this is filled by db, user should leave it
	

scan_operator defines the less than, less than or equal to, greater than, and greater than or equal to behaviour to follow for the range query. Here is what it looks like:

	GT,	//greater than
	GTE,	//greater than or equal to - default for skey_op
	LT,	//less than
	LTE,	//less than or equal to - default for ekey_op
	

Another interesting type is scan_limit_by. A range query can retrieve a large amount of data, and often we do not want to deal with all of it at once. Hence we can limit the amount of data returned by the scan.

There are two ways of limiting the amount of data:

  • limit by size of the resultset, i.e. the overall size of the data returned
  • limit by number of rows

Here is what scan_limit_by looks like:

	LIMIT_RESULT_SIZE,	//max size of data that should be returned; the limit is given in bytes (e.g. 1*1024*1024 for 1 MB)
	LIMIT_RESULT_ROW,	//number of rows (max) that should be returned
	

API (C#) - without transaction

ResultSet Scan(byte[] skey, byte[] ekey, ScanFilter sf = null)

ResultSet Scan(string skey, string ekey, ScanFilter sf = null)

ResultSet Scan(long skey, long ekey, ScanFilter sf = null)

API (C#) - with transaction

ResultSet Scan(byte[] skey, byte[] ekey, Transaction txn, ScanFilter sf = null)

ResultSet Scan(string skey, string ekey, Transaction txn, ScanFilter sf = null)

ResultSet is the scrollable, iterable cursor returned by the scan method. It is discussed in more detail separately.

ScanFilter lets you add operators to the query and put limits on the returned resultset. Here is what ScanFilter looks like:

	ScanOperator skeyOp;	//default GTE
	ScanOperator ekeyOp;	//default LTE
	ScanLimitBy limitBy;	//default LimitResultSizeByte
	int limit;	//default 2MB (MAX_RESULTSET_SIZE)
	

ScanOperator defines the less than, less than or equal to, greater than, and greater than or equal to behaviour to follow for the range query. Here is what it looks like:

	GT,	//greater than
	GTE,	//greater than or equal to - default for skeyOp
	LT,	//less than
	LTE,	//less than or equal to - default for ekeyOp
	

Another interesting type is ScanLimitBy. A range query can retrieve a large amount of data, and often we do not want to deal with all of it at once. Hence we can limit the amount of data returned by the scan.

There are two ways of limiting the amount of data:

  • limit by size of the resultset, i.e. the overall size of the data returned
  • limit by number of rows

Here is what ScanLimitBy looks like:

	LimitResultSizeByte, 	//defines the bytes of data that should be returned (max)
	LimitResultRow,	 	//number of rows (max) that should be returned
	

API (Java) - without transaction

public ResultSet scan(long skey, long ekey, ScanFilter sf);

public ResultSet scan(String skey, String ekey, ScanFilter sf);

public ResultSet scan(byte[] skey, byte[] ekey, ScanFilter sf);

public ResultSet scan(String skey, String ekey, ScanFilter sf, DataVar dv);

public ResultSet scan(byte[] skey, byte[] ekey, ScanFilter sf, DataVar dv);

public ResultSet scan(long skey, long ekey, ScanFilter sf, DataVar dv);

API (Java) - with transaction

public ResultSet scan(String skey, String ekey, Transaction txn, ScanFilter sf);

public ResultSet scan(byte[] skey, byte[] ekey, Transaction txn, ScanFilter sf);

public ResultSet scan(long skey, long ekey, Transaction txn, ScanFilter sf);

public ResultSet scan(String skey, String ekey, Transaction txn, ScanFilter sf, DataVar dv);

public ResultSet scan(byte[] skey, byte[] ekey, Transaction txn, ScanFilter sf, DataVar dv);

public ResultSet scan(long skey, long ekey, Transaction txn, ScanFilter sf, DataVar dv);

ResultSet is the scrollable, iterable cursor returned by the scan method. It is discussed in more detail separately.

ScanFilter lets you add operators to the query and put limits on the returned resultset. Here is what ScanFilter looks like:

	SCANOPERATOR skeyOp;	//default GTE
	SCANOPERATOR ekeyOp;	//default LTE
	SCANLIMITBY limitBy;	//default LIMIT_RESULT_SIZE
	int limit;	//default 2MB (MAX_RESULTSET_SIZE)
	int skipCount;	//this is filled by db, user should leave it
	

SCANOPERATOR defines the less than, less than or equal to, greater than, and greater than or equal to behaviour to follow for the range query. Here is what it looks like:

	GT,	//greater than
	GTE,	//greater than or equal to - default for skeyOp
	LT,	//less than
	LTE,	//less than or equal to - default for ekeyOp
	

Another interesting type is SCANLIMITBY. A range query can retrieve a large amount of data, and often we do not want to deal with all of it at once. Hence we can limit the amount of data returned by the scan.

There are two ways of limiting the amount of data:

  • limit by size of the resultset, i.e. the overall size of the data returned
  • limit by number of rows

Here is what SCANLIMITBY looks like:

	LIMIT_RESULT_SIZE, 	//defines the bytes of data that should be returned (max)
	LIMIT_RESULT_ROW,	 	//number of rows (max) that should be returned
	

Typical use of scan is as follows

Range query between two arbitrary keys. These keys could be exact or partial keys

	
	char *k1 = new char[17];
	memcpy(k1, "partial start key", 17);
	
	char *k2 = new char[19];
	memcpy(k2, "the partial end key", 19);
	
	FDT *sk = new FDT(k1, 17);
	FDT *ek = new FDT(k2, 19);
	
	resultset *rs = NULL;
	scan_filter sf;
	
	//let's override the default way of scanning
	sf.skey_op = GT;
	sf.ekey_op = LT;
	sf.limitby = LIMIT_RESULT_SIZE;
	sf.limit = 1*1024*1024;	//1 MB
	
	
	while(true)
	{
		rs = conn->scan(sk, ek, &sf);
		if(rs)
		{
			while(rs->hasNext())
			{
				printf("%.*s, %.*s\n", rs->getNextKey()->length, (char*)rs->getNextKey()->data, rs->getNextVal()->length, (char*)rs->getNextVal()->data);
				rs->moveNext();
			}
			
			if(!rs->moreDataToCome())
			{
				rs->clear();
				delete rs;
				break;
			}
				
			sk->free();
			delete sk;
			
			sk = rs->lastEvaluatedKey();
			
			rs->clear();
			delete rs;
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}
	
	sk->free();
	ek->free();
	delete sk;
	delete ek;
	

The same can be done with char* keys, as follows:

	//one extra byte per key so that the strings are null terminated
	char *sk = new char[18];
	memcpy(sk, "partial start key", 18);
	
	char *ek = new char[20];
	memcpy(ek, "the partial end key", 20);
	
	resultset *rs = NULL;
	scan_filter sf;
	
	//let's override the default way of scanning
	sf.skey_op = GT;
	sf.ekey_op = LT;
	sf.limitby = LIMIT_RESULT_ROW;
	sf.limit = 1000;	//1000 rows
	
	while(true)
	{
		rs = conn->scan(sk, ek, &sf);
		if(rs)
		{
			while(rs->hasNext())
			{
				printf("%s, %s\n", rs->getNextKeyStr(), rs->getNextValStr());
				rs->moveNext();
			}
			
			if(!rs->moreDataToCome())
			{
				rs->clear();
				delete rs;
				break;
			}
				
			delete[] sk;
			
			FDT *lk = rs->lastEvaluatedKey();
			sk = (char*)lk->data;
			delete lk;
			
			rs->clear();
			delete rs;
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}
	
	delete[] sk;
	delete[] ek;
	
	

The same can be done with long keys, as follows:

Note that to work with a normal or wide table with long keys, set NORMAL_KEY_LONG as the key type in the table_env.

	LONG_T sk = 10, ek = 20;
	
	resultset *rs = NULL;
	scan_filter sf;
	
	while(true)
	{
		rs = conn->scan(sk, ek, &sf);
		if(!rs)
			break;	//scan returned no resultset, nothing to read
		if(rs->count() != 11 || rs->count() != conn->count(10, 20))
			bangdb_logger("mismatch in count");
			
		while(rs->hasNext())
		{
			printf("key = %ld, val = %.*s\n", rs->getNextKeyLong(), rs->getNextVal()->length, (char*)rs->getNextVal()->data);
			rs->moveNext();
		}

		if(!rs->moreDataToCome())
		{
			rs->clear();
			delete rs;
			break;
		}

		sk = rs->lastEvaluatedKeyLong();
		rs->clear();
		delete rs;
	}
	
	

Note that we can do the scan with transaction as well

Range query between two arbitrary keys. These keys could be exact or partial keys

	
	byte[] sk = Encoding.UTF8.GetBytes("start partial key");
	byte[] ek = Encoding.UTF8.GetBytes("the end partial key");
	byte[] key, val;
	
	ResultSet rs = null;
	ScanFilter sf = new ScanFilter();
	
	//let's override the default way of scanning
	sf.skeyOp = ScanOperator.GT;
	sf.ekeyOp = ScanOperator.LT;
	sf.limitBy = ScanLimitBy.LimitResultSizeByte;
	sf.limit = 1*1024*1024;	//1 MB
	
	
	while(true)
	{
		rs = conn.Scan(sk, ek, sf);
		if(rs != null)
		{
			while(rs.HasNext())
			{
				if(!rs.GetNextKey(out key))
					Console.Write("error in getting key\n");
				if(!rs.GetNextVal(out val))
					Console.Write("error in getting val\n");
				rs.MoveNext();
			}
			
			if(!rs.MoreDataToCome())
			{
				rs.Clear();
				break;
			}
				
			sk = rs.LastEvaluatedKey();
			
			rs.Clear();
			
			//it should be GT as the key has already been included
			//sf.skeyOp = ScanOperator.GT;
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}

	

The same can be done with string keys, as follows:

	string sk = "start partial key";
	string ek = "the end partial key";
	byte[] key, val;
	
	ResultSet rs = null;
	ScanFilter sf = new ScanFilter();
	
	while(true)
	{
		rs = conn.Scan(sk, ek, sf);
		if(rs != null)
		{
			while(rs.HasNext())
			{
				if(!rs.GetNextKey(out key))
					Console.Write("error in getting key\n");
				if(!rs.GetNextVal(out val))
					Console.Write("error in getting val\n");
				rs.MoveNext();
			}
			
			if(!rs.MoreDataToCome())
			{
				rs.Clear();
				break;
			}
				
			sk = Encoding.Default.GetString(rs.LastEvaluatedKey());
			
			rs.Clear();
			
			//it should be GT as the key has already been included
			//sf.skeyOp = ScanOperator.GT;
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}
	
	

The same can be done with long keys, as follows:

Note that to work with a normal or wide table with long keys, set NormalKeyLong as the BangDBKeyType in the TableEnv.

	long sk = 10;
	long ek = 20;
	byte[] key, val;
	
	ResultSet rs = null;
	ScanFilter sf = new ScanFilter();
	
	while(true)
	{
		rs = conn.Scan(sk, ek, sf);
		if(rs != null)
		{
			while(rs.HasNext())
			{
				long rk = rs.GetNextKeyLong();
					
				if(!rs.GetNextVal(out val))
					Console.Write("error in getting val\n");
				rs.MoveNext();
			}
			
			if(!rs.MoreDataToCome())
			{
				rs.Clear();
				break;
			}
				
			sk = rs.LastEvaluatedKeyLong();
			
			rs.Clear();
			
			//it should be GT as the key has already been included
			//sf.skeyOp = ScanOperator.GT;
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}
	
	

Note that we can do the scan with transaction as well

Range query between two arbitrary keys. These keys could be exact or partial keys

	
	byte[] sk = "start partial key".getBytes();
	byte[] ek = "the end partial key".getBytes();
	byte[] key, val;
	
	ResultSet rs = null;
	ScanFilter sf = new ScanFilter();
	
	//let's override the default way of scanning
	sf.skeyOp = SCANOPERATOR.GT;
	sf.ekeyOp = SCANOPERATOR.LT;
	sf.limitBy = SCANLIMITBY.LIMIT_RESULT_SIZE;
	sf.limit = 1*1024*1024;	//1 MB
	
	while(true)
	{
		rs = conn.scan(sk, ek, sf);
		if(rs != null)
		{
			while(rs.hasNext())
			{
				rs.moveNext();
			}
			
			if(!rs.moreDataToCome())
			{
				rs.clear();
				break;
			}
				
			sk = rs.lastEvaluatedKey();
			
			rs.clear();
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}
	
	

The same can be done with string keys, as follows:

	String sk = "start partial key";
	String ek = "the end partial key";
	byte[] key, val;
	
	ResultSet rs = null;
	ScanFilter sf = new ScanFilter();
	
	while(true)
	{
		rs = conn.scan(sk, ek, sf);
		if(rs != null)
		{
			while(rs.hasNext())
			{
				rs.moveNext();
			}
			
			if(!rs.moreDataToCome())
			{
				rs.clear();
				break;
			}
				
			sk = new String(rs.lastEvaluatedKey());
			
			rs.clear();
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}

	

The same can be done with long keys, as follows:

	long sk = 10;
	long ek = 20;
	byte[] key, val;
	
	ResultSet rs = null;
	ScanFilter sf = new ScanFilter();
	
	while(true)
	{
		rs = conn.scan(sk, ek, sf);
		if(rs != null)
		{
			while(rs.hasNext())
			{
				rs.moveNext();
			}
			
			if(!rs.moreDataToCome())
			{
				rs.clear();
				break;
			}
				
			sk = rs.lastEvaluatedKeyLong();
			
			rs.clear();
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}

	

Note that we can do the scan with transaction as well

scan is supported for both Exthash and Btree. However, a range scan only makes sense for the Btree index type, because Btree stores the data in key order whereas a hash does not maintain any order.

Typical use of scan with Exthash

Here is how we can get all the keys and values from the db

	resultset *rs = NULL;
	scan_filter sf;
	FDT *sk = NULL;
	
	while(true)
	{
		rs = conn->scan(sk, NULL, &sf);
		if(rs)
		{
			while(rs->hasNext())
			{
				printf("%.*s, %.*s\n", rs->getNextKey()->length, (char*)rs->getNextKey()->data, rs->getNextVal()->length, (char*)rs->getNextVal()->data);
				rs->moveNext();
			}
			
			if(!rs->moreDataToCome())
			{
				rs->clear();
				delete rs;
				break;
			}
				
			if(sk)
				sk->free();
			delete sk;
			
			sk = rs->lastEvaluatedKey();
			
			rs->clear();
			delete rs;
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}
	
	if(sk)
		sk->free();
	delete sk;
	

The same can be done with char* keys, as follows:

	char *sk = NULL;
	
	while(true)
	{
		rs = conn->scan(sk, NULL, &sf);
		if(rs)
		{
			while(rs->hasNext())
			{
				printf("%s, %s\n", rs->getNextKeyStr(), rs->getNextValStr());
				rs->moveNext();
			}
			
			if(!rs->moreDataToCome())
			{
				rs->clear();
				delete rs;
				break;
			}
				
			if(sk)
				delete[] sk;
			
			FDT *lk = rs->lastEvaluatedKey();
			sk = (char*)lk->data;
			delete lk;
			
			rs->clear();
			delete rs;
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}

	//rs was already deleted inside the loop, so it must not be deleted again here
	if(sk)
		delete[] sk;
	
	

Note that we can do the scan with transaction as well

Here is how we can get all the keys and values from the db

	ResultSet rs = null;
	ScanFilter sf = new ScanFilter();
	byte[] sk = null;
	
	while(true)
	{
		rs = conn.Scan(sk, null, sf);
		if(rs != null)
		{
			while(rs.HasNext())
			{
				byte[] rk = rs.GetNextKey();
				byte[] vk = rs.GetNextVal();
				
				rs.MoveNext();
			}
			
			if(!rs.MoreDataToCome())
			{
				rs.Clear();
				break;
			}
			
			sk = rs.LastEvaluatedKey();
			
			rs.Clear();
			
			//it should be GT as the key has already been included
			//sf.skeyOp = ScanOperator.GT;
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}
	
	

The same can be done with string keys, as follows:

	string sk = null;
	
	while(true)
	{
		rs = conn.Scan(sk, null, sf);
		if(rs != null)
		{
			while(rs.HasNext())
			{
				byte[] rk = rs.GetNextKey();
				byte[] vk = rs.GetNextVal();
				
				rs.MoveNext();
			}
			
			if(!rs.MoreDataToCome())
			{
				rs.Clear();
				break;
			}
			
			sk = Encoding.Default.GetString(rs.LastEvaluatedKey());
			
			rs.Clear();
			
			//it should be GT as the key has already been included
			//sf.skeyOp = ScanOperator.GT;
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}
	
	

Note that we can do the scan with transaction as well

Here is how we can get all the keys and values from the db

	ResultSet rs = null;
	ScanFilter sf = new ScanFilter();
	byte[] sk = null;
	
	while(true)
	{
		rs = conn.scan(sk, null, sf);
		if(rs != null)
		{
			while(rs.hasNext())
			{
				byte[] rk = rs.getNextKey();
				byte[] vk = rs.getNextVal();
				
				rs.moveNext();
			}
			
			if(!rs.moreDataToCome())
			{
				rs.clear();
				break;
			}
			
			sk = rs.lastEvaluatedKey();
			
			rs.clear();
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}
	
	

The same can be done with string keys, as follows:

	String sk = null;
	
	while(true)
	{
		rs = conn.scan(sk, null, sf);
		if(rs != null)
		{
			while(rs.hasNext())
			{
				String rk = rs.getNextKeyStr();
				String vk = rs.getNextValStr();
				
				rs.moveNext();
			}
			
			if(!rs.moreDataToCome())
			{
				rs.clear();
				break;
			}
			
			sk = new String(rs.lastEvaluatedKey());
			
			rs.clear();
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}
	
	

Note that we can do the scan with transaction as well

More range-based queries with scan, e.g. select * from table

	//get all records
	resultset *rs = NULL;
	scan_filter sf;
	
	FDT *sk = NULL;
	FDT *ek = NULL;
	
	while(true)
	{
		rs = conn->scan(sk, ek, &sf);

		if(rs)
		{
			printf("scanned %d items\n", rs->count());
			
			while(rs->hasNext())
			{
				FDT *rk = rs->getNextKey();
				FDT *vk = rs->getNextVal();
				printf("%.*s, %.*s\n", rk->length, (char*)rk->data, vk->length, (char*)vk->data);
				rs->moveNext();
			}
			
			if(!rs->moreDataToCome())
			{
				rs->clear();
				delete rs;
				break;
			}
			
			if(sk)
				sk->free();
			delete sk;
			
			sk = rs->lastEvaluatedKey();
			
			rs->clear();
			delete rs;
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}
	
	if(sk)
		sk->free();
	if(ek)
		ek->free();
		
	delete sk;
	delete ek;
	

Similarly, we can set sk to NULL and ek to a non-NULL key, or vice versa.

	//get all records
	ResultSet rs = null;
	ScanFilter sf = new ScanFilter();
	
	string sk = null;
	string ek = null;
	
	string key, val;
	
	while(true)
	{
		rs = conn.Scan(sk, ek, sf);

		if(rs != null)
		{
			Console.Write("scanned {0} items\n", rs.Count());
			
			while(rs.HasNext())
			{
				rs.GetNextKey(out key);
				rs.GetNextVal(out val);
				Console.Write("key = {0} and val = {1}\n", key, val);
				rs.MoveNext();
			}
			
			if(!rs.MoreDataToCome())
			{
				rs.Clear();
				break;
			}
				
			sk = Encoding.Default.GetString(rs.LastEvaluatedKey());
			
			rs.Clear();
			
			//it should be GT as the key has already been included
			//sf.skeyOp = ScanOperator.GT;
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}
	
	

Similarly, we can set sk to null and ek to a non-null key, or vice versa.

	//get all records
	ResultSet rs = null;
	ScanFilter sf = new ScanFilter();
	
	String sk = null;
	String ek = null;
	
	String key, val;
	
	while(true)
	{
		rs = conn.scan(sk, ek, sf);

		if(rs != null)
		{
			System.out.print("scanned " + rs.count() + " items\n");
			
			while(rs.hasNext())
			{
				System.out.println("key = " + rs.getNextKeyStr() + " and val = " + rs.getNextValStr());
				rs.moveNext();
			}
			
			if(!rs.moreDataToCome())
			{
				rs.clear();
				break;
			}
				
			sk = new String(rs.lastEvaluatedKey());
			
			rs.clear();
		}
		else
			break;	//scan returned no resultset, stop instead of looping forever
	}
	
	

Similarly, we can set sk to null and ek to a non-null key, or vice versa.

Counting with scan, i.e. the number of items in a table

	//to count number of items between two keys without fetching the records
	FDT *sk = new FDT("key1", 4);
	FDT *ek = new FDT("keyn", 4);
	
	FILEOFF_T ncount = conn->count(sk, ek);
	
	//or, number of items in the db
	ncount = conn->count();
	
	//or number of keys greater than a particular key (may be partial)
	ncount = conn->count(sk, NULL);
	
	//or less than a key
	ncount = conn->count(NULL, ek);
	
	

Note that we can apply scan_filter here as well to set the GT/GTE and LT/LTE constraints, but limitby and limit are not applicable to count, as it returns the result in one shot.

	//to count number of items between two keys without fetching the records
	string sk = "key1";
	string ek = "keyn";
	
	long ncount = conn.Count(sk, ek);
	
	//or, number of items in the db
	ncount = conn.Count();
	
	//or number of keys greater than a particular key (may be partial)
	ncount = conn.Count(sk, null);
	
	//or less than a key
	ncount = conn.Count(null, ek);
	
	

Note that we can apply ScanFilter here as well to set the GT/GTE and LT/LTE constraints, but limitBy and limit are not applicable to Count, as it returns the result in one shot.

	//to count number of items between two keys without fetching the records
	String sk = "key1";
	String ek = "keyn";
	
	long ncount = conn.count(sk, ek, null);
	
	//or, number of items in the db
	ncount = conn.count();
	
	//or number of keys greater than a particular key (may be partial)
	ncount = conn.count(sk, null, null);
	
	//or less than a key
	ncount = conn.count(null, ek, null);
	
	

Note that we can apply ScanFilter here as well to set the GT/GTE and LT/LTE constraints, but limitBy and limit are not applicable to count, as it returns the result in one shot.