Group by – BangDB = NoSQL + AI + Stream

Group by


As the name suggests, this defines a group-by (gpby) computation whose result is always ready to be consumed. The db updates every stream computation for every single event, which enables continuously running, real-time queries.

"gpby":[{"gpat":["a", "b"], "iatr":"c", "stat":1, "gran":3600, "kysz":48}, {"gpat":["a"], "iatr":"d", "stat":2, "gran":86400, "kysz":32}]
Let's look at each one at a time:

{"gpat":["a", "b"], "iatr":"c", "stat":1, "gran":3600, "kysz":48}
This tells the db to group by (a, b) for the input attribute c. "kysz":48 sets the key size to 48 (the max length of the gpby name), "gran":3600 sets the granularity to one hour, and "stat" dictates what to compute; for example, "stat":1 means count. With this, the db keeps computing the count of c grouped by (a, b) every hour.

For example, if we replace a with category id, b with page id and c with visitor id, then this gpby scheme says "count the number of visitors grouped by category and page every hour".
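The counting semantics above can be sketched in a few lines of Python. This is a hypothetical simulation of the behavior, not BangDB's API: each event updates a running count keyed by the group-by attributes plus the time bucket implied by the granularity.

```python
from collections import defaultdict

# Hypothetical sketch (not BangDB's API) of the effect of
# {"gpat":["a","b"], "iatr":"c", "stat":1, "gran":3600}:
# count events grouped by (a, b), bucketed per hour.
GRAN = 3600  # granularity in seconds (one hour)
counts = defaultdict(int)

def on_event(event, now):
    """Update the running count for this event's (a, b) group and hour bucket."""
    bucket = int(now // GRAN) * GRAN          # start of the hour the event falls in
    key = (event["a"], event["b"], bucket)    # e.g. (category, page, hour)
    counts[key] += 1                          # "stat":1 -> simple count

# Example: three page views within the same hour
t0 = 1_700_000_000
on_event({"a": "cat1", "b": "page1", "c": "visitor1"}, t0)
on_event({"a": "cat1", "b": "page1", "c": "visitor2"}, t0 + 10)
on_event({"a": "cat1", "b": "page2", "c": "visitor1"}, t0 + 20)

bucket = int(t0 // GRAN) * GRAN
print(counts[("cat1", "page1", bucket)])  # 2
print(counts[("cat1", "page2", bucket)])  # 1
```

Because the count is updated per event, a query at any moment just reads the current bucket value; there is no batch aggregation step.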

{"gpat":["a"], "iatr":"d", "stat":2, "gran":86400, "kysz":32}
This one is similar, except "stat":2 tells the db to compute a unique count.

So, for example, it could compute the unique visitor count every day.
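A minimal sketch of the unique-count case, again hypothetical and not BangDB's API: instead of a counter, each (group, bucket) key holds a set, so duplicate values of the input attribute are counted once. (A production engine would likely use a probabilistic structure such as HyperLogLog rather than an exact set.)

```python
from collections import defaultdict

# Hypothetical sketch (not BangDB's API) of the effect of
# {"gpat":["a"], "iatr":"d", "stat":2, "gran":86400}:
# unique count of d grouped by a, bucketed per day.
GRAN = 86400  # granularity in seconds (one day)
uniques = defaultdict(set)

def on_event(event, now):
    """Record this event's d value in its (a, day-bucket) group."""
    bucket = int(now // GRAN) * GRAN
    uniques[(event["a"], bucket)].add(event["d"])  # "stat":2 -> unique count

t0 = 1_700_000_000
on_event({"a": "cat1", "d": "visitor1"}, t0)
on_event({"a": "cat1", "d": "visitor1"}, t0 + 60)   # duplicate visitor, same day
on_event({"a": "cat1", "d": "visitor2"}, t0 + 120)

bucket = int(t0 // GRAN) * GRAN
print(len(uniques[("cat1", bucket)]))  # 2
```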
Note: it is self-evident but worth emphasising that all of these computations happen at run time, for every single event. This enables continuous, run-time pattern identification (such as anomaly detection), and it also means queries return data instantly: no post-processing is needed because the data is already in a ready-to-be-consumed state. This is true for all other computations in BangDB as well.