CEO Blog: When Rough is Wicked Cool
We announced Infobright 4.0 today to an exceptionally warm reception. We are quite proud of the many advances in this very significant step forward for our company. Amid the fanfare around our "DomainExpert" technology and the huge leverage of our Hadoop integration (which is anything but a "me-too" offering), the significance of our "Rough Query" is worth noting. I would not call it merely "cool"; that would be an understatement. As one person put it in a preliminary briefing, "It's wicked cool!" Really. When we previewed it with our beta customers and analysts, they all picked up on it immediately. So what is it?
Infobright is based on "Granular Computing" and "Rough-Set Mathematics". These two concepts underpin our "Knowledge Grid architecture", which works especially well with machine-generated data, delivers extremely aggressive disk compression, and requires almost no administrative overhead. There is no setting up of projections or indexes, and no tuning or balancing as you would have to do with traditional databases. This is a function of the Knowledge Grid architecture, which automatically creates a metadata layer containing pointers to the physical data as well as information ABOUT the data. That information is used to speed up queries, sometimes eliminating the need to touch the data at all. Again, all of this is fundamentally based on granular computing and rough-set mathematics.
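To make the idea concrete, here is a toy sketch of that metadata layer. The function name, pack size, and statistics kept are illustrative assumptions, not Infobright internals; the point is that the layer is built automatically as data is loaded, with nothing for an administrator to define.

```python
# Toy illustration (assumed names, not Infobright's actual code) of the
# Knowledge Grid idea: as rows are loaded, a metadata layer of per-pack
# statistics is created automatically -- no indexes or projections to set up.
def build_knowledge_grid(values, pack_size=4):
    """Group values into fixed-size packs and record min/max/count for each."""
    grid = []
    for start in range(0, len(values), pack_size):
        pack = values[start:start + pack_size]
        grid.append({
            "pack": start // pack_size,
            "min": min(pack),
            "max": max(pack),
            "count": len(pack),
        })
    return grid

loads = [3, 7, 2, 9, 41, 44, 40, 43, 8, 5]
print(build_knowledge_grid(loads))
# pack 0 spans values 2..9, pack 1 spans 40..44, pack 2 spans 5..8
```

A query engine can consult these tiny summaries, which fit in memory, before deciding whether it ever needs to read the (compressed) data itself.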
Our 4.0 release introduces the ability to perform "Rough Queries". Rough Query is designed to get you from cold to warm to hot when searching within large datasets - without the tremendous overhead and time normally associated with getting there. This type of "investigative analytics", or data mining, is generally a sequence of ad hoc queries, narrowing and filtering the data until an exact answer is determined. Almost all efforts to date have focused on how to get the exact answer faster. But in many, many instances, the time to execute an investigative query against a very large dataset will be much longer than you want to wait. What Rough Query provides is the ability to drill down into the data instantaneously to narrow the search. It does this using only the metadata in the Knowledge Grid - and as that metadata is in memory, the query returns in less than a second. It's like saying, "I want to find James Smith in New York," and the result saying, "OK, only look in SoHo or Tribeca." By ruling out 95% of the places to look, the search quickly homes in on the exact answer.
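The mechanism behind that narrowing can be sketched in a few lines. This is a simplified model under assumed names, not our implementation: given only per-pack min/max metadata, a range predicate classifies each pack as fully matching, impossible, or "maybe" - and only the "maybe" packs would ever need to be decompressed and scanned.

```python
# Toy sketch (assumed names, not Infobright internals) of metadata-only
# "rough" filtering: a range predicate is answered against per-pack
# min/max statistics without reading any actual data.
from dataclasses import dataclass

@dataclass
class PackMeta:
    """Min/max statistics kept for one pack of rows."""
    pack_id: int
    min_val: int
    max_val: int

def rough_filter(packs, lo, hi):
    """Classify each pack for the predicate lo <= value <= hi.

    'relevant'   - every row in the pack matches; no scan needed
    'irrelevant' - no row can match; the pack is never read
    'suspect'    - some rows may match; only these need decompression
    """
    result = {"relevant": [], "irrelevant": [], "suspect": []}
    for p in packs:
        if p.max_val < lo or p.min_val > hi:
            result["irrelevant"].append(p.pack_id)
        elif lo <= p.min_val and p.max_val <= hi:
            result["relevant"].append(p.pack_id)
        else:
            result["suspect"].append(p.pack_id)
    return result

packs = [PackMeta(0, 1, 100), PackMeta(1, 150, 190), PackMeta(2, 90, 160)]
print(rough_filter(packs, 120, 200))
# pack 0 is ruled out, pack 1 fully matches, only pack 2 must be scanned
```

Because the classification touches nothing but in-memory metadata, it returns almost instantly regardless of how large the underlying table is - which is what makes each drill-down step of an investigative session effectively free.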
There is a much longer and more technical explanation of why this works. But the net result is that data mining, especially over large datasets, can be tremendously enhanced by this capability. And as one person we showed it to said, "That's wicked cool."