09
Aug

CEO Blog: Hey Vertica, Here are our Weaknesses

It came to my attention that we were getting hits on our website with the search term "infobright weaknesses" that were originating from you at Vertica. We could have saved you the trouble - you could have just called us and asked. We are pretty transparent in our dealings. So that said, I thought I would take the opportunity to openly respond to you, and anyone else who may be asking the same question. So here goes:

  • We don't do anything that looks like transactional processing, but that's not what columnar databases are for. You already know that.
  • We are not designed to be an enterprise data warehouse. As such, if a customer needs to maintain hundreds of tables in a star or snowflake schema with lots of complex joins, we will not be a good choice. In fact, we have encountered several instances where prospects were looking for just that, and we pointed them to you, or ParAccel, or Greenplum, or Sybase IQ. While I see these various solutions as having different strengths and weaknesses, all do claim to be engineered to support enterprise data warehouses.
  • We don't typically do all that well if the records inside the tables are updated more than occasionally. While we do support DML, we do not support in-place updates, so things like inventory data and other dynamic information tend to be better suited to EDW tools.
  • We <currently> don't have any individual implementations that exceed around 50TB in a single instance. While we are used in applications that entail over a Petabyte, no single instance is over around 50TB. While we feel we have significant headroom beyond this, I know the Petabyte range is a particular strength for you as well as the others I mentioned. Again, we have steered, on occasion, prospects in your direction as a result.

So those are the main things. Since you were asking about what we did not do well, I thought you might be equally interested in where we do very well...

We provide an ideal environment for storing, retrieving, and analyzing machine generated data. This means data like Web logs, event logs, call detail records, sensor data, financial transactions, gaming data, etc. Now, you (all) provide a nice environment for this as well. I think for pre-planned queries, most of you can do as well as us, sometimes even better. Sometimes. But that is usually not as true for real investigative analytics, where ad hoc queries are involved. Not that you are bad there. Most of you are still pretty good. The thing is, we do what we do at a fraction of the operational cost of most of the alternatives. That's really the key for us. Specifically:

  • Unlike most alternatives, we don't impose an administrative burden for the setup and maintenance of the environment. No indexing or projections. No balancing or partitions. So the labor cost associated with establishing and maintaining Infobright is exceptionally low.
  • We don't require nearly as much storage because we get extremely aggressive disk compression. You claim extreme compression, and tout 5:1 to 10:1 compression on your website. Sybase claims 3:1 to 4:1, and Greenplum claims 3:1 to 10:1. And while I would argue that the architecture behind some of these claims makes the claims themselves highly questionable, just taking the claims at face value still paints a picture far below what Infobright does. We provide (or "claim", to be consistent) between 10:1 and 40:1 compression. I have never, ever, encountered a customer who claimed to get better compression with an alternative. That said, all of the instances were customers with machine generated data, which is where we excel, and not coincidentally, where we focus. We have been told that the savings associated with the reduction in hardware costs alone often pay for Infobright.
  • We have exceptionally fast load times. We had one customer say that Vertica was "very fast, much better than the rest". I replied, "So they were faster than us?", to which he said "No, just the rest".
  • We co-exist well. We have a great capability of interacting with Big Data elements like Hadoop and NoSQL variants. Some of you do as well.
  • We provide innovation for investigative analytics for Big Data. Much of this is tied to Rough Set math as well, but we are breaking new ground as people begin to figure out that HOW you analyze Big Data is often different from how you work with more traditional sources. We will do more and more here, which is exciting for both us and our customers.
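
To put the compression claims quoted above into concrete terms, here is a rough back-of-the-envelope sketch. The ratios are the vendor claims cited in this post (none of them independently verified here), and the 50 TB raw volume is a hypothetical figure chosen only for illustration. The function names are likewise made up for this example.

```python
# Illustrative arithmetic only. The ratios are the vendor claims quoted in the
# post; the 50 TB raw data volume is an assumed example figure.
RAW_TB = 50.0

# (low, high) claimed compression ratios, expressed as N:1
CLAIMED_RATIOS = {
    "Infobright": (10, 40),
    "Vertica": (5, 10),
    "Sybase": (3, 4),
    "Greenplum": (3, 10),
}


def compressed_size_tb(raw_tb, ratio):
    """Disk footprint in TB for raw_tb of data at an N:1 compression ratio."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return raw_tb / ratio


def footprint_range(raw_tb, low_ratio, high_ratio):
    """Return (smallest, largest) footprint in TB across the claimed range."""
    return (
        compressed_size_tb(raw_tb, high_ratio),
        compressed_size_tb(raw_tb, low_ratio),
    )


for vendor, (low, high) in CLAIMED_RATIOS.items():
    smallest, largest = footprint_range(RAW_TB, low, high)
    print(f"{vendor:<11} {low}:1 to {high}:1 -> "
          f"{smallest:.2f} to {largest:.2f} TB on disk")
```

Taking every claim at face value, 50 TB of raw data would occupy 1.25 to 5 TB at 10:1 to 40:1, compared with 5 to 10 TB at 5:1 to 10:1. Actual ratios depend heavily on the data itself.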

There are probably a few other weaknesses, and probably a few other strengths, but the list above really encapsulates where we are good, and where we are not. Our architecture, which is based on a very specific set of mathematics called "Granular Computing" and designed to leverage a specific environment, is both a strength and a weakness. If the business challenge is not in storing and retrieving and analyzing machine generated data, then it's a weakness. If it is in storing and retrieving and analyzing machine generated data, then it's a huge strength. It allows our customers to do a lot for a little. In a down economy, where expenses are truly meaningful and scrutinized, and where the explosion of Big Data, and in particular of machine generated data, is unmistakable, we seem to make a lot of sense to a lot of organizations. You do as well. We are happy for your success. It's just different than ours.

Call us if we can answer any more questions.

Good luck.

Don


In the case that you describe, we should indeed be a solid match. And with respect to the snowflake schema and complex joins, it has simply not been an item of great focus for us; we have preferred to focus on the wide fat schemas that describe most machine generated data.

Author: Susan Davis
Date: 08/11/11

You say above, “if a customer needs to maintain hundreds of tables in a star or snowflake schema with lots of complex joins, we will not be a good choice.”

Do the “snowflake schema” and “lots of complex joins” cause the most loss of advantage?

Would a data warehouse presentation database with a pure star schema, no snowflakes, no complex joins, no inserts, no updates, and no deletes (i.e., only loads) retain the massive advantages of Infobright, even if it had hundreds of fact and dimension tables?

Author: Alan Musnikow
Date: 08/10/11
