J. Craig Venter Institute
The J. Craig Venter Institute (JCVI), formed in October 2006 through the merger of several affiliated and legacy scientific research organizations, is a world leader in genomic research. Its president and founder, Dr. Venter, is best known for the pioneering work he and his team did in decoding the first draft of the human genome. Since its founding, JCVI has been engaged in some of the most fruitful and exciting research in the biological sciences.
Overview
One of JCVI's key initiatives is to unlock the secrets of the oceans by sampling, sequencing and analyzing the DNA of the microorganisms living there. The Sorcerer II Global Ocean Sampling Expedition has already uncovered more than 40 million new genes and thousands of new protein families from organisms found in sea water. The team uses a metagenomic approach in which genomic analysis (the analysis of all DNA in an organism) is conducted on the entire community of microbes within a sample, rather than isolating and culturing each individual microbial species. The samples are then analyzed against a database that contains previously decoded microbial genomes. Through this process, many new organisms are being identified, and their genomic information becomes part of the ever-expanding database of known organisms. JCVI researchers are using this information to better understand how these organisms evolved and how they are related to each other. The information can also lead to advances in human health, climate and environmental remediation, agriculture, and other important areas.
Challenge
Not surprisingly, genomic sequencing and analysis require capturing, storing and analyzing large data sets. To make the data accessible to researchers both within JCVI and in the scientific community at large, JCVI developed APIS, the Automated Phylogenomic Inference System. APIS is updated as the genomes of new organisms are completed, and it includes a number of database instances. Two public databases, previously stored in MySQL, were moved to Infobright to accommodate the growing number of users who wanted access to the data.
While it is still early in our use of Infobright, it has proven to be very efficient and easy to deploy.
- Michael Heaney, JCVI IT
Solution
Infobright, designed for fast queries against large data volumes, was an ideal choice for JCVI as it provided:
- 17:1 data compression: One database was 433GB in MySQL, and only 25GB of space in Infobright. The second database was compressed from 112GB of data in MySQL to 7GB in Infobright.
- Query speed improvements of 10x
- Low administrative burden on IT: No indexes, no need to change schemas as the data moved from MySQL to Infobright, easy to manage.
- Low cost: Today, Infobright is supporting an increasing number of researchers using a standard server, and running in a virtual machine with 2 CPUs and 8 GB RAM. Its high rate of compression means a significant decrease in the amount of storage needed to support the growing data volume.
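The 17:1 compression figure can be checked against the database sizes quoted above. The short Python sketch below is only an illustration of that arithmetic: the labels `database_1` and `database_2` and the helper `compression_ratio` are placeholder names chosen here, not names used by JCVI or Infobright.

```python
# Sizes in GB as quoted in the text: (size in MySQL, size in Infobright).
# The keys are hypothetical labels; the source does not name the databases.
databases = {
    "database_1": (433, 25),
    "database_2": (112, 7),
}


def compression_ratio(original_gb, compressed_gb):
    """Return how many times smaller the compressed copy is."""
    return original_gb / compressed_gb


# Ratio for each database on its own.
ratios = {
    name: compression_ratio(original, compressed)
    for name, (original, compressed) in databases.items()
}

# Ratio for both databases taken together.
total_original = sum(original for original, _ in databases.values())
total_compressed = sum(compressed for _, compressed in databases.values())
overall = compression_ratio(total_original, total_compressed)

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}:1")
print(f"overall: {overall:.1f}:1")
```

The first database works out to roughly 17.3:1 and the second to 16:1. Taken together, 545GB shrinks to 32GB, or about 17:1, which matches the headline figure.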
We don’t need to do a lot of manipulation of the system to get the performance we need, and the compression and performance are big advantages.
- Michael Heaney, JCVI IT