Google Genomics

Google Cloud Compute for genomic data processing

Google Genomics is an effort to use the Cloud Compute platform to process massive genomic datasets in parallel. The project page highlights some simple applications of the Cloud platform:

  1. Explore genetic variation interactively: Compare entire cohorts in seconds with SQL-like queries. Compute transition/transversion ratios, genome-wide association, allelic frequency and more.

  2. Process big genomic data easily: Run batch analyses like principal component analysis and Hardy-Weinberg equilibrium on as many samples as you like, in minutes or hours, with just a little code.

  3. Use Google’s infrastructure and big data expertise: Store one genome or a million using Google Genomics and take advantage of the same infrastructure that powers Search, Maps, YouTube, Gmail and Drive.

  4. Support emerging global standards: Google Genomics is implementing the API defined by the Global Alliance for Genomics and Health for visualization, analysis and more. Compliant software can access Google Genomics, local servers, or any other implementation.
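
To make two of the statistics named in the list more concrete, here is a minimal local sketch in plain Python of a transition/transversion ratio and a Hardy-Weinberg equilibrium check. It does not use the Google Genomics API; the function names (titv_ratio, hardy_weinberg_chi2) and the toy inputs are assumptions made purely for illustration.

```python
# Illustrative only: hypothetical helper names, toy data, no Google Genomics calls.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True when the substitution stays within purines or within pyrimidines."""
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def titv_ratio(snvs):
    """Transition/transversion ratio for an iterable of (ref, alt) base pairs."""
    transitions = 0
    transversions = 0
    for ref, alt in snvs:
        if ref == alt:
            continue  # not a substitution, skip it
        if is_transition(ref, alt):
            transitions += 1
        else:
            transversions += 1
    if transversions == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return transitions / transversions


def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Return (allele frequency of A, chi-square statistic) for genotype counts.

    Expected counts under Hardy-Weinberg equilibrium are p^2*n, 2pq*n, q^2*n.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, chi2


if __name__ == "__main__":
    toy_snvs = [("A", "G"), ("C", "T"), ("A", "C")]
    print("Ti/Tv:", titv_ratio(toy_snvs))
    print("HWE (p, chi2):", hardy_weinberg_chi2(25, 50, 25))
```

On a hosted platform the same arithmetic would presumably be expressed as a query or batch job over a whole cohort rather than a loop over an in-memory list; the sketch only shows what is being computed.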

More documentation, including sample code for big-data tools like Storm, is available on GitHub.