Cloud-native Hello World for Bioinformatics
PART FIVE — Implementing scalable workflows on the public cloud with Terra
In part one, I presented the general rationale for building quick start examples (or ‘Hello World’) for bioinformatics tools.
In part two, I reviewed one such example for the CSIRO Bioinformatics VariantSpark library, runnable via smart templates in the AWS Marketplace.
In part three, I covered how to convert a locally runnable example to a reusable cloud example, by working with Google Cloud Platform custom Virtual Machine images running on Google Compute Engine.
In part four, I explored work I am doing with a research group. I covered customized training that I built to prepare researchers to work ‘cloud-natively’ using BigQuery, the serverless SQL service on GCP.
In this part five, I will explore current work I am doing with the Broad Institute of MIT and Harvard and other research partners. In particular, I’ll share work on driving adoption of their public cloud research platform, Terra.bio, through QuickStart examples.
What is Terra?
The Terra platform is part of the DataBiosphere project (architecture shown below). The vision for the DataBiosphere is as follows:
DataBiosphere is a vision for the next generation of biomedical research, a common desire to build an open, compatible, and secure approach to data within the larger research community.