Databricks Runtime for Health and Life Sciences

The Databricks Runtime for Health and Life Sciences (Databricks Runtime HLS) is a version of the Databricks Runtime optimized for working with genomic and biomedical data. It is a component of the Unified Analytics Platform for Genomics.

Important

Databricks Runtime HLS is currently in Beta. Interfaces and pricing are subject to change before general availability. Sign up for access.

What’s inside?

  • A fast, scalable DNASeq pipeline
  • Spark SQL optimizations for common query patterns
  • Hail 0.2 integration
  • Popular open source libraries, optimized for performance and reliability
    • ADAM
    • GATK
    • Hadoop-BAM
  • Reference data (GRCh37 or GRCh38, known SNP sites)