Databricks Runtime 6.5 for Genomics

Databricks released this image in April 2020.

Databricks Runtime 6.5 for Genomics is a variant of Databricks Runtime 6.5 optimized for working with genomic and biomedical data. It is a component of the Databricks Unified Analytics Platform for Genomics.

For more information, including instructions for creating a Databricks Runtime for Genomics cluster, see Databricks Runtime for Genomics. For more information on developing genomics applications, see Genomics.

New features

Databricks Runtime 6.5 for Genomics is built on top of Databricks Runtime 6.5. For information on what’s new in Databricks Runtime 6.5, see the Databricks Runtime 6.5 release notes.

Improvements

Hail versions earlier than 0.2.33 had a bug that prevented users from running multiple Hail notebooks on the same cluster. We fixed this issue in open source Hail and upgraded the installed version to 0.2.33.
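To confirm that a given environment carries the fix, one option is to compare its Hail version string against 0.2.33. The following is a minimal sketch, not an official Databricks or Hail utility: the helper names parse_version and has_multi_notebook_fix are hypothetical, and the sketch assumes version strings begin with a dotted numeric part, optionally followed by a suffix such as a commit hash.

```python
import re

# Hail release that, per the note above, includes the multi-notebook fix.
FIXED_IN = (0, 2, 33)


def parse_version(version):
    """Return the leading dotted numeric part of a version string as a tuple of ints.

    Hypothetical helper: anything after the numeric part (for example a
    "-<commit hash>" suffix) is ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def has_multi_notebook_fix(version):
    """Return True if the given Hail version is 0.2.33 or later."""
    return parse_version(version) >= FIXED_IN
```

In a notebook where Hail is installed, the string returned by hl.version() could be passed to has_multi_notebook_fix, assuming it follows the format described above.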

Libraries

The following libraries included in Databricks Runtime 6.5 for Genomics differ from those included in Databricks Runtime 6.5.

Library       Version
ADAM          0.30.0
Hadoop-bam    7.9.2
Hail          0.2.33
GATK          4.0.11.0
samtools      1.9
VEP           96