Databricks Runtime 7.2 for Genomics (Unsupported)

Databricks released this image in August 2020.

Databricks Runtime 7.2 for Genomics is a version of Databricks Runtime 7.2 (Unsupported) optimized for working with genomic and biomedical data. It is a component of the Databricks Unified Analytics Platform for Genomics.

For more information, including instructions for creating a Databricks Runtime for Genomics cluster and guidance on developing genomics applications, see the Genomics guide.

New features

Databricks Runtime 7.2 for Genomics is built on top of Databricks Runtime 7.2. For information on what’s new in Databricks Runtime 7.2, see the Databricks Runtime 7.2 (Unsupported) release notes.

Improvements

Accelerated conversion of NumPy ndarray literals

Literal NumPy ndarrays that are 1D or 2D and float-typed are now converted to Java arrays significantly faster. The Glow genome-wide association study (GWAS) documentation shows how such literals are used.
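As a minimal sketch of what "1D or 2D, float-typed" means in practice, the snippet below coerces input values to a float64 ndarray and checks the dimensionality before the array would be passed as a literal. The helper name as_float_literal and the sample values are hypothetical and are not part of Glow or Databricks Runtime. The release note does not say how integer-typed or higher-dimensional arrays are handled, so coercing to float64 up front is an assumption.

```python
import numpy as np


def as_float_literal(values):
    """Hypothetical helper: return `values` as a float64 ndarray.

    Raises ValueError unless the result is 1D or 2D, the shapes the
    release note describes.
    """
    arr = np.asarray(values, dtype=np.float64)
    if arr.ndim not in (1, 2):
        raise ValueError(f"expected a 1D or 2D array, got {arr.ndim}D")
    return arr


# Illustrative data only: one phenotype value per sample (1D) and a
# samples-by-covariates matrix (2D), both given as integers.
phenotypes = as_float_literal([2, 3, 4, 5])
covariates = as_float_literal([[1, 0], [1, 1], [1, 0], [1, 1]])
```

Arrays prepared this way have the dtype and shape that the note describes; how they are passed to a specific Glow function is covered in the Glow GWAS documentation.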

Libraries

The following sections list the libraries included in Databricks Runtime 7.2 for Genomics that differ from those included in Databricks Runtime 7.2.

Packaged libraries

Library       Version
ADAM          0.32.0
GATK          4.1.4.1
Hadoop-bam    7.9.2
samtools      1.9
VEP           96