Databricks Runtime 7.4 for Genomics (Unsupported)
Databricks released this image in November 2020.
Databricks Runtime 7.4 for Genomics is a version of Databricks Runtime 7.4 (Unsupported) optimized for working with genomic and biomedical data. It is a component of the Databricks Unified Analytics Platform for Genomics.
Note
Databricks Runtime for Genomics is deprecated. Databricks is no longer building new Databricks Runtime for Genomics releases and will remove support for Databricks Runtime for Genomics on September 24, 2022, when Databricks Runtime for Genomics 7.3 LTS support ends. At that point Databricks Runtime for Genomics will no longer be available for selection when you create a cluster. For more information about the Databricks Runtime deprecation policy and schedule, see Supported Databricks runtime releases and support schedule. Bioinformatics libraries that were part of the runtime have been released as Docker containers, which you can find on the ProjectGlow Docker Hub page.
For more information, including instructions for creating a Databricks Runtime for Genomics cluster, see Databricks Runtime for Genomics (Deprecated). For more information on developing genomics applications, see Genomics guide.
New features
Databricks Runtime 7.4 for Genomics is built on top of Databricks Runtime 7.4. For information on what’s new in Databricks Runtime 7.4, see the Databricks Runtime 7.4 (Unsupported) release notes.
GloWGR for binary traits
GloWGR can now fit whole genome regression models for binary traits.
Logistic regression function accepts offset parameter
The logistic_regression_gwas function now accepts an offset parameter. This parameter is equivalent to a feature with a fixed coefficient of 1. Both the likelihood ratio test and Firth penalized likelihood ratio test respect this parameter. The output of GloWGR should be passed as an offset.
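To make the "fixed coefficient of 1" description concrete, the following is a minimal NumPy sketch of a logistic regression fit with an offset. It does not call logistic_regression_gwas and is not Glow's implementation. The helper fit_logistic, the simulated genotypes, and the random offset (a stand-in for whole genome predictions) are all hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_logistic(X, y, offset=None, n_iter=25):
    """Newton-Raphson fit of logistic regression with an optional per-sample offset."""
    n, p = X.shape
    offset = np.zeros(n) if offset is None else np.asarray(offset, dtype=float)
    beta = np.zeros(p)
    for _ in range(n_iter):
        # The offset is added to the linear predictor as-is: its coefficient
        # is fixed at 1 and is never updated.
        eta = X @ beta + offset
        mu = 1.0 / (1.0 + np.exp(-eta))
        weights = mu * (1.0 - mu)
        score = X.T @ (y - mu)
        hessian = X.T @ (X * weights[:, None])
        beta = beta + np.linalg.solve(hessian, score)
    return beta

# Simulated data (assumption: toy values, not real genotypes or phenotypes).
rng = np.random.default_rng(0)
n = 2000
genotype = rng.integers(0, 3, size=n).astype(float)  # toy 0/1/2 genotype calls
X = np.column_stack([np.ones(n), genotype])          # intercept + genotype
offset = rng.normal(size=n)                          # stand-in for whole genome predictions
true_beta = np.array([-0.5, 0.4])
prob = 1.0 / (1.0 + np.exp(-(X @ true_beta + offset)))
y = (rng.random(n) < prob).astype(float)

beta_with_offset = fit_logistic(X, y, offset=offset)
beta_no_offset = fit_logistic(X, y)
print("with offset:   ", beta_with_offset)
print("without offset:", beta_no_offset)
```

Only the intercept and genotype coefficients are estimated. The offset shifts each sample's linear predictor but has no coefficient of its own to fit, which is how a feature with a coefficient pinned at 1 behaves.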
Hail support
Databricks Runtime 7.4 for Genomics is the first release in the 7.x line to package support for Hail.
Improvements
GloWGR convenience functions
The RidgeRegression and LogisticRegression classes in GloWGR now provide a transform_loco function to generate leave-one-chromosome-out (LOCO) predictions. In addition, GloWGR now includes a reshape_for_gwas function to reshape the predictions from GloWGR into a form that the association tests in Glow can accept.
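The leave-one-chromosome-out idea can be sketched in plain NumPy. This is a simplified illustration and not GloWGR's algorithm or the transform_loco API: it uses a single closed-form ridge regression, and the helpers ridge_fit and loco_predictions, the chromosome labels, and the simulated data are all hypothetical.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression coefficients."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def loco_predictions(X, y, chromosomes, alpha=1.0):
    """Map each chromosome to predictions fitted without that chromosome's variants."""
    predictions = {}
    for chrom in np.unique(chromosomes):
        keep = chromosomes != chrom  # drop every variant on the held-out chromosome
        beta = ridge_fit(X[:, keep], y, alpha)
        predictions[str(chrom)] = X[:, keep] @ beta
    return predictions

# Simulated data (assumption: toy values for illustration only).
rng = np.random.default_rng(1)
n_samples = 50
chromosomes = np.array(["1", "1", "2", "2", "2", "3"])  # one label per variant (column)
X = rng.normal(size=(n_samples, chromosomes.size))
y = X @ rng.normal(size=chromosomes.size) + rng.normal(size=n_samples)

loco = loco_predictions(X, y, chromosomes)
print({chrom: pred.shape for chrom, pred in loco.items()})
```

Each entry holds one prediction per sample, built without any variant from the held-out chromosome. When a variant on that chromosome is later tested for association, the prediction used to adjust for the rest of the genome therefore does not already contain that variant's signal.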