Keras is a high-level deep learning framework originally developed as part of the research project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System) and now hosted on GitHub as an open source project. Keras is released under the MIT license.

Keras does not perform low-level tensor operations itself; it calls into a lower-level deep learning library, referred to as the backend. The supported backends are TensorFlow, Theano, and CNTK.
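
As a minimal sketch of how one of these libraries can be chosen: multi-backend Keras reads the KERAS_BACKEND environment variable when it is first imported, and otherwise uses the backend named in its keras.json configuration file. The helper name select_backend below is hypothetical and not part of Keras, and the chosen backend library still has to be installed separately.

```python
import os

# Backend names accepted by multi-backend Keras (TensorFlow, Theano, CNTK).
SUPPORTED_BACKENDS = ("tensorflow", "theano", "cntk")


def select_backend(name):
    """Request a Keras backend by setting KERAS_BACKEND (hypothetical helper).

    This must run before the first `import keras`, because Keras reads the
    variable only at import time.
    """
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            "unsupported backend %r; expected one of %s"
            % (name, ", ".join(SUPPORTED_BACKENDS))
        )
    os.environ["KERAS_BACKEND"] = name
    return name


select_backend("tensorflow")
# A later `import keras` in the same Python process picks up this choice.
```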

In the sections below, we provide guidance on installing Keras on Databricks and give an example of running Keras programs. See Integrating Deep Learning Libraries with Apache Spark for an example of integrating a deep learning library with Spark.


This guide is not a comprehensive guide to Keras. For full documentation, refer to the Keras website.

Install Keras


Keras is included in Databricks Runtime ML (Beta), a machine learning runtime that provides a ready-to-go environment for machine learning and data science. Instead of installing Keras using the instructions below, you can simply create a cluster using Databricks Runtime ML. See Databricks Runtime ML (Beta).

Keras can be installed as a Databricks library from PyPI. Use the keras PyPI package.
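
Once the library is attached to a cluster, a notebook cell can confirm that the package is visible to Python. The following is a minimal sketch using the standard library's importlib.metadata, which assumes the cluster runs Python 3.8 or later; the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a PyPI package, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("keras")
if version is None:
    print("keras is not installed in this environment")
else:
    print("keras version:", version)
```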

For TensorFlow versions 1.1 and higher, Keras is bundled with the TensorFlow package under tf.contrib.keras (and, from TensorFlow 1.4 onward, under tf.keras). For TensorFlow-backed Keras workflows, installing TensorFlow alone is therefore a viable option. However, most existing documentation and tutorials assume that Keras is a stand-alone package, so it is often easier to install Keras from PyPI.
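
Code that needs to run in either kind of environment can try the stand-alone package first and fall back to a copy bundled with TensorFlow. The sketch below shows one way to do this; it is not a Databricks or Keras API. The helper first_importable is hypothetical, and the candidate module paths are assumptions about what the installed versions provide.

```python
import importlib

# Candidate import paths, tried in order: the stand-alone PyPI package first,
# then the copies bundled with TensorFlow. Which of these exist depends on the
# installed versions, so treat the list as an assumption.
KERAS_CANDIDATES = ("keras", "tensorflow.keras", "tensorflow.contrib.keras")


def first_importable(candidates):
    """Return the first module in `candidates` that imports, else None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


keras = first_importable(KERAS_CANDIDATES)
if keras is None:
    print("No Keras installation found")
else:
    print("Using Keras from module:", keras.__name__)
```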

Use Keras on a single node

To test and migrate single-machine Keras workflows, you can start with a driver-only cluster on Databricks by setting the number of workers to zero. Although Apache Spark is not functional in this configuration, it is a cost-effective way to run single-machine Keras workflows.
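
As a small illustration of a single-machine workflow that could run on such a driver-only cluster, the sketch below fits a one-layer Keras model to synthetic data. The toy dataset, the function names make_toy_data and train_toy_model, and all hyperparameters are made up for illustration. The code assumes that Keras, a backend, and NumPy are installed, and imports them inside the training function so that the data helper can be used on its own.

```python
import random


def make_toy_data(n_samples, seed=0):
    """Generate points in the unit square, labelled 1 when x + y > 1."""
    rng = random.Random(seed)
    features = [[rng.random(), rng.random()] for _ in range(n_samples)]
    labels = [1 if x + y > 1.0 else 0 for x, y in features]
    return features, labels


def train_toy_model(n_samples=1000, epochs=5):
    """Fit a logistic-regression-style Keras model on the toy data."""
    # Imported here because they are only needed for training.
    import numpy as np
    from keras.layers import Dense, Input
    from keras.models import Model

    features, labels = make_toy_data(n_samples)
    x = np.array(features, dtype="float32")
    y = np.array(labels, dtype="float32").reshape(-1, 1)

    # A single sigmoid unit on two inputs: logistic regression.
    inputs = Input(shape=(2,))
    outputs = Dense(1, activation="sigmoid")(inputs)
    model = Model(inputs=inputs, outputs=outputs)
    model.compile(optimizer="sgd", loss="binary_crossentropy",
                  metrics=["accuracy"])

    # Everything runs in the driver's Python process; no Spark workers are used.
    model.fit(x, y, epochs=epochs, batch_size=32, verbose=0)
    return model
```

Calling train_toy_model() in a notebook cell attached to the cluster returns the fitted model.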