In the sections below, we provide guidance on installing MXNet on Databricks and give an example of running MXNet programs. See Integrating Deep Learning Libraries with Apache Spark for an example of integrating a deep learning library with Spark.
This is not a comprehensive guide to MXNet. For full documentation, see the MXNet website.
MXNet can be installed as a Databricks library from PyPI.
- For GPU machines, use a GPU-enabled MXNet installation that matches the CUDA version on your cluster. You can check the installed CUDA version by running %sh cat /usr/local/cuda/version.txt in a notebook cell. For example, Databricks Runtime 4.0 and below include CUDA 8.0, so they should be used with the mxnet-cu80 library. We strongly recommend using the GPU version, which is much more scalable.
- For CPU machines, use the CPU-specific mxnet library from PyPI. We recommend using larger instance types if training fails for large datasets and models.
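The package choice described above can be sketched in a Python notebook cell. This is a minimal sketch, not an official Databricks or MXNet utility: the helper names mxnet_package_for and detect_mxnet_package are hypothetical, and the code assumes that version.txt contains a line such as "CUDA Version 8.0.61" and that GPU builds follow the mxnet-cu<major><minor> naming pattern seen in mxnet-cu80. Confirm that the resulting package actually exists on PyPI before installing it.

```python
import re
from pathlib import Path


def mxnet_package_for(version_text):
    """Map the contents of CUDA's version.txt to an MXNet PyPI package name.

    Hypothetical helper: assumes text like "CUDA Version 8.0.61" and the
    mxnet-cu<major><minor> naming pattern (for example, mxnet-cu80).
    """
    match = re.search(r"(\d+)\.(\d+)", version_text or "")
    if match is None:
        # No CUDA version found: fall back to the CPU-specific package.
        return "mxnet"
    major, minor = match.groups()
    return f"mxnet-cu{major}{minor}"


def detect_mxnet_package(path="/usr/local/cuda/version.txt"):
    """Read the CUDA version file if present and suggest a package name."""
    version_file = Path(path)
    if not version_file.is_file():
        # Typical for CPU-only machines, which have no CUDA installation.
        return "mxnet"
    return mxnet_package_for(version_file.read_text())


print(detect_mxnet_package())
```

On a cluster running CUDA 8.0 this would suggest mxnet-cu80, and on a machine without the version file it would suggest the CPU-specific mxnet package.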
To test and migrate single-machine MXNet workflows, you can start with a driver-only cluster on Databricks by setting the number of workers to zero. Though Apache Spark is not functional under this setting, it is a cost-effective way to run single-machine MXNet workflows.
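As a quick check that the library is usable on a driver-only cluster, a short NDArray computation can be run in a notebook cell. This is a minimal sketch: the function name mxnet_smoke_test is hypothetical, and it returns None instead of failing when MXNet is not installed.

```python
def mxnet_smoke_test():
    """Run a tiny single-machine MXNet computation on the driver.

    Hypothetical helper: returns None if MXNet is not installed.
    """
    try:
        import mxnet as mx
    except ImportError:
        return None
    a = mx.nd.ones((2, 3))   # 2x3 array of ones on the default (CPU) context
    b = a * 2 + 1            # element-wise arithmetic
    return b.asnumpy().tolist()


print(mxnet_smoke_test())
```

If the call returns a 2x3 nested list, the installed library can create and evaluate arrays on the driver without any Spark workers.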