MXNet

MXNet is a deep learning framework developed by multiple universities and companies. It is released under the Apache 2.0 license.

In the sections below, we provide guidance on installing MXNet on Databricks and give an example of running MXNet programs. See Integrating Deep Learning Libraries with Apache Spark for an example of integrating a deep learning library with Spark.

Note

This is not a comprehensive guide to MXNet. Please also refer to the MXNet website.

Install MXNet

MXNet can be installed as a regular Databricks library from PyPI. Different packages are available for different types of machines:

  • For GPU machines, use the CUDA 8.0-enabled build of MXNet, available as the mxnet-cu80 PyPI package. We strongly recommend the GPU version, which is much more scalable.
  • For CPU machines, use the CPU-only build of MXNet, available as the mxnet PyPI package. Warning: The CPU version is much less scalable than the GPU version, so if training fails on large datasets or models, we recommend switching to larger instance types.

See Libraries for more information on Databricks libraries.
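
The choice between the two packages can be sketched as a small helper that builds the PyPI requirement string for a given machine type. This is only an illustration: the function name mxnet_requirement and its parameters are hypothetical and are not part of MXNet or Databricks. Only the package names mxnet and mxnet-cu80 come from this guide.

```python
def mxnet_requirement(has_gpu, version=None):
    """Build a pip requirement string for MXNet (hypothetical helper).

    has_gpu: True for GPU machines (CUDA 8.0 build), False for CPU machines.
    version: optional version to pin, for example "0.9.3a3".
    """
    # Package names are the ones named in this guide.
    package = "mxnet-cu80" if has_gpu else "mxnet"
    # Pin the version only when one is given.
    return f"{package}=={version}" if version else package


print(mxnet_requirement(has_gpu=True, version="0.9.3a3"))  # mxnet-cu80==0.9.3a3
print(mxnet_requirement(has_gpu=False))                    # mxnet
```

The resulting string is what you would enter as the PyPI package name when creating the library.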

Note

Previous versions of MXNet required installation using an Init Script. This is no longer necessary!

For GPU installation, the latest version of MXNet requires the latest version of cuDNN, NVIDIA’s GPU-accelerated library for deep neural networks. This might cause problems for other libraries and applications that rely on CUDA. We strongly suggest installing version 0.9.3 instead (using pip install mxnet-cu80==0.9.3a3).
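
After attaching the library, it can be useful to confirm which MXNet package and version are actually installed. The following is a minimal sketch that uses only the Python standard library (importlib.metadata, available in Python 3.8 and later). The helper name installed_mxnet_versions is hypothetical, and the snippet assumes the library was installed under one of the two package names mentioned in this guide.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_mxnet_versions(packages=("mxnet", "mxnet-cu80")):
    """Return a dict mapping each installed package name to its version.

    Packages that are not installed are left out of the result.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Not installed in this environment; skip it.
            continue
    return found


print(installed_mxnet_versions())
```

An empty result means neither package is visible to the current Python environment.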

Use MXNet on a single node

To test and migrate single-machine MXNet workflows, you can start with a driver-only cluster on Databricks by setting the number of workers to zero. Although Apache Spark is not functional in this configuration, it is a cost-effective way to run single-machine MXNet workflows.
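
As a quick check that MXNet works on the driver, a small NDArray computation can be run in a notebook cell. This is a minimal sketch rather than a full workflow: the function name single_node_smoke_test is hypothetical, the computation runs on the CPU context, and the function returns None when MXNet is not installed instead of raising an error.

```python
def single_node_smoke_test():
    """Run a tiny MXNet computation on the driver (hypothetical helper).

    Returns the result as nested Python lists, or None if MXNet
    is not installed in this environment.
    """
    try:
        import mxnet as mx
    except ImportError:
        return None

    # Create a 2x3 array of ones on the CPU and apply simple arithmetic.
    a = mx.nd.ones((2, 3), ctx=mx.cpu())
    b = a * 2 + 1
    # Copy the result back to NumPy, then to plain Python lists.
    return b.asnumpy().tolist()


print(single_node_smoke_test())
```

If the call succeeds, every element of the result should be 3.0. On a GPU machine, the same check could be repeated with a GPU context.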