In the Databricks Runtime for Machine Learning, the Conda package manager is used to install Python packages.
All Python packages are installed inside a single environment: /databricks/python2 on clusters using Python 2, or /databricks/python3 on clusters using Python 3. Switching to (or activating) a different Conda environment is not supported.
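As a small illustration of the path convention described here, the following sketch maps a Python major version to the corresponding environment path. The function name env_path_for_python is hypothetical and not part of Databricks Runtime ML; only the two paths come from this topic.

```shell
# Sketch: map a Python major version to the environment path named in this topic.
# env_path_for_python is a hypothetical helper, not a Databricks command.
env_path_for_python() {
  case "$1" in
    2) echo "/databricks/python2" ;;
    3) echo "/databricks/python3" ;;
    *) echo "unsupported Python major version: $1" >&2; return 1 ;;
  esac
}
```

For example, env_path_for_python 3 prints /databricks/python3.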
You can call the conda command inside a notebook to install a Python package on the driver (master) node of a cluster running Databricks Runtime ML.
For some libraries, you may need to detach and reattach your notebook before you can import a newly installed Python module.
%sh /databricks/conda/bin/conda install -y -p /databricks/python astropy
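If you want to confirm that the package actually landed in the environment, one option is to follow the install with conda list. The sketch below wraps both steps in a function. The function name install_and_verify and the CONDA_BIN and ENV_PREFIX variables are assumptions introduced for illustration; their defaults are the paths used in the command above.

```shell
# Sketch: install a package into the default environment, then confirm it is listed.
# CONDA_BIN and ENV_PREFIX are illustrative variables; their defaults are the
# paths used in this topic. Override them to try the function elsewhere.
CONDA_BIN="${CONDA_BIN:-/databricks/conda/bin/conda}"
ENV_PREFIX="${ENV_PREFIX:-/databricks/python}"

install_and_verify() {
  local pkg="$1"
  # Fail early with a readable message if conda is not where we expect it.
  if [ ! -x "$CONDA_BIN" ]; then
    echo "conda not found at $CONDA_BIN" >&2
    return 1
  fi
  # Install without prompting, targeting the environment by prefix.
  "$CONDA_BIN" install -y -p "$ENV_PREFIX" "$pkg" || return 1
  # List only packages whose name matches exactly.
  "$CONDA_BIN" list -p "$ENV_PREFIX" "^${pkg}\$"
}

# Usage (run inside a notebook cell that starts with %sh):
#   install_and_verify astropy
```

Like the one-line command, this affects only the driver node.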
The easiest way to use Conda to install a package on all cluster nodes is to call conda inside an init script.
In your init script, activate the default environment and install packages using conda install:
#!/bin/bash
set -ex
/databricks/python/bin/python -V
. /databricks/conda/etc/profile.d/conda.sh
conda activate /databricks/python
conda install -y astropy
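To install several packages at once, you can pass them all to a single conda install call. The following sketch generates an init script with the same steps as the one above for an arbitrary package list. The helper name write_init_script and the output path are hypothetical. Where you store the file and how you attach it to a cluster depends on your workspace setup and is not covered here.

```shell
# Sketch: write an init script that installs several packages in one conda call.
# write_init_script is a hypothetical helper; the first argument is the output
# file and the remaining arguments are package names.
write_init_script() {
  local out="$1"
  shift
  # The heredoc is unquoted so that $* expands to the package list.
  cat > "$out" <<EOF
#!/bin/bash
set -ex
/databricks/python/bin/python -V
. /databricks/conda/etc/profile.d/conda.sh
conda activate /databricks/python
conda install -y $*
EOF
  chmod +x "$out"
}

# Usage:
#   write_init_script ./install-packages.sh astropy numpy
```

A single conda install call lets Conda resolve the dependencies of all listed packages together, which is usually preferable to installing them one at a time.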