Manage libraries with %conda commands (legacy)
Important
This documentation has been retired and might not be updated. The products, services, or technologies mentioned in this content are no longer supported. See Notebook-scoped Python libraries.
Important
%conda commands are deprecated, and are supported only for Databricks Runtime 7.3 LTS ML. Databricks recommends using %pip for managing notebook-scoped libraries. If you require Python libraries that can only be installed using conda, you can use conda-based Docker containers to pre-install the libraries you need.
Anaconda Inc. updated their terms of service for anaconda.org channels in September 2020. Based on the new terms of service, you may require a commercial license if you rely on Anaconda’s packaging and distribution. See Anaconda Commercial Edition FAQ for more information. Your use of any Anaconda channels is governed by their terms of service.
As a result of this change, Databricks has removed the default channel configuration for the Conda package manager. This is a breaking change.
To install or update packages using the %conda command, you must specify a channel using -c. You must also update all usage of %conda install and %sh conda install to specify a channel using -c. If you do not specify a channel, conda commands will fail with PackagesNotFoundError.
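The channel requirement can be checked mechanically before a command is run. The following is a minimal sketch and is not part of Databricks or conda: has_channel is a hypothetical helper that scans a command string for conda's -c / --channel option, and the conda-forge channel and numpy package are placeholders chosen for illustration.

```python
import shlex


def has_channel(command: str) -> bool:
    """Return True if the command line passes a channel via -c or --channel."""
    tokens = shlex.split(command)
    for i, token in enumerate(tokens):
        # "-c NAME" or "--channel NAME": the option must be followed by a value.
        if token in ("-c", "--channel"):
            return i + 1 < len(tokens)
        # "--channel=NAME" form.
        if token.startswith("--channel=") and len(token) > len("--channel="):
            return True
    return False


print(has_channel("%conda install -c conda-forge numpy"))  # channel given
print(has_channel("%conda install numpy"))                 # no channel
```

A check like this only inspects the text of the command; it does not confirm that the named channel exists or contains the package.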
The %conda command is equivalent to the conda command and supports the same API with some restrictions noted below. The following sections contain examples of how to use %conda commands to manage your environment. For more information on installing Python packages with conda, see the conda install documentation.
Note that %conda magic commands are not available on the standard Databricks Runtime. They are available only on Databricks Runtime 7.3 LTS ML. Databricks recommends using pip to install libraries. For more information, see Understanding conda and pip.
If you must use both %pip and %conda commands in a notebook, see Interactions between pip and conda commands.
Note
The following conda commands are not supported when used with %conda:
activate
create
init
run
env create
env remove
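As an illustration of the list above, the sketch below flags a command whose subcommand is one of the unsupported ones. The helper name uses_unsupported_subcommand is hypothetical, and the parsing is deliberately simple: it assumes the subcommand immediately follows %conda or conda, with no global options in between.

```python
import shlex

# Subcommands the note lists as unsupported under %conda.
UNSUPPORTED = {
    ("activate",), ("create",), ("init",), ("run",),
    ("env", "create"), ("env", "remove"),
}


def uses_unsupported_subcommand(command: str) -> bool:
    """Return True if the command starts with a subcommand from the unsupported list."""
    tokens = shlex.split(command)
    # Drop the leading "%conda" or "conda" token if present.
    if tokens and tokens[0] in ("%conda", "conda"):
        tokens = tokens[1:]
    # Check both one-word ("create") and two-word ("env create") subcommands.
    return tuple(tokens[:1]) in UNSUPPORTED or tuple(tokens[:2]) in UNSUPPORTED


print(uses_unsupported_subcommand("%conda env create -f environment.yml"))
print(uses_unsupported_subcommand("%conda list"))
```

A False result means only that the command is not on this list, not that conda will accept it.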
List the Python environment of a notebook
To show the Python environment associated with a notebook, use %conda list:
%conda list
Interactions between pip and conda commands
To avoid conflicts, follow these guidelines when using pip or conda to install Python packages and libraries.
Libraries installed using the Libraries API or using the cluster UI are installed using pip. If any libraries have been installed from the API or the cluster UI, you should use only %pip commands when installing notebook-scoped libraries.
If you use notebook-scoped libraries on a cluster, init scripts run on that cluster can use either conda or pip commands to install libraries. However, if the init script includes pip commands, use only %pip commands in notebooks (not %conda).
It’s best to use either pip commands exclusively or conda commands exclusively. If you must install some packages using conda and some using pip, run the conda commands first, and then run the pip commands. For more information, see Using Pip in a Conda Environment.
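The guidelines above amount to a small decision rule plus an ordering rule. The sketch below encodes them under stated assumptions: allowed_magics and order_commands are hypothetical helpers, not Databricks APIs, and the package and channel names in the usage lines are placeholders.

```python
from typing import List


def allowed_magics(cluster_libraries_installed: bool, init_script_uses_pip: bool) -> List[str]:
    """Return the magic commands the guidelines permit, in the order to run them."""
    if cluster_libraries_installed or init_script_uses_pip:
        # Cluster libraries and pip-based init scripts both imply pip-only notebooks.
        return ["%pip"]
    # Otherwise both may be used; conda commands should run before pip commands.
    return ["%conda", "%pip"]


def order_commands(commands: List[str]) -> List[str]:
    """Stable sort that moves %conda commands ahead of all other commands."""
    return sorted(commands, key=lambda c: 0 if c.startswith("%conda") else 1)


print(allowed_magics(cluster_libraries_installed=True, init_script_uses_pip=False))
print(order_commands([
    "%pip install requests",
    "%conda install -c conda-forge numpy",
]))
```

Even when both commands are permitted, the text still recommends using one package manager exclusively where possible; the ordering helper covers only the fallback case.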
Frequently asked questions (FAQ)
How do libraries installed from the cluster UI/API interact with notebook-scoped libraries?
Libraries installed from the cluster UI or API are available to all notebooks on the cluster. These libraries are installed using pip; therefore, if libraries are installed using the cluster UI or API, use only %pip commands in notebooks.
How do libraries installed using an init script interact with notebook-scoped libraries?
Libraries installed using an init script are available to all notebooks on the cluster.
If you use notebook-scoped libraries on a cluster running Databricks Runtime ML, init scripts run on the cluster can use either conda or pip commands to install libraries. However, if the init script includes pip commands, then use only %pip commands in notebooks.
For example, this notebook code snippet generates a script that installs fast.ai packages on all the cluster nodes.
# Write the init script to DBFS; the final argument (True) overwrites any existing file.
dbutils.fs.put("dbfs:/home/myScripts/fast.ai", "conda install -c pytorch -c fastai fastai -y", True)
Can I use %pip and %conda commands in R or Scala notebooks?
Yes, in a Python magic cell.
Can I use %sh pip, !pip, or pip? What is the difference?
%sh and ! execute a shell command in a notebook; the former is a Databricks auxiliary magic command while the latter is a feature of IPython. pip is a shorthand for %pip when automagic is enabled, which is the default in Databricks Python notebooks.
On Databricks Runtime 11.0 and above, %pip, %sh pip, and !pip all install a library as a notebook-scoped Python library. On Databricks Runtime 10.4 LTS and below, Databricks recommends using only %pip or pip to install notebook-scoped libraries. The behavior of %sh pip and !pip is not consistent in Databricks Runtime 10.4 LTS and below.
Known issues
When you use %conda env update to update a notebook environment, the installation order of packages is not guaranteed. This can cause problems for the horovod package, which requires that tensorflow and torch be installed before horovod in order to use horovod.tensorflow or horovod.torch respectively. If this happens, uninstall the horovod package and reinstall it after ensuring that the dependencies are installed.
On Databricks Runtime 9.1 LTS, notebook-scoped libraries are incompatible with batch streaming jobs. Databricks recommends using cluster libraries or the IPython kernel instead.
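For the horovod issue, one way to confirm that the dependencies are present before reinstalling is to ask Python whether it can locate them. This is a minimal sketch using the standard library's importlib.util.find_spec; missing_modules is a hypothetical helper name. It confirms only that the modules can be found in the current environment, not the order in which they were installed.

```python
import importlib.util
from typing import List


def missing_modules(names: List[str]) -> List[str]:
    """Return the top-level module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# horovod.tensorflow needs tensorflow; horovod.torch needs torch.
missing = missing_modules(["tensorflow", "torch"])
if missing:
    print("Install these before reinstalling horovod:", ", ".join(missing))
else:
    print("Dependencies found; horovod can be reinstalled.")
```

If only one of the two frameworks is needed, pass just that name to the helper.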