In this article, you learn how to include custom libraries or libraries from a private mirror server when you log your model, so that you can use them with Model Serving model deployments. You should complete the steps detailed in this guide after you have a trained ML model ready to deploy but before you create a Databricks Model Serving endpoint.
Model development often requires the use of custom Python libraries that contain functions for pre- or post-processing, custom model definitions, and other shared utilities. In addition, many enterprise security teams encourage the use of private PyPI mirrors, such as Nexus or Artifactory, to reduce the risk of supply-chain attacks. Databricks offers native support for installing custom libraries and libraries from a private mirror in the Databricks workspace.
To ensure your library is available to your notebook, install it using %pip install. The %pip command installs the library in the current notebook and downloads the dependency onto the cluster.
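For example, a notebook cell that installs a custom wheel from a Unity Catalog volume might look like the following (the wheel path is a placeholder):

```
%pip install /Volumes/path/to/dependency.whl
```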
The guidance in this section is not required if you install the private library by pointing to a custom PyPI mirror.
After you install the library and upload the Python wheel file to either Unity Catalog volumes or DBFS, include the following code in your script. In the pip_requirements parameter, specify the path of your dependency file.
mlflow.sklearn.log_model(model, "sklearn-model", pip_requirements=["scikit-learn", "numpy", "/Volumes/path/to/dependency.whl"])
For DBFS, use the following:
mlflow.sklearn.log_model(model, "sklearn-model", pip_requirements=["scikit-learn", "numpy", "/dbfs/path/to/dependency.whl"])
If you have a custom library, you must manually specify all Python libraries associated with your model when you configure logging. You can do so with the conda_env parameter in log_model().
If using DBFS, be sure to include a forward slash, /, before your dbfs path when logging pip_requirements. Learn more about DBFS paths in How to work with files on Databricks.
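To illustrate these path rules, the following sketch uses a small hypothetical helper (not part of MLflow or Databricks) that assembles a pip_requirements list and rejects wheel paths that are missing the leading forward slash:

```python
def build_pip_requirements(wheel_path, base=("scikit-learn", "numpy")):
    """Assemble a pip_requirements list for log_model().

    Unity Catalog volume paths start with /Volumes/ and DBFS paths start
    with /dbfs/ -- both require the leading forward slash.
    """
    if not (wheel_path.startswith("/dbfs/") or wheel_path.startswith("/Volumes/")):
        raise ValueError("wheel path must start with /dbfs/ or /Volumes/")
    return list(base) + [wheel_path]

# A DBFS wheel path with the required leading slash:
print(build_pip_requirements("/dbfs/path/to/dependency.whl"))
```

The resulting list can be passed directly as the pip_requirements argument of log_model().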
from mlflow.utils.environment import _mlflow_conda_env

conda_env = _mlflow_conda_env(
    additional_conda_deps=None,
    additional_pip_deps=["/volumes/path/to/dependency"],
    additional_conda_channels=None,
)

mlflow.pyfunc.log_model(..., conda_env=conda_env)
MLflow provides the add_libraries_to_model() utility to log your model with all of its dependencies pre-packaged as Python wheels. This packages your custom libraries alongside the model in addition to all other libraries that are specified as dependencies of your model. This guarantees that the libraries used by your model are exactly the ones accessible from your training environment.
In the following example, model_uri references the model registry using the syntax models:/<model-name>/<model-version>. When you use the model registry URI, this utility generates a new version under your existing registered model.
import mlflow.models.utils

mlflow.models.utils.add_libraries_to_model(<model-uri>)
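To show how such a registry URI decomposes, here is a small hypothetical helper (for illustration only, not part of MLflow) that splits a models:/<model-name>/<model-version> URI into its name and version parts:

```python
def parse_registry_uri(model_uri):
    """Split a 'models:/<model-name>/<model-version>' URI.

    Hypothetical helper for illustration; MLflow resolves these
    URIs internally when you call add_libraries_to_model().
    """
    scheme, _, path = model_uri.partition(":/")
    if scheme != "models" or "/" not in path:
        raise ValueError("expected models:/<model-name>/<model-version>")
    name, _, version = path.rpartition("/")
    return name, version

print(parse_registry_uri("models:/my-sklearn-model/1"))  # ('my-sklearn-model', '1')
```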
When a new model version with the packages included is available in the model registry, you can add this model version to an endpoint with Model Serving.