Share models across workspaces

Databricks supports sharing models across multiple workspaces. For example, you can develop and log a model in your own workspace and then register it in a centralized model registry. This is useful when multiple teams share access to models or when your organization has multiple workspaces to handle the different stages of development.

In such situations, Databricks recommends creating a dedicated workspace to hold the centralized model registry, with an account for each user who needs access. This includes data scientists who log and register models and production users who manage and deploy them, as shown in the example multi-workspace setup in the figure.

[Figure: Multiple workspaces sharing a centralized model registry]

Access to the centralized registry is controlled by tokens. Each user or script that needs access creates a personal access token in the centralized registry and copies that token into the secret manager of their local workspace. Each API request sent to the centralized registry workspace must include the access token; MLflow provides a simple mechanism to specify the secrets to be used when performing model registry operations.

All client and fluent API methods for model registry are supported for remote workspaces.

Requirements

Using a model registry across workspaces requires the MLflow Python client, release 1.11.0 or above.

Set up the API token for a remote registry

  1. In the model registry workspace, create an access token.
  2. In the local workspace, create secrets to store the access token and the remote workspace information:
    1. Create a secret scope: databricks secrets create-scope --scope <scope>.
    2. Pick a unique name for the target workspace, shown here as <prefix>. Then create three secrets:
      • databricks secrets put --scope <scope> --key <prefix>-host : Enter the hostname of the model registry workspace. For example, https://cust-success.cloud.databricks.com/.
      • databricks secrets put --scope <scope> --key <prefix>-token : Enter the access token from the model registry workspace.
      • databricks secrets put --scope <scope> --key <prefix>-workspace-id : Enter the workspace ID of the model registry workspace, which you can find in the URL of any page in that workspace.
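The same secrets can be created programmatically instead of with the CLI. The following is a minimal sketch using the Databricks SDK for Python; it assumes the databricks-sdk package is installed and authenticated against your local workspace, and the scope and prefix names are hypothetical:

```python
def registry_secret_keys(prefix: str) -> list:
    """The three secret names the databricks://<scope>:<prefix> URI resolves."""
    return [f"{prefix}-host", f"{prefix}-token", f"{prefix}-workspace-id"]

def store_registry_secrets(scope, prefix, host, token, workspace_id):
    # Deferred import: requires the databricks-sdk package and workspace auth.
    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()
    w.secrets.create_scope(scope=scope)  # equivalent to `databricks secrets create-scope`
    for key, value in zip(registry_secret_keys(prefix), [host, token, workspace_id]):
        w.secrets.put_secret(scope=scope, key=key, string_value=value)
```

Calling store_registry_secrets("modelregistry", "central", host, token, workspace_id) would create the same scope and three secrets as the CLI steps above.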

Note

You may want to share the secret scope with other users, since there is a limit on the number of secret scopes per workspace.

Specify a remote registry

Based on the secret scope and name prefix you created for the remote registry workspace, you can construct a registry URI of the form:

registry_uri = f'databricks://<scope>:<prefix>'
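For example, with a hypothetical secret scope named modelregistry and prefix central, the URI resolves as:

```python
# Hypothetical scope and prefix; substitute the names you created earlier.
scope = "modelregistry"
prefix = "central"
registry_uri = f"databricks://{scope}:{prefix}"
# -> "databricks://modelregistry:central"
```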

You can use the URI to specify a remote registry for fluent API methods by first calling:

mlflow.set_registry_uri(registry_uri)

Or, you can specify it explicitly when you instantiate an MlflowClient:

client = MlflowClient(registry_uri=registry_uri)

The following workflows show examples of both approaches.

Register a model in the remote registry

One way to register a model is to use the mlflow.register_model API:

mlflow.set_registry_uri(registry_uri)
mlflow.register_model(model_uri=f'runs:/<run_id>/<artifact_path>', name=model_name)

Examples for other model registration methods can be found in the notebook at the end of this page.
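As a rough sketch of one such alternative (not taken from this page), a model version can also be created explicitly through the MlflowClient API; run_id, artifact_path, model_name, and registry_uri are placeholders:

```python
def runs_source_uri(run_id: str, artifact_path: str) -> str:
    """Build the runs:/ URI that points at a logged model artifact."""
    return f"runs:/{run_id}/{artifact_path}"

def register_with_client(run_id, artifact_path, model_name, registry_uri):
    # Deferred import: requires mlflow 1.11.0 or above.
    from mlflow.tracking import MlflowClient

    client = MlflowClient(registry_uri=registry_uri)
    client.create_registered_model(model_name)  # raises if the name already exists
    return client.create_model_version(
        name=model_name,
        source=runs_source_uri(run_id, artifact_path),
        run_id=run_id,
    )
```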

Note

Registering a model in a remote workspace creates a temporary copy of the model artifacts in DBFS in the remote workspace. You may want to delete this copy once the model version is in READY status. The temporary files can be found under the /dbfs/databricks/mlflow/tmp-external-source/<run_id> folder.

You can also specify a tracking_uri to point to an MLflow Tracking service in another workspace, in the same way as registry_uri. This means you can take a run from a remote workspace and register its model in the current or another remote workspace.
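For instance, the following sketch copies a model logged in one remote workspace into the registry of another; the scope and prefix names ("dev", "central") are hypothetical:

```python
# Both URIs follow the databricks://<scope>:<prefix> secret convention.
scope = "modelregistry"
tracking_uri = f"databricks://{scope}:dev"      # remote MLflow Tracking workspace
registry_uri = f"databricks://{scope}:central"  # centralized registry workspace

def copy_run_model(run_id, artifact_path, model_name):
    import mlflow  # deferred import: requires mlflow 1.11.0 or above

    mlflow.set_tracking_uri(tracking_uri)
    mlflow.set_registry_uri(registry_uri)
    return mlflow.register_model(f"runs:/{run_id}/{artifact_path}", model_name)
```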

Use a model from the remote registry

You can load and use a model version in a remote registry with mlflow.<flavor>.load_model methods by first setting the registry URI:

mlflow.set_registry_uri(registry_uri)
model = mlflow.pyfunc.load_model(f'models:/<model_name>/Staging')
model.predict(...)

Or, you can explicitly specify the remote registry in the models:/ URI:

model = mlflow.pyfunc.load_model(f'models://<scope>:<prefix>@databricks/<model_name>/Staging')
model.predict(...)

Other helper methods for accessing the model files are also supported, such as:

client.get_latest_versions(model_name)
client.get_model_version_download_uri(model_name, version)

Manage a model in the remote registry

You can perform any action on models in the remote registry as long as you have the required permissions. For example, if you have Can Manage permissions on a model, you can transition a model version stage or delete the model using MlflowClient methods:

client = MlflowClient(tracking_uri=None, registry_uri=registry_uri)
client.transition_model_version_stage(model_name, version, 'Archived')
client.delete_registered_model(model_name)

Notebook

Centralized Model Registry example notebook
