Databricks supports sharing models across multiple workspaces. For example, you can develop and log a model in your own workspace and then register it in a centralized model registry. This is useful when multiple teams share access to models or when your organization has multiple workspaces to handle the different stages of development.
In such situations, Databricks recommends creating a dedicated workspace to hold the centralized model registry, with an account for each user who needs access. This includes data scientists who log and register models and production users who manage and deploy models, as shown in the example multi-workspace setup in the figure.
Access to the centralized registry is controlled by tokens. Each user or script that needs access creates a personal access token in the centralized registry and copies that token into the secret manager of their local workspace. Each API request sent to the centralized registry workspace must include the access token; MLflow provides a simple mechanism to specify the secrets to be used when performing model registry operations.
Using a model registry across workspaces requires the MLflow Python client, release 1.11.0 or above.
This workflow is implemented by logic in the MLflow client. Ensure that the environment running the client can make network requests to the Databricks workspace containing the centralized model registry. A common restriction on the registry workspace is an IP allow list, which can block connections from MLflow clients running on a cluster in another workspace.
In the model registry workspace, create an access token.
In the local workspace, create secrets to store the access token and the remote workspace information:
Create a secret scope:
databricks secrets create-scope --scope <scope>.
Pick a unique name for the target workspace, shown here as <prefix>. Then create three secrets:
databricks secrets put --scope <scope> --key <prefix>-host: Enter the hostname of the model registry workspace. For example,
databricks secrets put --scope <scope> --key <prefix>-token: Enter the access token from the model registry workspace.
databricks secrets put --scope <scope> --key <prefix>-workspace-id: Enter the workspace ID for the model registry workspace, which can be found in the URL of any page.
You may want to share the secret scope with other users, since there is a limit on the number of secret scopes per workspace.
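As a quick check on the steps above, the following sketch lists the three key names expected for a given prefix and reports any that are missing from a scope. The helper names are hypothetical and not part of MLflow or Databricks. In a notebook, the existing keys could come from dbutils.secrets.list, which appears only in a comment because it is not available outside Databricks.

```python
from typing import Iterable, List

# Suffixes of the three secrets created above for each remote workspace.
SECRET_SUFFIXES = ("host", "token", "workspace-id")


def expected_secret_keys(prefix: str) -> List[str]:
    """Return the key names expected for this prefix (hypothetical helper)."""
    return [f"{prefix}-{suffix}" for suffix in SECRET_SUFFIXES]


def missing_secret_keys(prefix: str, present_keys: Iterable[str]) -> List[str]:
    """Return the expected keys that are absent from a scope's key listing."""
    present = set(present_keys)
    return [key for key in expected_secret_keys(prefix) if key not in present]


# In a Databricks notebook (assumption: "registry-scope" and "prod" are
# placeholder names for <scope> and <prefix>):
#   present = [s.key for s in dbutils.secrets.list("registry-scope")]
#   print(missing_secret_keys("prod", present))
```

An empty result suggests all three secrets exist under the prefix; it does not confirm that their values are correct.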
Based on the secret scope and name prefix you created for the remote registry workspace, you can construct a registry URI of the form:
registry_uri = f'databricks://<scope>:<prefix>'
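If you build this URI in several places, a small helper can keep the format consistent. The sketch below is illustrative only: build_registry_uri is a hypothetical name, and rejecting the characters ":", "/", and "@" is an assumption made here to keep the resulting URI unambiguous, not a documented Databricks rule.

```python
def build_registry_uri(scope: str, prefix: str) -> str:
    """Build a registry URI of the form databricks://<scope>:<prefix>."""
    for label, value in (("scope", scope), ("prefix", prefix)):
        if not value:
            raise ValueError(f"{label} must not be empty")
        # Assumption: these characters would make the URI ambiguous to parse.
        if any(ch in value for ch in ":/@"):
            raise ValueError(f"{label} must not contain ':', '/' or '@'")
    return f"databricks://{scope}:{prefix}"


# Placeholder names standing in for <scope> and <prefix>.
registry_uri = build_registry_uri("registry-scope", "prod")
print(registry_uri)
```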
You can use the URI to specify a remote registry for fluent API methods by first calling:
mlflow.set_registry_uri(registry_uri)
Or, you can specify it explicitly when you instantiate an MlflowClient:
client = MlflowClient(registry_uri=registry_uri)
The following workflows show examples of both approaches.
One way to register a model is to use the mlflow.register_model API:
mlflow.set_registry_uri(registry_uri)
mlflow.register_model(model_uri=f'runs:/<run_id>/<artifact_path>', name=model_name)
Examples for other model registration methods can be found in the notebook at the end of this page.
Registering a model in a remote workspace creates a temporary copy of the model artifacts in DBFS in the remote workspace. You may want to delete this copy once the model version is in READY status. The temporary files can be found under the
You can also specify a tracking_uri to point to an MLflow Tracking service in another workspace, in a similar manner to registry_uri. This means you can take a run on a remote workspace and register its model in the current or another remote workspace.
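Before cleaning up temporary artifacts or loading a newly registered version, you may want to wait until the version reports READY. The following is a minimal polling sketch, not part of MLflow: wait_until_ready is a hypothetical helper that takes any callable returning the current status string, and the timeout and poll interval are arbitrary defaults. The status names follow MLflow's model version statuses (PENDING_REGISTRATION, READY, FAILED_REGISTRATION).

```python
import time
from typing import Callable


def wait_until_ready(
    get_status: Callable[[], str],
    timeout_s: float = 300.0,
    poll_s: float = 2.0,
) -> None:
    """Poll get_status() until it returns READY, fails, or times out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == "READY":
            return
        if status == "FAILED_REGISTRATION":
            raise RuntimeError("model version registration failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status} after {timeout_s} seconds")
        time.sleep(poll_s)


# Usage with MLflow (requires mlflow and the URIs described above):
#   import mlflow
#   from mlflow.tracking import MlflowClient
#   mlflow.set_tracking_uri(tracking_uri)   # run lives in a remote workspace
#   mlflow.set_registry_uri(registry_uri)   # register in the central registry
#   mv = mlflow.register_model(
#       model_uri=f"runs:/<run_id>/<artifact_path>", name=model_name)
#   client = MlflowClient(registry_uri=registry_uri)
#   wait_until_ready(
#       lambda: client.get_model_version(model_name, mv.version).status)
```

Passing the status lookup as a callable keeps the polling logic independent of MLflow, so it can be exercised without a workspace connection.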
You can load and use a model version in a remote registry with mlflow.<flavor>.load_model methods by first setting the registry URI:
mlflow.set_registry_uri(registry_uri)
model = mlflow.pyfunc.load_model(f'models:/<model_name>/Staging')
model.predict(...)
Or, you can explicitly specify the remote registry in the models:/ URI:
model = mlflow.pyfunc.load_model(f'models://<scope>:<prefix>@databricks/<model_name>/Staging')
model.predict(...)
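The explicit form embeds the secret scope and prefix in the model URI itself. As an illustration, the hypothetical helper below assembles that URI from its parts, following the pattern shown above; the argument values are placeholders.

```python
def remote_model_uri(scope: str, prefix: str, model_name: str, stage_or_version: str) -> str:
    """Build models://<scope>:<prefix>@databricks/<model_name>/<stage or version>."""
    return f"models://{scope}:{prefix}@databricks/{model_name}/{stage_or_version}"


# Placeholder names standing in for <scope>, <prefix>, and <model_name>.
uri = remote_model_uri("registry-scope", "prod", "churn-model", "Staging")
print(uri)

# With mlflow installed and the secrets in place, the URI can be passed on:
#   model = mlflow.pyfunc.load_model(uri)
```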
Other helper methods for accessing the model files are also supported, such as:
client.get_latest_versions(model_name)
client.get_model_version_download_uri(model_name, version)
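get_latest_versions returns the latest version for each stage, so picking the single newest version takes one more step. The sketch below assumes each returned object exposes a version attribute that may be a string, which is why it compares as integers; newest_version is a hypothetical helper, and the stand-in objects only mimic the attributes used here.

```python
from types import SimpleNamespace


def newest_version(versions):
    """Return the entry with the highest version number."""
    versions = list(versions)
    if not versions:
        raise ValueError("no model versions were returned")
    # Compare numerically: as strings, "10" would sort before "9".
    return max(versions, key=lambda mv: int(mv.version))


# Stand-ins for the objects returned by client.get_latest_versions(model_name).
sample = [
    SimpleNamespace(version="9", current_stage="Production"),
    SimpleNamespace(version="10", current_stage="Staging"),
]
print(newest_version(sample).version)

# With a real client:
#   latest = newest_version(client.get_latest_versions(model_name))
#   uri = client.get_model_version_download_uri(model_name, latest.version)
```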
You can perform any action on models in the remote registry as long as you have the required permissions. For example, if you have Can Manage permissions on a model, you can transition a model version stage or delete the model using the MLflow client:
client = MlflowClient(tracking_uri=None, registry_uri=registry_uri)
client.transition_model_version_stage(model_name, version, 'Archived')
client.delete_registered_model(model_name)
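If a model has several versions, you might prefer to archive each one before deleting the model. The following is one possible sketch of that cleanup, assuming a client object with the MlflowClient methods search_model_versions, transition_model_version_stage, and delete_registered_model. The function name is hypothetical, and archiving first is a choice made for this example rather than a documented requirement.

```python
def archive_and_delete(client, model_name: str) -> int:
    """Archive every non-archived version, then delete the registered model.

    Returns the number of versions that were transitioned to Archived.
    """
    archived = 0
    for mv in client.search_model_versions(f"name='{model_name}'"):
        if mv.current_stage != "Archived":
            client.transition_model_version_stage(model_name, mv.version, "Archived")
            archived += 1
    # Deleting the registered model is irreversible; call this deliberately.
    client.delete_registered_model(model_name)
    return archived


# Usage (requires mlflow and Can Manage permissions on the model):
#   from mlflow.tracking import MlflowClient
#   client = MlflowClient(tracking_uri=None, registry_uri=registry_uri)
#   archive_and_delete(client, model_name)
```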