Migrate to Model Serving
This article demonstrates how to enable Model Serving on your workspace and switch your models from using Legacy MLflow Model Serving to the new Model Serving experience built on serverless compute.
Requirements
Registered model in the MLflow Model Registry.
Permissions on the registered models as described in the access control guide.
Significant changes
In Model Serving, the request and response formats for the endpoint differ slightly from those used by Legacy MLflow Model Serving. See Scoring a model endpoint for details on the new format.
In Model Serving, the endpoint URL includes serving-endpoints instead of model.
Model Serving includes full support for managing resources with API workflows.
Model Serving is production-ready and backed by the Databricks SLA.
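The URL and request-format changes listed above can be sketched in Python. This is a minimal illustration, not the documented protocol: the exact path layouts (a trailing /invocations, and the legacy model-name/version segments) and the dataframe_split wrapper key are assumptions, and the helper names are hypothetical. Check Scoring a model endpoint for the authoritative format.

```python
import json


def legacy_model_url(workspace_url, model_name, version):
    # Assumed legacy layout: the path contains "model".
    base = workspace_url.rstrip("/")
    return f"{base}/model/{model_name}/{version}/invocations"


def serving_endpoint_url(workspace_url, endpoint_name):
    # Assumed new layout: the path contains "serving-endpoints" instead.
    base = workspace_url.rstrip("/")
    return f"{base}/serving-endpoints/{endpoint_name}/invocations"


def dataframe_split_body(columns, rows):
    # Assumed new request shape: tabular input wrapped under a
    # "dataframe_split" key rather than sent as a bare payload.
    payload = {
        "dataframe_split": {
            "columns": list(columns),
            "data": [list(row) for row in rows],
        }
    }
    return json.dumps(payload)


if __name__ == "__main__":
    # Placeholder workspace URL and names, for illustration only.
    print(legacy_model_url("https://example.cloud.databricks.com", "churn", 3))
    print(serving_endpoint_url("https://example.cloud.databricks.com", "churn"))
    print(dataframe_split_body(["age", "plan"], [[42, "basic"]]))
```

Keeping URL construction and payload wrapping in small helpers like these confines the migration change in a client application to one place.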
Enable Model Serving for your workspace
To use Model Serving, your account admin must read and accept the terms and conditions in the account console.
No additional steps are required to enable Model Serving in your workspace.
Migrate served models to Model Serving
You can create a Model Serving endpoint and transition your model serving workflows to it gradually, without first disabling Legacy MLflow Model Serving.
The following steps show how to accomplish this with the UI. For each model on which you have Legacy MLflow Model Serving enabled:
Navigate to Serving endpoints on the sidebar of your machine learning workspace.
Follow the steps in UI workflow to create a serving endpoint with your model.
Transition your application to use the new URL provided by the serving endpoint to query the model, along with the new scoring format.
After your models are transitioned, navigate to Models on the sidebar of your machine learning workspace.
Select the model for which you want to disable Legacy MLflow Model Serving.
On the Serving tab, select Stop.
In the confirmation message that appears, select Stop Serving.
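The steps above use the UI. Because Model Serving also supports API workflows, the same create-and-query steps might be scripted roughly as follows. This is a sketch under stated assumptions: the /api/2.0/serving-endpoints path, the served_models field names, the workload size value, and the bearer-token header are not given in this article and should be verified against the Model Serving API reference. The function names are hypothetical. The code only builds the requests and does not send them.

```python
import json
import urllib.request


def build_create_endpoint_request(workspace_url, token, endpoint_name,
                                  model_name, model_version):
    # Assumed request body for creating a serving endpoint.
    body = {
        "name": endpoint_name,
        "config": {
            "served_models": [
                {
                    "model_name": model_name,
                    "model_version": str(model_version),
                    "workload_size": "Small",  # assumed size label
                    "scale_to_zero_enabled": True,
                }
            ]
        },
    }
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/api/2.0/serving-endpoints",  # assumed path
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_scoring_request(workspace_url, token, endpoint_name, columns, rows):
    # Assumed scoring call against the new endpoint URL and request format.
    body = {"dataframe_split": {"columns": list(columns), "data": list(rows)}}
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/serving-endpoints/{endpoint_name}/invocations",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send either request, pass it to urllib.request.urlopen and read the JSON response. Stop Legacy MLflow Model Serving for a model only after the application has been confirmed to work against the new endpoint.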