Send scoring requests to serving endpoints
In this article, you learn how to format scoring requests for your served model, and how to send those requests to the model serving endpoint. See Model serving with Databricks.
To score a deployed model, you can send a REST API request to the model URL or use the UI.
You can score a deployed model with the API by sending a POST request to this URI:
POST /serving-endpoints/{endpoint-name}/invocations
See Query individual models behind an endpoint for how to send requests for a specific model behind an endpoint.
Request format
Requests should be sent by constructing a JSON object with one of the supported keys and a JSON payload corresponding to the input format. The following is the recommended format.

The dataframe_split format is a JSON-serialized pandas DataFrame in the split orientation.
{
  "dataframe_split": {
    "index": [0, 1],
    "columns": ["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "petal width (cm)"],
    "data": [[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2]]
  }
}
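As a sketch, this payload can be produced directly from a pandas DataFrame with to_dict(orient="split"); the iris-style column names here are illustrative, not required.

```python
import json

import pandas as pd

# Illustrative input rows; substitute your model's actual feature columns.
df = pd.DataFrame(
    [[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2]],
    columns=["sepal length (cm)", "sepal width (cm)",
             "petal length (cm)", "petal width (cm)"],
)

# to_dict(orient="split") yields the index/columns/data structure shown above.
payload = json.dumps({"dataframe_split": df.to_dict(orient="split")})
```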
The dataframe_records format is a JSON-serialized pandas DataFrame in the records orientation. Although supported, this format is less commonly used.
Note

This format does not guarantee the preservation of column ordering, so the split format is preferred over the records format.
{
"dataframe_records": [
{
"sepal length (cm)": 5.1,
"sepal width (cm)": 3.5,
"petal length (cm)": 1.4,
"petal width (cm)": 0.2
},
{
"sepal length (cm)": 4.9,
"sepal width (cm)": 3,
"petal length (cm)": 1.4,
"petal width (cm)": 0.2
},
{
"sepal length (cm)": 4.7,
"sepal width (cm)": 3.2,
"petal length (cm)": 1.3,
"petal width (cm)": 0.2
}
]
}
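Likewise, assuming pandas is available, a records payload can be sketched from a DataFrame with to_dict(orient="records"); each row becomes one {column: value} object.

```python
import json

import pandas as pd

# Illustrative rows; column names stand in for your model's input schema.
df = pd.DataFrame(
    [[5.1, 3.5], [4.9, 3.0]],
    columns=["sepal length (cm)", "sepal width (cm)"],
)

# to_dict(orient="records") yields one dict per row, keyed by column name.
payload = json.dumps({"dataframe_records": df.to_dict(orient="records")})
```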
Tensors format
When your model expects tensors, as with a TensorFlow or PyTorch model, there are two supported format options for sending requests: instances and inputs.
If you have multiple named tensors per row, you must include one of each tensor for every row.
instances is a tensor-based format that accepts tensors in row format. Use this format if all the input tensors have the same 0-th dimension. Conceptually, each tensor in the instances list could be joined with the other tensors of the same name in the rest of the list to construct the full input tensor for the model, which is only possible if all of the tensors have the same 0-th dimension.

{"instances": [1, 2, 3]}
The following example shows how to specify multiple named tensors.

{
  "instances": [
    {
      "t1": "a",
      "t2": [1, 2, 3, 4, 5],
      "t3": [[1, 2], [3, 4], [5, 6]]
    },
    {
      "t1": "b",
      "t2": [6, 7, 8, 9, 10],
      "t3": [[7, 8], [9, 10], [11, 12]]
    }
  ]
}
inputs sends queries with tensors in columnar format. This format is required when the named input tensors do not all share the same 0-th dimension, because such inputs cannot be represented in the instances format.

{
  "inputs": {
    "t1": ["a", "b"],
    "t2": [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]],
    "t3": [[[1, 2], [3, 4], [5, 6]], [[7, 8], [9, 10], [11, 12]]]
  }
}
Response format
The response from the endpoint is in the following format. The output from your model is wrapped in a predictions key; the output is almost always a list, and frequently a list of numbers.
{
"predictions": [0,1,1,1,0]
}
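As a small sketch, extracting the model output from a response body of this shape is just a matter of reading the predictions key (the values here are illustrative):

```python
import json

# A response body in the shape shown above (values are illustrative).
raw = '{"predictions": [0, 1, 1, 1, 0]}'
predictions = json.loads(raw)["predictions"]
```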
Send scoring requests with the UI
Sending requests using the UI is the easiest and fastest way to test the model.
From the Serving endpoint page, select Query endpoint.
Insert the model input data in JSON format and click Send Request.
If the model has been logged with an input example, click Show Example to load the input example.
Send scoring requests with the API
You can send a scoring request through the REST API using standard Databricks authentication. The following examples demonstrate authentication using a personal access token.
Note
As a security best practice when you authenticate with automated tools, systems, scripts, and apps, Databricks recommends that you use OAuth tokens or personal access tokens belonging to service principals instead of workspace users. To create tokens for service principals, see Manage tokens for a service principal.
Given a MODEL_VERSION_URI like https://<databricks-instance>/model/iris-classifier/Production/invocations, where <databricks-instance> is the name of your Databricks instance, and a Databricks REST API token called DATABRICKS_API_TOKEN, the following are example snippets of how to score a served model.
Score a model accepting dataframe records input format.
curl -X POST -u token:$DATABRICKS_API_TOKEN $MODEL_VERSION_URI \
-H 'Content-Type: application/json' \
-d '{"dataframe_records": [
{
"sepal_length": 5.1,
"sepal_width": 3.5,
"petal_length": 1.4,
"petal_width": 0.2
}
]}'
Score a model accepting tensor inputs. Tensor inputs should be formatted as described in TensorFlow Serving’s API docs.
curl -X POST -u token:$DATABRICKS_API_TOKEN $MODEL_VERSION_URI \
-H 'Content-Type: application/json' \
-d '{"inputs": [[5.1, 3.5, 1.4, 0.2]]}'
import numpy as np
import pandas as pd
import requests

def create_tf_serving_json(data):
    # Wrap a dict of named arrays (or a single array) in the tensor "inputs" format.
    return {'inputs': {name: data[name].tolist() for name in data.keys()} if isinstance(data, dict) else data.tolist()}

def score_model(model_uri, databricks_token, data):
    headers = {
        "Authorization": f"Bearer {databricks_token}",
        "Content-Type": "application/json",
    }
    # DataFrames are sent as dataframe_records; arrays and dicts of arrays as tensor inputs.
    # Pass the dict to `json=` so requests serializes it once; pre-serializing with
    # json.dumps and passing the string to `json=` would double-encode the payload.
    data_json = {'dataframe_records': data.to_dict(orient='records')} if isinstance(data, pd.DataFrame) else create_tf_serving_json(data)
    response = requests.request(method='POST', headers=headers, url=model_uri, json=data_json)
    if response.status_code != 200:
        raise Exception(f"Request failed with status {response.status_code}, {response.text}")
    return response.json()

# Scoring a model that accepts pandas DataFrames
data = pd.DataFrame([{
    "sepal_length": 5.1,
    "sepal_width": 3.5,
    "petal_length": 1.4,
    "petal_width": 0.2
}])
score_model(MODEL_VERSION_URI, DATABRICKS_API_TOKEN, data)

# Scoring a model that accepts tensors
data = np.asarray([[5.1, 3.5, 1.4, 0.2]])
score_model(MODEL_VERSION_URI, DATABRICKS_API_TOKEN, data)
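For large inputs it can help to split the DataFrame and send several smaller requests, much like the 256-row batching used in the Power BI recipe that follows. A minimal sketch, with the network call stubbed out so the batching logic stands alone (score_fn stands in for a function like score_model above):

```python
import pandas as pd

def score_in_batches(score_fn, df, batch_size=256):
    """Score df in fixed-size batches and concatenate the predictions.

    score_fn is any callable that takes a DataFrame and returns a dict
    of the form {"predictions": [...]} -- e.g. a wrapper around score_model.
    """
    predictions = []
    for start in range(0, len(df), batch_size):
        batch = df.iloc[start:start + batch_size]
        predictions.extend(score_fn(batch)["predictions"])
    return predictions

# Example with a stub scorer that returns one 0 per input row.
df = pd.DataFrame({"x": range(10)})
stub = lambda batch: {"predictions": [0] * len(batch)}
result = score_in_batches(stub, df, batch_size=4)
```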
You can score a dataset in Power BI Desktop using the following steps:
Open the dataset you want to score.
Go to Transform Data.
Right-click in the left panel and select Create New Query.
Go to View > Advanced Editor.
Replace the query body with the code snippet below, after filling in an appropriate DATABRICKS_API_TOKEN and MODEL_VERSION_URI.

(dataset as table) as table =>
  let
    call_predict = (dataset as table) as list =>
      let
        apiToken = DATABRICKS_API_TOKEN,
        modelUri = MODEL_VERSION_URI,
        responseList = Json.Document(Web.Contents(modelUri,
          [
            Headers = [
              #"Content-Type" = "application/json",
              #"Authorization" = Text.Format("Bearer #{0}", {apiToken})
            ],
            Content = {"dataframe_records": Json.FromValue(dataset)}
          ]
        ))
      in
        responseList,
    predictionList = List.Combine(List.Transform(Table.Split(dataset, 256), (x) => call_predict(x))),
    predictionsTable = Table.FromList(predictionList, (x) => {x}, {"Prediction"}),
    datasetWithPrediction = Table.Join(
      Table.AddIndexColumn(predictionsTable, "index"), "index",
      Table.AddIndexColumn(dataset, "index"), "index")
  in
    datasetWithPrediction
Name the query with your desired model name.
Open the advanced query editor for your dataset and apply the model function.