Deploy and query a feature serving endpoint
This article shows how to deploy and query a feature serving endpoint step by step using the Databricks SDK. Some steps can also be completed using the REST API or the Databricks UI; references to the documentation for those methods are included where applicable.
In this example, you have a table of cities with their locations (latitude and longitude) and a recommender app that takes into account the user’s current distance from those cities. Because the user’s location changes constantly, the distance between the user and each city must be calculated at the time of inference. This tutorial illustrates how to perform those calculations with low latency using Databricks Online Tables and Databricks Feature Serving. For the full set of example code, see the example notebook.
Step 1. Create the source table
The source table contains precomputed feature values and can be any Delta table in Unity Catalog with a primary key. In this example, the table contains a list of cities with their latitude and longitude. The primary key is destination_id. Sample data is shown below, followed by a sketch of how such a table might be created.
name | destination_id (pk) | latitude | longitude
---|---|---|---
Nashville, Tennessee | 0 | 36.162663 | -86.7816
Honolulu, Hawaii | 1 | 21.309885 | -157.85814
Las Vegas, Nevada | 2 | 36.171562 | -115.1391
New York, New York | 3 | 40.712776 | -74.005974
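If you need to create a source table like this one, the following is a minimal sketch using Spark SQL in a notebook. The constraint name and table properties are assumptions chosen for this example; change data feed is enabled because triggered or continuous online table sync typically requires it:

# Minimal sketch: create and populate the source table in Unity Catalog.
# The constraint name and change data feed property are assumptions for this example.
spark.sql("""
CREATE TABLE IF NOT EXISTS main.on_demand_demo.location_features (
  name STRING,
  destination_id INT NOT NULL,
  latitude DOUBLE,
  longitude DOUBLE,
  CONSTRAINT location_features_pk PRIMARY KEY (destination_id)
)
TBLPROPERTIES (delta.enableChangeDataFeed = true)
""")

spark.sql("""
INSERT INTO main.on_demand_demo.location_features VALUES
  ('Nashville, Tennessee', 0, 36.162663, -86.7816),
  ('Honolulu, Hawaii', 1, 21.309885, -157.85814),
  ('Las Vegas, Nevada', 2, 36.171562, -115.1391),
  ('New York, New York', 3, 40.712776, -74.005974)
""")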
Step 2. Create an online table
An online table is a read-only copy of a Delta table that is optimized for online access. For more information, see Use online tables for real-time feature serving.
To create an online table, you can use the UI (see Create an online table using the UI), the REST API, or the Databricks SDK, as in the following example:
from pprint import pprint
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.catalog import *
import mlflow

workspace = WorkspaceClient()

# Create an online table
feature_table_name = "main.on_demand_demo.location_features"
online_table_name = "main.on_demand_demo.location_features_online"

spec = OnlineTableSpec(
    primary_key_columns=["destination_id"],
    source_table_full_name=feature_table_name,
    run_triggered=OnlineTableSpecTriggeredSchedulingPolicy.from_dict({'triggered': 'true'}),
    perform_full_copy=True,
)

# Ignore the error raised if the online table already exists.
try:
    online_table_pipeline = workspace.online_tables.create(name=online_table_name, spec=spec)
except Exception as e:
    if "already exists" in str(e):
        pass
    else:
        raise e

pprint(workspace.online_tables.get(online_table_name))
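Provisioning the online table can take several minutes. Optionally, you can poll its status until it reports an online state before continuing. The status fields used below are based on the Databricks SDK's OnlineTable model at the time of writing and may differ between SDK versions:

import time

# Optional: wait until the online table reports an ONLINE state.
# The status and detailed_state fields are assumptions based on the SDK's
# OnlineTable model and may vary by SDK version.
while True:
    online_table = workspace.online_tables.get(online_table_name)
    state = online_table.status.detailed_state if online_table.status else None
    print(f"Online table state: {state}")
    if state is not None and "ONLINE" in str(state):
        break
    time.sleep(30)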
Step 3. Create a function in Unity Catalog
In this example, the function calculates the distance between the destination (whose location does not change) and the user (whose location changes frequently and is not known until the time of inference).
# Define the function. This function calculates the distance between two locations.
function_name = f"main.on_demand_demo.distance"
spark.sql(f"""
CREATE OR REPLACE FUNCTION {function_name}(latitude DOUBLE, longitude DOUBLE, user_latitude DOUBLE, user_longitude DOUBLE)
RETURNS DOUBLE
LANGUAGE PYTHON AS
$$
import math
lat1 = math.radians(latitude)
lon1 = math.radians(longitude)
lat2 = math.radians(user_latitude)
lon2 = math.radians(user_longitude)
# Earth's radius in kilometers
radius = 6371
# Haversine formula
dlat = lat2 - lat1
dlon = lon2 - lon1
a = math.sin(dlat/2)**2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon/2)**2
c = 2 * math.atan2(math.sqrt(a), math.sqrt(1-a))
distance = radius * c
return distance
$$""")
Step 4. Create a feature spec in Unity Catalog
The feature spec defines the features that the endpoint serves and their lookup keys. It also specifies any functions to apply to the retrieved features, along with the input bindings for those functions. For details, see Create a FeatureSpec.
from databricks.feature_engineering import FeatureLookup, FeatureFunction, FeatureEngineeringClient

fe = FeatureEngineeringClient()

features = [
    FeatureLookup(
        table_name=feature_table_name,
        lookup_key="destination_id"
    ),
    FeatureFunction(
        udf_name=function_name,
        output_name="distance",
        input_bindings={
            "latitude": "latitude",
            "longitude": "longitude",
            "user_latitude": "user_latitude",
            "user_longitude": "user_longitude"
        },
    ),
]

feature_spec_name = "main.on_demand_demo.travel_spec"

# The following code ignores the error raised if a feature spec with the specified name already exists.
try:
    fe.create_feature_spec(name=feature_spec_name, features=features, exclude_columns=None)
except Exception as e:
    if "already exists" in str(e):
        pass
    else:
        raise e
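The same features list can also be reused offline, for example to assemble a training set so that training and serving share one set of feature definitions. The following is a minimal sketch; training_df_raw is a hypothetical DataFrame containing destination_id, user_latitude, user_longitude, and a label column, and is not created in this article:

# Minimal sketch: reuse the same feature definitions to build a training set.
# `training_df_raw` and its `label` column are hypothetical.
training_set = fe.create_training_set(
    df=training_df_raw,
    feature_lookups=features,
    label="label",
)
training_df = training_set.load_df()
display(training_df)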
Step 5. Create a feature serving endpoint
To create a feature serving endpoint, you can use the UI (see Create an endpoint), the REST API, or the Databricks SDK, as shown here.
The feature serving endpoint takes the feature_spec that you created in Step 4 as a parameter.
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import EndpointCoreConfigInput, ServedEntityInput

# Create the endpoint. Ignore the error raised if the endpoint already exists.
endpoint_name = "fse-location"

try:
    status = workspace.serving_endpoints.create_and_wait(
        name=endpoint_name,
        config=EndpointCoreConfigInput(
            served_entities=[
                ServedEntityInput(
                    entity_name=feature_spec_name,
                    scale_to_zero_enabled=True,
                    workload_size="Small"
                )
            ]
        )
    )
    print(status)
except Exception as e:
    if "already exists" in str(e):
        pass
    else:
        raise e

# Get the status of the endpoint
status = workspace.serving_endpoints.get(name=endpoint_name)
print(status)
Step 6. Query the feature serving endpoint
When you query the endpoint, you provide the primary key and optionally any context data that the function uses. In this example, the function takes as input the user’s current location (latitude and longitude). Because the user’s location is constantly changing, it must be provided to the function at inference time as a context feature.
You can also query the endpoint using the UI (see Query an endpoint using the UI) or the REST API.
For simplicity, this example only calculates the distance to two cities. A more realistic scenario might calculate the user’s distance from each location in the feature table to determine which cities to recommend.
from pprint import pprint
import mlflow.deployments

client = mlflow.deployments.get_deploy_client("databricks")

response = client.predict(
    endpoint=endpoint_name,
    inputs={
        "dataframe_records": [
            {"destination_id": 1, "user_latitude": 37, "user_longitude": -122},
            {"destination_id": 2, "user_latitude": 37, "user_longitude": -122},
        ]
    },
)

pprint(response)
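To query the endpoint through the REST API instead, send the same payload to the endpoint's invocations URL. The sketch below uses the requests library and assumes the workspace URL and a personal access token are available in the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables, which are not set up in this article:

import os
import requests

# Minimal sketch of querying the feature serving endpoint over REST.
# DATABRICKS_HOST (for example, https://<workspace-url>) and DATABRICKS_TOKEN
# are assumed to be set in the environment.
url = f"{os.environ['DATABRICKS_HOST']}/serving-endpoints/{endpoint_name}/invocations"
headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}
payload = {
    "dataframe_records": [
        {"destination_id": 1, "user_latitude": 37, "user_longitude": -122},
    ]
}

rest_response = requests.post(url, headers=headers, json=payload)
pprint(rest_response.json())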
Additional information
For details about using the feature engineering Python API, see the reference documentation.