Feature Store workflow overview

This page gives an overview of how to use Databricks Feature Store in a machine learning workflow for both online and batch use cases.

The typical machine learning workflow using Feature Store follows this path:

  1. Write code to convert raw data into features and create a Spark DataFrame containing the desired features.

  2. Write the DataFrame as a feature table in Feature Store.

  3. Train a model using features from Feature Store. When you do this, the model records the specifications of the features used for training. When the model is later used for inference, it automatically looks up and joins features from the appropriate feature tables.

  4. Register the model in Model Registry.

You can now use the model to make predictions on new data.
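The four steps above can be sketched with the Feature Store Python client. This is a minimal, illustrative sketch that assumes a Databricks notebook environment with a Spark session; the table name, column names, and model name (`recommender.customer_features`, `customer_id`, `churn_model`, and so on) are hypothetical placeholders, not names from this page.

```python
# Sketch of the Feature Store workflow; assumes a Databricks notebook
# environment. All table, column, and model names are hypothetical.
from databricks.feature_store import FeatureStoreClient, FeatureLookup
import mlflow.sklearn
from sklearn.linear_model import LogisticRegression

fs = FeatureStoreClient()

# 1-2. Convert raw data into features, then write the DataFrame
#      as a feature table keyed by customer_id.
features_df = compute_customer_features(raw_df)  # your feature logic
fs.create_table(
    name="recommender.customer_features",
    primary_keys=["customer_id"],
    df=features_df,
    description="Aggregated customer features",
)

# 3. Build a training set; the FeatureLookup records which features
#    the model was trained on, so inference can re-join them later.
training_set = fs.create_training_set(
    df=labels_df,  # contains the lookup key and the label column
    feature_lookups=[
        FeatureLookup(
            table_name="recommender.customer_features",
            lookup_key="customer_id",
        )
    ],
    label="churned",
)
train_pdf = training_set.load_df().toPandas()
model = LogisticRegression().fit(
    train_pdf.drop(columns=["churned"]), train_pdf["churned"]
)

# 4. Log the model with its feature specification and register it
#    in Model Registry in one call.
fs.log_model(
    model,
    artifact_path="model",
    flavor=mlflow.sklearn,
    training_set=training_set,
    registered_model_name="churn_model",
)
```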

For batch use cases, the model automatically retrieves the features it needs from Feature Store.

Figure: Feature Store workflow for batch machine learning use cases.
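For batch inference, `score_batch` performs that automatic retrieval: the input DataFrame only needs the lookup keys, and the client joins the stored features before calling the model. A brief sketch, again assuming a Databricks environment; the model URI and key column are hypothetical.

```python
# Batch scoring sketch: the client looks up the features recorded at
# training time and joins them onto batch_df automatically.
# Model name and key column are hypothetical.
from databricks.feature_store import FeatureStoreClient

fs = FeatureStoreClient()

# batch_df needs only the lookup key (customer_id), not the features.
predictions = fs.score_batch("models:/churn_model/1", batch_df)
```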

For real-time serving use cases, publish the features to an online store.

At inference time, the model reads pre-computed features from the online feature store and joins them with the data provided in the client request to the model serving endpoint.

Figure: Feature Store flow for machine learning models that are served.
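Publishing a feature table to an online store can be sketched as follows. This assumes a Databricks environment with a configured online store; the DynamoDB spec, region, and table name are illustrative, and other online stores use analogous spec classes.

```python
# Sketch of publishing a feature table for real-time serving.
# The online store choice and table name are illustrative assumptions.
from databricks.feature_store import FeatureStoreClient
from databricks.feature_store.online_store_spec import AmazonDynamoDBSpec

fs = FeatureStoreClient()

online_store = AmazonDynamoDBSpec(region="us-west-2")
fs.publish_table(
    name="recommender.customer_features",
    online_store=online_store,
    mode="merge",  # upsert changed rows rather than overwriting the table
)
```

Once published, a model served behind a model serving endpoint reads these pre-computed features by key at request time and joins them with the payload of the client request.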

Example notebooks

The Basic Feature Store example notebook steps you through how to create a feature table, use it to train a model, and then perform batch scoring using automatic feature lookup. It also introduces the Feature Store UI and shows how to use it to search for features and understand how features are created and used.

Basic Feature Store example notebook


The Feature Store taxi example notebook illustrates the process of creating features, updating them, and using them for model training and batch inference.

Feature Store taxi example notebook
