feature-store-time-series-example-uc(Python)

Feature Store Time Series Feature Table

In this notebook, you create time series feature tables based on simulated Internet of Things (IoT) sensor data. You then do the following:

  • Generate a training set by performing a point-in-time lookup on the time series feature tables.
  • Use the training set to train a model.
  • Register the model.
  • Perform batch inference on new sensor data.

Requirements

Your workspace must be enabled for Unity Catalog. If your workspace is not enabled for Unity Catalog, use the version of this notebook for Workspace Feature Store (AWS | Azure | GCP).

Background

The data used in this notebook is simulated to represent the following situation: you have a series of readings from a set of IoT sensors installed in different rooms of a warehouse. You want to use this data to train a model that can detect when a person has entered a room. Each room has a temperature sensor, a light sensor, and a CO2 sensor, each of which records data at a different frequency.

Generate the simulated dataset

In this step, you generate the simulated dataset and then create four Spark DataFrames, one each for the light sensors, the temperature sensors, the CO2 sensors, and the ground truth.
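The generation step might look roughly like the following sketch, written in plain Python so that it runs anywhere. The column names (`room`, `ts`, `temp`, `light`, `co2`, `person`), the room IDs, the value ranges, and the sampling periods are all assumptions made for illustration; they are not taken from the notebook's code.

```python
import random
from datetime import datetime, timedelta

random.seed(0)  # fixed seed so the sketch is reproducible

ROOMS = [0, 1]  # hypothetical room IDs
START = datetime(2024, 1, 1, 0, 0, 0)
END = datetime(2024, 1, 1, 0, 10, 0)


def simulate_sensor(name, period, low, high):
    """Return one reading per room every `period` between START and END (inclusive)."""
    rows = []
    for room in ROOMS:
        ts = START
        while ts <= END:
            rows.append({"room": room, "ts": ts, name: random.uniform(low, high)})
            ts += period
    return rows


# Each sensor type records at a different (assumed) frequency.
temp_rows = simulate_sensor("temp", timedelta(minutes=1), 18.0, 25.0)
light_rows = simulate_sensor("light", timedelta(minutes=2), 0.0, 1000.0)
co2_rows = simulate_sensor("co2", timedelta(minutes=5), 400.0, 1200.0)

# Ground truth: whether a person is in the room, sampled every 3 minutes.
ground_truth_rows = [
    {"room": room, "ts": START + timedelta(minutes=m), "person": random.randint(0, 1)}
    for room in ROOMS
    for m in range(0, 11, 3)
]
```

In a Spark environment, each list of rows could then be converted to a Spark DataFrame, for example with `spark.createDataFrame`.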

Create the time series feature tables

In this step, you create the time series feature tables. Each table uses the room as the primary key.

The time series feature tables are now visible in the Feature Store UI. The Timestamp Keys field is populated for these feature tables.

Update the time series feature tables

Suppose that after you create the feature table, you receive updated values. For example, maybe some temperature readings were incorrectly preprocessed and need to be updated in the temperature time series feature table.

When you write a DataFrame to a time series feature table, the DataFrame must specify all the features of the feature table. To update a single feature column in the time series feature table, you must first join the updated feature column with other features in the table, specifying both a primary key and a timestamp key. Then, you can update the feature table.
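The join described above can be illustrated with a small plain-Python sketch. It is not the Feature Store API. The table layout is hypothetical: `room` is assumed to be the primary key, `ts` the timestamp key, and `humidity` is an invented second feature that stands in for "the other features in the table".

```python
from datetime import datetime

# Existing rows of a hypothetical temperature feature table.
existing_rows = [
    {"room": 0, "ts": datetime(2024, 1, 1, 0, 0), "temp": 20.0, "humidity": 40.0},
    {"room": 0, "ts": datetime(2024, 1, 1, 0, 1), "temp": 99.0, "humidity": 41.0},
    {"room": 1, "ts": datetime(2024, 1, 1, 0, 0), "temp": 19.0, "humidity": 38.0},
]

# Corrected values for the single feature column being updated.
corrected_temp = [
    {"room": 0, "ts": datetime(2024, 1, 1, 0, 1), "temp": 21.5},
]


def join_update(existing, updates, feature):
    """Inner-join corrected `feature` values with the other features on (room, ts)."""
    # Index every existing row by its key, leaving out the column being replaced.
    other_features = {
        (row["room"], row["ts"]): {k: v for k, v in row.items() if k != feature}
        for row in existing
    }
    joined = []
    for upd in updates:
        key = (upd["room"], upd["ts"])
        if key in other_features:  # inner join: skip updates with no matching row
            joined.append({**other_features[key], feature: upd[feature]})
    return joined


rows_to_write = join_update(existing_rows, corrected_temp, "temp")
```

Each row in `rows_to_write` carries every feature column plus both keys, which is the shape the write requires. Matching on the timestamp key as well as the primary key ensures that only the reading at that exact time is replaced.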

Create a training set with point-in-time lookups on time series feature tables

In this step, you create a training set using the ground truth data by performing point-in-time lookups for the sensor data in the time series feature tables.

For each row of ground truth data, the point-in-time lookup retrieves the latest sensor value recorded for that row's room as of that row's timestamp.
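To make the lookup semantics concrete, here is a minimal plain-Python sketch. It is not the Feature Store implementation. It assumes that "as of" means "at or before the ground truth timestamp" and that a lookup with no earlier reading yields no value. The column names and sample rows are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import datetime


def build_index(feature_rows, feature):
    """Group readings by room and sort each group by timestamp."""
    by_room = defaultdict(list)
    for row in feature_rows:
        by_room[row["room"]].append((row["ts"], row[feature]))
    index = {}
    for room, readings in by_room.items():
        readings.sort(key=lambda reading: reading[0])
        index[room] = ([ts for ts, _ in readings], [value for _, value in readings])
    return index


def lookup_as_of(index, room, ts):
    """Return the latest value recorded for `room` at or before `ts`, or None."""
    if room not in index:
        return None
    timestamps, values = index[room]
    pos = bisect_right(timestamps, ts)  # number of readings with timestamp <= ts
    return values[pos - 1] if pos else None


temp_rows = [
    {"room": 0, "ts": datetime(2024, 1, 1, 0, 0), "temp": 20.0},
    {"room": 0, "ts": datetime(2024, 1, 1, 0, 5), "temp": 22.0},
    {"room": 1, "ts": datetime(2024, 1, 1, 0, 2), "temp": 18.5},
]
ground_truth = [
    {"room": 0, "ts": datetime(2024, 1, 1, 0, 4), "person": 0},
    {"room": 0, "ts": datetime(2024, 1, 1, 0, 5), "person": 1},
    {"room": 1, "ts": datetime(2024, 1, 1, 0, 1), "person": 0},
]

temp_index = build_index(temp_rows, "temp")
training_rows = [
    {**row, "temp": lookup_as_of(temp_index, row["room"], row["ts"])}
    for row in ground_truth
]
```

The first ground truth row (room 0 at 00:04) picks up the 00:00 reading, because the 00:05 reading lies in the future relative to that row. This is what keeps later sensor values from leaking into the training set.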

Train and register the model

Score data with point-in-time lookups on time series feature tables

The point-in-time lookup metadata provided to create the training set is packaged with the model so that the same lookup can be performed during scoring.