On-demand features example notebook (Python)


On-demand features - simple demo

This notebook trains and scores a model that uses on-demand features.

Requirements:

  • Databricks Runtime 13.2 ML or above
  • Cluster Access Mode: Single user (Assigned)

Helper functions and notebook variables

Setup


The Python UDF can be called from SQL. For example:

    main.on_demand_demo.age(TIMESTAMP '1992-10-09 00:00:00')
    31

    1 row
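The UDF body is not shown in this notebook, so the following is only a minimal sketch of logic that would return the value above. It assumes the function computes age in whole years, and it uses 2024-02-08 (the date in this run's log timestamps) as the reference date. The `today` parameter is hypothetical and exists only to make the example reproducible.

```python
from datetime import date, datetime


def age(date_of_birth, today=None):
    """Return age in whole years as of `today` (defaults to the current date)."""
    today = today or date.today()
    # Subtract one year if this year's birthday has not happened yet.
    before_birthday = (today.month, today.day) < (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - int(before_birthday)


# Assumption: the notebook was run on 2024-02-08, as the log timestamps suggest.
run_date = date(2024, 2, 8)
print(age(datetime(1992, 10, 9), today=run_date))  # 31
```

Pinning the reference date keeps the result stable; a UDF that reads the current date returns a larger value when rerun later.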

Create a TrainingSet with on-demand features

    date_of_birth         age  label
    1959-02-10T00:00:00Z  64   true
    1990-06-23T00:00:00Z  33   false
    1992-10-09T00:00:00Z  31   true

    3 rows
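The cell that produced this table is not included here. As a hedged sketch, a TrainingSet with an on-demand feature can be declared with `FeatureFunction` from `databricks.feature_engineering`. The binding below assumes that the UDF parameter and the DataFrame column are both named `date_of_birth`. The helper names `age_feature_spec` and `build_training_set` are hypothetical.

```python
def age_feature_spec():
    """Keyword arguments describing the on-demand feature (names partly assumed)."""
    return {
        "udf_name": "main.on_demand_demo.age",  # UDF name used in this notebook
        # Assumption: the UDF parameter and the input column share this name.
        "input_bindings": {"date_of_birth": "date_of_birth"},
        "output_name": "age",
    }


def build_training_set(labeled_df):
    """Create a TrainingSet; this call only works on a Databricks ML runtime."""
    # Imported lazily so the sketch can be loaded outside Databricks.
    from databricks.feature_engineering import FeatureEngineeringClient, FeatureFunction

    fe = FeatureEngineeringClient()
    return fe.create_training_set(
        df=labeled_df,  # Spark DataFrame with date_of_birth and label columns
        feature_lookups=[FeatureFunction(**age_feature_spec())],
        label="label",
    )
```

On Databricks, `build_training_set(df).load_df()` would return a DataFrame with the computed `age` column alongside the input columns.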

Log a simple model using the TrainingSet

This example uses a hard-coded model for simplicity. In practice, you'll log a model trained on the generated TrainingSet.
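The model code itself is not shown, so the following is a sketch of what a hard-coded model could look like. The rule `age >= 18` is an assumption chosen only because it is consistent with the predictions in this notebook's scoring output; the real rule may differ. `HardCodedModel` and `log_model_with_training_set` are hypothetical names.

```python
try:
    # On Databricks, subclass the MLflow pyfunc base class.
    from mlflow.pyfunc import PythonModel
except ImportError:
    # Fallback so the sketch still loads where MLflow is not installed.
    PythonModel = object


class HardCodedModel(PythonModel):
    """Predicts True when age is at least 18 (assumed threshold)."""

    def predict(self, context, model_input):
        # model_input only needs an iterable "age" column (pandas DataFrame or dict).
        return [int(a) >= 18 for a in model_input["age"]]


def log_model_with_training_set(training_set, registered_model_name):
    """Log the model with feature metadata; only works on Databricks."""
    import mlflow
    from databricks.feature_engineering import FeatureEngineeringClient

    FeatureEngineeringClient().log_model(
        model=HardCodedModel(),
        artifact_path="model",
        flavor=mlflow.pyfunc,
        training_set=training_set,  # packages the on-demand feature with the model
        registered_model_name=registered_model_name,
    )
```

Logging through the feature engineering client, rather than plain MLflow, is what records the feature definitions needed to recompute `age` at inference time.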

    2024/02/08 20:57:02 INFO mlflow.store.artifact.cloud_artifact_repo: The progress bar can be disabled by setting the environment variable MLFLOW_ENABLE_ARTIFACTS_PROGRESS_BAR to false
    Successfully registered model 'main.on_demand_demo.simple_model_3293533912'.
    Created version '1' of model 'main.on_demand_demo.simple_model_3293533912'.

Score the model using score_batch

    2024/02/08 20:57:24 WARNING mlflow.pyfunc: Calling `spark_udf()` with `env_manager="local"` does not recreate the same environment that was used during training, which may lead to errors or inaccurate predictions. We recommend specifying `env_manager="conda"`, which automatically recreates the environment that was used to train the model and performs inference in the recreated environment.
    2024/02/08 20:57:25 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'
    id  date_of_birth         age  prediction
    4   2010-08-01T00:00:00Z  13   false
    5   1983-05-01T00:00:00Z  40   true

    2 rows
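The scoring input contains only `id` and `date_of_birth`; `score_batch` computes `age` before calling the model. The sketch below mimics that flow locally under two assumptions: age is counted in whole years as of 2024-02-08 (the date in the logs), and the model predicts `age >= 18`. `simulate_score_batch` and `score_on_databricks` are hypothetical helpers, and the model URI is built from the registered name and version reported in the logs.

```python
from datetime import date, datetime


def age(date_of_birth, today):
    """Age in whole years as of `today`."""
    before_birthday = (today.month, today.day) < (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - int(before_birthday)


def simulate_score_batch(rows, today, min_age=18):
    """Local stand-in: compute the on-demand feature, then apply the assumed rule."""
    scored = []
    for row in rows:
        years = age(row["date_of_birth"], today)
        scored.append({**row, "age": years, "prediction": years >= min_age})
    return scored


def score_on_databricks(batch_df, model_uri="models:/main.on_demand_demo.simple_model_3293533912/1"):
    """Real scoring call; only works on a Databricks ML runtime."""
    from databricks.feature_engineering import FeatureEngineeringClient

    return FeatureEngineeringClient().score_batch(model_uri=model_uri, df=batch_df)


rows = [
    {"id": 4, "date_of_birth": datetime(2010, 8, 1)},
    {"id": 5, "date_of_birth": datetime(1983, 5, 1)},
]
for r in simulate_score_batch(rows, today=date(2024, 2, 8)):
    print(r["id"], r["age"], r["prediction"])
# 4 13 False
# 5 40 True
```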

Cleanup