mlflow-quick-start-inference(Python)

MLflow tutorial: inference

This notebook shows how to load a model previously logged to MLflow and use it to make predictions on data in different formats. The notebook includes two examples of applying the model:

  • as a scikit-learn model to a pandas DataFrame
  • as a PySpark UDF to a Spark DataFrame

Requirements

  • If you are using a cluster running Databricks Runtime, you must install MLflow. See "Install a library on a cluster" (AWS|Azure|GCP). Select Library Source PyPI and enter mlflow in the Package field.
  • If you are using a cluster running Databricks Runtime ML, MLflow is already installed.
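
To confirm from a notebook cell that MLflow is importable on the attached cluster, something like the following can help. This is a minimal sketch that uses only the Python standard library; the helper name is_installed is hypothetical and not part of MLflow or Databricks.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found on the current Python path."""
    return importlib.util.find_spec(package_name) is not None


if not is_installed("mlflow"):
    # Databricks Runtime (non-ML) clusters need the library installed from PyPI.
    print("MLflow is not installed on this cluster; install it from PyPI first.")
```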

Prerequisite

  • This notebook uses the ElasticNet models from MLflow tutorial part 1: training and logging (AWS|Azure|GCP).

Find and copy the run ID of the run that created the model

Find and copy a run ID associated with an ElasticNet training run from the MLflow tutorial part 1: training and logging notebook. The run ID appears on the run details page; it is a 32-character alphanumeric string shown after the label "Run".

To navigate to the run details page for the MLflow tutorial part 1: training and logging notebook, open that notebook and click Experiment in the upper right corner. The Experiments sidebar displays. Do one of the following:

  • In the Experiments sidebar, click the icon at the far right of the date and time of the run. The run details page appears in a new tab.

  • Click the square icon with the arrow to the right of Experiment Runs. The Experiment page displays in a new tab. This page lists all of the runs associated with this notebook. To display the run details page for a particular run, click the link in the Start Time column for that run.

For more information, see "View notebook experiment" (AWS|Azure|GCP).
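
Once you have copied the run ID, it can be turned into a model URI that the MLflow loading functions accept. The sketch below checks that the pasted value is a 32-character alphanumeric string, as described above, and builds a runs:/ URI from it. The helper name build_model_uri is hypothetical, and the artifact path "model" is an assumption; use the artifact path shown on the run details page if it differs.

```python
import re

# A run ID is described above as a 32-character alphanumeric string.
RUN_ID_PATTERN = re.compile(r"[0-9A-Za-z]{32}")


def build_model_uri(run_id, artifact_path="model"):
    """Build a "runs:/<run_id>/<artifact_path>" URI for a model logged by a run.

    artifact_path="model" is an assumption about how the model was logged.
    """
    run_id = run_id.strip()  # tolerate whitespace picked up when copying from the UI
    if not RUN_ID_PATTERN.fullmatch(run_id):
        raise ValueError("Expected a 32-character alphanumeric run ID, got: %r" % run_id)
    return "runs:/{}/{}".format(run_id, artifact_path)
```

Validating the ID up front turns a paste error into an immediate, readable exception rather than a failed lookup later.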

Load the model as a scikit-learn model

Use the MLflow API to load the model that the run logged to the MLflow server. After loading the model, you can use it just like any other scikit-learn model.

/databricks/python/lib/python3.5/site-packages/sklearn/base.py:315: UserWarning: Trying to unpickle estimator ElasticNet from version 0.20.3 when using version 0.18.1. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
Out[4]: array([ 7.80465607, -218.53130246, 547.79513025, 307.56466261, -467.74747431, 172.17777619, -68.46598014, 104.95635075, 655.65477181, 25.36356966])

Out[6]: array([ 209.99417003])
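
A minimal sketch of loading the model and applying it to a pandas DataFrame might look like the following. It assumes the model was logged with the scikit-learn flavor under the artifact path "model" and that the input features come from the scikit-learn diabetes dataset; both are assumptions to verify against your own run. The helper names are hypothetical; mlflow.sklearn.load_model and the runs:/ URI scheme are part of the MLflow API.

```python
def load_logged_sklearn_model(run_id, artifact_path="model"):
    """Load a scikit-learn model that a run logged to MLflow.

    artifact_path="model" is an assumption; use the path shown on the run details page.
    """
    # Imported here so that the helper can be defined before MLflow is installed.
    import mlflow.sklearn

    model_uri = "runs:/{}/{}".format(run_id, artifact_path)
    return mlflow.sklearn.load_model(model_uri)


def predict_with_logged_model(run_id, features, artifact_path="model"):
    """Apply the logged model to a pandas DataFrame (or array) of feature rows."""
    model = load_logged_sklearn_model(run_id, artifact_path)
    # The loaded object is a regular scikit-learn estimator, so predict() works as usual.
    return model.predict(features)


# Example usage (replace <run-id> with the run ID you copied):
#   from sklearn import datasets
#   import pandas as pd
#   diabetes = datasets.load_diabetes()
#   features = pd.DataFrame(diabetes.data, columns=diabetes.feature_names)
#   predictions = predict_with_logged_model("<run-id>", features)
```

If the scikit-learn version on the cluster differs from the one used for training, expect an unpickling warning like the one shown above.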

Create a PySpark UDF and use it for batch inference

In this section, you use the MLflow API to create a PySpark UDF from the model you saved to MLflow. For more information, see "Export a python_function model as an Apache Spark UDF".

Wrapping the model in a PySpark UDF lets you run it to make predictions on a Spark DataFrame.

Use the Spark function withColumn() to apply the PySpark UDF to the DataFrame and return a new DataFrame with a prediction column.
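
The two steps in this section can be sketched as a single helper, shown below. It assumes the model was logged under the artifact path "model" and that the Spark DataFrame contains the feature columns the model was trained on; the helper name add_prediction_column and the output column name "prediction" are illustrative choices. mlflow.pyfunc.spark_udf and DataFrame.withColumn are existing MLflow and PySpark calls.

```python
def add_prediction_column(spark, df, run_id, feature_columns, artifact_path="model"):
    """Return a new Spark DataFrame with a "prediction" column added.

    artifact_path="model" is an assumption; use the path shown on the run details page.
    """
    # Imported here so that the helper can be defined before MLflow is installed.
    import mlflow.pyfunc

    model_uri = "runs:/{}/{}".format(run_id, artifact_path)
    # spark_udf wraps the logged model as a Spark UDF that returns one prediction per row.
    predict_udf = mlflow.pyfunc.spark_udf(spark, model_uri)
    # Pass the feature columns by name, in the order the model was trained on.
    return df.withColumn("prediction", predict_udf(*feature_columns))


# Example usage in a Databricks notebook, where `spark` is predefined
# (replace <run-id> with the run ID you copied; `features` is a pandas DataFrame of inputs):
#   spark_df = spark.createDataFrame(features)
#   predicted_df = add_prediction_column(spark, spark_df, "<run-id>", list(features.columns))
#   display(predicted_df)
```

Because withColumn() returns a new DataFrame, the original Spark DataFrame is left unchanged.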

row | age | sex | bmi | bp | s1 | s2 | s3 | s4
1 | 0.0380759064334241 | 0.0506801187398187 | 0.0616962065186885 | 0.0218723549949558 | -0.0442234984244464 | -0.0348207628376986 | -0.0434008456520269 | -0.00259226199818282
2 | -0.00188201652779104 | -0.044641636506989 | -0.0514740612388061 | -0.0263278347173518 | -0.00844872411121698 | -0.019163339748222 | 0.0744115640787594 | -0.0394933828740919
3 | 0.0852989062966783 | 0.0506801187398187 | 0.0444512133365941 | -0.00567061055493425 | -0.0455994512826475 | -0.0341944659141195 | -0.0323559322397657 | -0.00259226199818282
4 | -0.0890629393522603 | -0.044641636506989 | -0.0115950145052127 | -0.0366564467985606 | 0.0121905687618 | 0.0249905933641021 | -0.0360375700438527 | 0.0343088588777263
5 | 0.00538306037424807 | -0.044641636506989 | -0.0363846922044735 | 0.0218723549949558 | 0.00393485161259318 | 0.0155961395104161 | 0.0081420836051921 | -0.00259226199818282
6 | -0.0926954778032799 | -0.044641636506989 | -0.0406959404999971 | -0.0194420933298793 | -0.0689906498720667 | -0.0792878444118122 | 0.0412768238419757 | -0.076394503750001
7 | -0.0454724779400257 | 0.0506801187398187 | -0.0471628129432825 | -0.015999222636143 | -0.040095639849843 | -0.0248000120604336 | 0.000778807997017968 | -0.0394933828740919
8 | 0.063503675590561 | 0.0506801187398187 | -0.00189470584028465 | 0.0666296740135272 | 0.0906198816792644 | 0.108914381123697 | 0.0228686348215404 | 0.0177033544835672
9 | 0.0417084448844436 | 0.0506801187398187 | 0.0616962065186885 | -0.0400993174922969 | -0.0139525355440215 | 0.00620168565673016 | -0.0286742944356786 | -0.00259226199818282
10 | -0.0709002470971626 | -0.044641636506989 | 0.0390621529671896 | -0.0332135761048244 | -0.0125765826858204 | -0.034507614375909 | -0.0249926566315915 | -0.00259226199818282
11 | -0.0963280162542995 | -0.044641636506989 | -0.0838084234552331 | 0.0081008722200108 | -0.103389471327095 | -0.0905611890362353 | -0.0139477432193303 | -0.076394503750001
12 | 0.0271782910803654 | 0.0506801187398187 | 0.0175059114895716 | -0.0332135761048244 | -0.00707277125301585 | 0.0459715403040008 | -0.0654906724765493 | 0.0712099797536354
13 | 0.0162806757273067 | -0.044641636506989 | -0.0288400076873072 | -0.00911348124867051 | -0.00432086553661359 | -0.00976888589453599 | 0.0449584616460628 | -0.0394933828740919
14 | 0.00538306037424807 | 0.0506801187398187 | -0.00189470584028465 | 0.0081008722200108 | -0.00432086553661359 | -0.0157187066685371 | -0.0029028298070691 | -0.00259226199818282
15 | 0.0453409833354632 | -0.044641636506989 | -0.0256065714656645 | -0.0125563519424068 | 0.0176943801946045 | -0.0000612835790604833 | 0.0817748396869335 | -0.0394933828740919
16 | -0.0527375548420648 | 0.0506801187398187 | -0.0180618869484982 | 0.0804011567884723 | 0.0892439288210632 | 0.107661787276539 | -0.0397192078479398 | 0.108111100629544

442 rows (first 16 shown)