
MLflow logging API example (Python)

This notebook illustrates how to use the MLflow logging API to start an MLflow run and log the model, model parameters, evaluation metrics, and other artifacts to that run. The easiest way to get started with MLflow tracking in Python is the mlflow.autolog() API. If you need more control over the metrics logged for each training run, or want to log additional artifacts such as tables or plots, use the mlflow.log_metric() and mlflow.log_artifact() APIs demonstrated in this notebook.
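For reference, autologging requires only a single call before training. The scikit-learn model and dataset in this sketch are illustrative placeholders, not part of this notebook's walkthrough:

```python
import mlflow
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# One call enables automatic logging of parameters, metrics,
# and models for supported libraries such as scikit-learn
mlflow.autolog()

db = load_diabetes()
with mlflow.start_run():
    # The fit() call is captured by autologging; no explicit log calls needed
    RandomForestRegressor(n_estimators=100).fit(db.data, db.target)
```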

Setup

  • If you are using a cluster running Databricks Runtime, you must install the mlflow library from PyPI. See Cmd 3.
  • If you are using a cluster running Databricks Runtime ML, the mlflow library is already installed.

This notebook creates a random forest model on a simple dataset and uses the MLflow Tracking API to log the model and selected model parameters and metrics.

Install the mlflow library. This is required for Databricks Runtime clusters only. If you are using a cluster running Databricks Runtime ML, skip to Cmd 4.
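A minimal sketch of the install step, assuming a notebook-scoped %pip install:

```python
# Install MLflow from PyPI for this notebook session.
# Not needed on Databricks Runtime ML, where mlflow is preinstalled.
%pip install mlflow
```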

Import the required libraries.
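One plausible set of imports for the steps that follow (the exact list depends on how you build the model and plots):

```python
import mlflow
import mlflow.sklearn
import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
```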

Import the dataset from scikit-learn and create the training and test datasets.
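For example, using scikit-learn's diabetes dataset (chosen here for illustration; any regression dataset works) with an assumed 80/20 split:

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

# Load the dataset and hold out 20% of the rows for evaluation
db = load_diabetes()
X_train, X_test, y_train, y_test = train_test_split(
    db.data, db.target, test_size=0.2, random_state=42
)
```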

Create a random forest model and log the model, model parameters, evaluation metrics, and other artifacts using mlflow.log_param(), mlflow.log_metric(), mlflow.sklearn.log_model(), and mlflow.log_artifact(). These functions let you control exactly which parameters and metrics are logged, and they also let you log other artifacts of the run, such as tables and plots.
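A sketch of that cell, building on the training and test sets from the previous step; the hyperparameter values, metric, and plot are illustrative choices, not prescribed by MLflow:

```python
import matplotlib.pyplot as plt
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

# Hyperparameter values chosen for illustration only
n_estimators = 100
max_depth = 6

with mlflow.start_run():
    # Train the model and generate predictions on the test set
    rf = RandomForestRegressor(n_estimators=n_estimators, max_depth=max_depth)
    rf.fit(X_train, y_train)
    predictions = rf.predict(X_test)

    # Log the hyperparameters used for this run
    mlflow.log_param("num_trees", n_estimators)
    mlflow.log_param("max_depth", max_depth)

    # Log an evaluation metric
    mlflow.log_metric("mse", mean_squared_error(y_test, predictions))

    # Log the fitted model under the artifact path "random-forest-model"
    mlflow.sklearn.log_model(rf, "random-forest-model")

    # Save a predicted-vs-actual plot and log it as a run artifact
    fig, ax = plt.subplots()
    ax.scatter(y_test, predictions)
    ax.set_xlabel("Actual")
    ax.set_ylabel("Predicted")
    fig.savefig("predictions.png")
    mlflow.log_artifact("predictions.png")
    plt.close(fig)
```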

To view the results, click the Experiments icon in the right sidebar. This sidebar displays the parameters and metrics for each run of this notebook.

Click the name of the run to open the Runs page in a new tab. This page shows all of the information that was logged from the run. Select the Artifacts tab to find the logged model and plot.

For more information, see "MLflow experiments" (AWS|Azure|GCP).