%md # MLflow example (Python) (MLflow 3.0)

With MLflow's autologging capabilities, a single line of code automatically logs the resulting model, the parameters used to create the model, and model metrics. MLflow autologging is available for several widely used machine learning packages. This notebook creates a Random Forest model on a simple dataset and uses the MLflow `autolog()` function to log information generated by the run.

This tutorial leverages features from MLflow 3.0. For more details, see "Get started with MLflow 3.0" ([AWS](https://docs.databricks.com/aws/en/mlflow/mlflow-3-install)|[Azure](https://learn.microsoft.com/en-us/azure/databricks/mlflow/mlflow-3-install)|[GCP](https://docs.databricks.com/gcp/en/mlflow/mlflow-3-install)).

For details about what information is logged with `autolog()`, refer to the [MLflow documentation](https://mlflow.org/docs/latest/index.html).
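%md MLflow also provides a framework-agnostic `mlflow.autolog()` call that enables autologging for every supported library it detects, in addition to the flavor-specific `mlflow.sklearn.autolog()` used later in this notebook. A minimal sketch:

# Minimal sketch: enable autologging for all supported libraries that are installed
# (scikit-learn, XGBoost, LightGBM, and others)
import mlflow

mlflow.autolog()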
%md Install the `mlflow` library, upgrading to MLflow 3.0.
# Upgrade to the latest MLflow version to use MLflow 3.0 features
%pip install "mlflow>=3.0" --upgrade
dbutils.library.restartPython()
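%md (Optional) Confirm that the upgrade took effect after the Python restart by checking the installed library version. This is a quick sanity check, not part of the core tutorial.

# Optional sanity check: confirm that MLflow 3.0 or later is active
import mlflow

print(mlflow.__version__)  # Expect 3.0.0 or higher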
%md Import the required libraries.
import mlflow
import mlflow.sklearn
import pandas as pd
import matplotlib.pyplot as plt
from numpy import savetxt

from sklearn.model_selection import train_test_split
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
%md Import the dataset from scikit-learn and create the training and test datasets.
db = load_diabetes()
X = db.data
y = db.target
X_train, X_test, y_train, y_test = train_test_split(X, y)
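%md (Optional) Before training, you can inspect the feature names exposed by the scikit-learn dataset object and the shapes of the split. A minimal sketch:

# Optional inspection of the dataset and the train/test split
print(db.feature_names)             # The ten baseline diabetes features
print(X_train.shape, X_test.shape)  # train_test_split defaults to a 75/25 split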
%md Create a random forest model and log parameters, metrics, and the model using `mlflow.sklearn.autolog()`.
# Enable autolog()
mlflow.sklearn.autolog()

# With autolog() enabled, a logged model is automatically created under the experiment.
# All parameters and metrics are automatically logged to both the model and run.
with mlflow.start_run():
  # Set the model parameters.
  n_estimators = 100
  max_depth = 6
  max_features = 3

  # Create and train model.
  rf = RandomForestRegressor(n_estimators=n_estimators, max_depth=max_depth, max_features=max_features)
  rf.fit(X_train, y_train)

  # Use the model to make predictions on the test dataset.
  predictions = rf.predict(X_test)
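%md Because autologging logged the trained model, you can load it back and evaluate it outside the training run. The sketch below assumes the model was logged under the default `model` artifact path of the most recent run; adjust the model URI if your setup differs.

# Load the autologged model from the most recent run and score it on the test set
run_id = mlflow.last_active_run().info.run_id
loaded_model = mlflow.sklearn.load_model(f"runs:/{run_id}/model")
rmse = mean_squared_error(y_test, loaded_model.predict(X_test)) ** 0.5
print(f"Test RMSE: {rmse:.2f}")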
%md To view the results, click the **Experiments** icon <img src="https://docs.databricks.com/_static/images/icons/experiment.png"/> in the right sidebar. This sidebar displays the parameters and metrics for each run of this notebook.

Click the name of the run to open the Runs page in a new tab. This page shows all of the information that was logged from the run. Select the **Artifacts** tab to browse the logged artifacts.

From the experiments page, switch to the **Models** tab to view the logged model that was created, along with all relevant metadata such as parameters and metrics.

For more information, see "MLflow experiments" ([AWS](https://docs.databricks.com/aws/en/mlflow/experiments)|[Azure](https://learn.microsoft.com/en-us/azure/databricks/mlflow/experiments)|[GCP](https://docs.databricks.com/gcp/en/mlflow/experiments)).
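%md As an alternative to browsing the UI, you can query the logged runs programmatically with `mlflow.search_runs()`, which returns a pandas DataFrame of runs from the active experiment; the exact parameter and metric columns depend on what autologging recorded. A minimal sketch:

# Query the runs in this notebook's experiment, most recent first
runs_df = mlflow.search_runs(order_by=["start_time DESC"])
print(runs_df[["run_id", "status", "start_time"]].head())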