Tracking Examples

Note

The notebooks assume that you have a /Shared/experiments folder.

  1. Go to the Shared folder (see Special folders).
  2. If you do not have an experiments subfolder, select Create > Folder.
  3. Enter experiments.
  4. Click Create Folder.
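
Notebooks can then create or reuse MLflow experiments under that folder. A minimal sketch, where the experiment name demo is an illustrative assumption rather than a name from the notebooks:

    import mlflow

    # Assumes the /Shared/experiments folder created in the steps above;
    # the experiment name "demo" is an illustrative assumption.
    mlflow.set_experiment("/Shared/experiments/demo")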

Train a scikit-learn model and save in scikit-learn format

This notebook, based on the MLflow tutorial, shows how to:

  • Install MLflow on a Databricks cluster
  • Train a scikit-learn ElasticNet model on a diabetes dataset and log the training metrics, parameters, and model artifacts to a Databricks-hosted tracking server (see the sketch after this list)
  • View the training results in the MLflow experiment UI
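
The following is a minimal sketch of that flow, not the notebook's code: the train/test split, the hyperparameter values, and the RMSE metric are illustrative assumptions.

    import mlflow
    import mlflow.sklearn
    import numpy as np
    from sklearn.datasets import load_diabetes
    from sklearn.linear_model import ElasticNet
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split

    # Illustrative split and hyperparameters; the tutorial's values may differ.
    X, y = load_diabetes(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
    alpha, l1_ratio = 0.05, 0.5

    with mlflow.start_run():
        model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
        model.fit(X_train, y_train)
        rmse = np.sqrt(mean_squared_error(y_test, model.predict(X_test)))

        # Log parameters, metrics, and the fitted model to the tracking server.
        mlflow.log_param("alpha", alpha)
        mlflow.log_param("l1_ratio", l1_ratio)
        mlflow.log_metric("rmse", rmse)
        mlflow.sklearn.log_model(model, "model")

Because the model is logged with mlflow.sklearn.log_model, it is saved in scikit-learn format and can later be reloaded with mlflow.sklearn.load_model.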

To learn how to deploy the trained model on AWS SageMaker, see scikit-learn model deployment on SageMaker.

Train a PyTorch model

PyTorch is a Python package that provides GPU-accelerated tensor computation and high-level functionality for building deep learning networks.

The MLflow PyTorch notebook fits a neural network on MNIST handwritten digit recognition data and logs the run results to an MLflow server. Training metrics and weights in TensorFlow event format are logged locally and then uploaded to the MLflow run's artifact directory. Finally, TensorBoard is started and reads the locally logged events.
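
A minimal sketch of this pattern, not the notebook's code: the network architecture, epoch count, and log directory are illustrative assumptions.

    import mlflow
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torch.utils.tensorboard import SummaryWriter
    from torchvision import datasets, transforms

    # Local directory for TensorFlow-format event files; path is an assumption.
    log_dir = "/tmp/tb_events"
    writer = SummaryWriter(log_dir)

    train_loader = torch.utils.data.DataLoader(
        datasets.MNIST("/tmp/data", train=True, download=True,
                       transform=transforms.ToTensor()),
        batch_size=64, shuffle=True)

    # Illustrative network; the notebook's architecture may differ.
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128),
                          nn.ReLU(), nn.Linear(128, 10))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)

    with mlflow.start_run():
        for epoch in range(2):  # illustrative epoch count
            for data, target in train_loader:
                optimizer.zero_grad()
                loss = F.cross_entropy(model(data), target)
                loss.backward()
                optimizer.step()
            # Log the metric to MLflow and as a TensorBoard event,
            # plus weight histograms, mirroring the notebook's pattern.
            mlflow.log_metric("loss", loss.item(), step=epoch)
            writer.add_scalar("loss", loss.item(), epoch)
            for name, param in model.named_parameters():
                writer.add_histogram(name, param.detach(), epoch)
        writer.close()
        # Upload the locally written event files to the run's artifact directory.
        mlflow.log_artifacts(log_dir, artifact_path="events")

TensorBoard can then be pointed at either the local log_dir or the uploaded artifacts to visualize the run.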

This example runs on Databricks Runtime 5.0 ML and above (see Overview of Databricks Runtime for Machine Learning). To install PyTorch on a cluster running Databricks Runtime 5.0 ML, run the PyTorch Init Script notebook to create an init script named pytorch-gpu-init.sh, then configure your cluster to use it. On Databricks Runtime 5.1 ML (Beta) or above, you do not need to create the init script or configure your cluster with it.

If you want to run TensorBoard to read the artifacts uploaded to S3, see How to run TensorFlow on S3.