MLflow quickstart (Scala)

MLflow quickstart: tracking

This notebook is based on the MLflow quickstart example. This quickstart:

  • Creates an MLflow context and experiment
  • Starts an MLflow run
  • Logs parameters, metrics, and a file to the run

Setup

If you are running Databricks Runtime for Machine Learning, MLflow is already installed and no setup is required. If you are running Databricks Runtime, follow these steps to install the MLflow libraries.

  1. Create a library with Source PyPI and enter mlflow.
  2. Create a library with Source Maven and enter org.mlflow:mlflow-client:2.0.0.
  3. Install both libraries on the cluster.
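
Once the libraries are installed, one way to confirm that the Maven client is visible to the notebook is to try loading one of its classes. This is only a sketch: `classAvailable` is a hypothetical helper, not part of MLflow or Databricks, and the class name checked is the `MlflowContext` class imported in this notebook.

```scala
import scala.util.Try

// Hypothetical helper: true if the named class can be loaded from the classpath.
def classAvailable(className: String): Boolean =
  Try(Class.forName(className)).isSuccess

// Check for the MLflow Java client class used later in this notebook.
val mlflowClientInstalled = classAvailable("org.mlflow.tracking.MlflowContext")
println("MLflow client available: " + mlflowClientInstalled)
```

If the check reports `false`, the Maven library is probably not attached to the cluster the notebook is running on.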

import org.mlflow.tracking.ActiveRun
import org.mlflow.tracking.MlflowContext
import java.io.PrintWriter

Create MLflow context

val mlflowContext = new MlflowContext()

Get client and create experiment

val experimentName = "/Shared/Quickstart"
val client = mlflowContext.getClient()
// getExperimentByName returns a java.util.Optional; create the experiment only if it is missing
val experimentOpt = client.getExperimentByName(experimentName)
if (!experimentOpt.isPresent) {
  client.createExperiment(experimentName)
}
mlflowContext.setExperimentName(experimentName)
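
The lookup-then-create logic above could be wrapped in a small helper that also returns the experiment ID. The sketch below is illustrative only: `getOrCreateExperimentId` is a hypothetical name, not part of the MLflow API, and the code assumes the `org.mlflow:mlflow-client` library from the setup steps is attached and a tracking server is reachable.

```scala
import org.mlflow.tracking.{MlflowClient, MlflowContext}

// Hypothetical helper (not part of MLflow): return the ID of the named
// experiment, creating the experiment first if it does not exist yet.
def getOrCreateExperimentId(client: MlflowClient, name: String): String = {
  val existing = client.getExperimentByName(name) // java.util.Optional
  if (existing.isPresent()) existing.get().getExperimentId()
  else client.createExperiment(name) // createExperiment returns the new ID
}

val context = new MlflowContext()
val experimentId = getOrCreateExperimentId(context.getClient(), "/Shared/Quickstart")
context.setExperimentName("/Shared/Quickstart")
println("Experiment ID: " + experimentId)
```

Returning the ID keeps the helper useful for code that talks to `MlflowClient` directly, where most calls take an experiment ID rather than a name.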

Use the MLflow Tracking API

Use the MLflow Tracking API to start a run and log parameters, metrics, and artifacts (files) from your data science code.

Start run and log parameters, metrics, and file

import java.nio.file.Paths
val run = mlflowContext.startRun("run")
// Log a parameter (key-value pair)
run.logParam("param1", "5")

// Log a metric; metrics can be updated throughout the run
run.logMetric("foo", 2.0, 1)
run.logMetric("foo", 4.0, 2)
run.logMetric("foo", 6.0, 3)

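// Write a small text file, then log it to the run as an artifact
val writer = new PrintWriter("/tmp/output.txt")
try writer.write("Hello, world!") finally writer.close()
run.logArtifact(Paths.get("/tmp/output.txt"))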
run.endRun()
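
If any logging call above throws, `endRun()` is never reached and the run is left open. A minimal sketch of one way to guard against that follows. It assumes the `/Shared/Quickstart` experiment already exists, the run name `guarded-run` is illustrative, and `RunStatus` comes from the client's generated protobuf classes in `org.mlflow.api.proto.Service`.

```scala
import org.mlflow.api.proto.Service.RunStatus
import org.mlflow.tracking.MlflowContext

val ctx = new MlflowContext()
ctx.setExperimentName("/Shared/Quickstart") // assumes the experiment exists

val guardedRun = ctx.startRun("guarded-run") // illustrative run name
// Assume failure until all logging calls have succeeded.
var status = RunStatus.FAILED
try {
  guardedRun.logParam("param1", "5")
  // Log one value of "foo" per step: 2.0, 4.0, 6.0 at steps 1, 2, 3.
  for (step <- 1 to 3) {
    guardedRun.logMetric("foo", 2.0 * step, step)
  }
  status = RunStatus.FINISHED
} finally {
  // Always end the run, recording whether the logging succeeded.
  guardedRun.endRun(status)
}
```

Ending the run in `finally` means a failed run shows up as FAILED in the tracking UI instead of remaining in the running state.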