MLflow quickstart: tracking
This notebook creates a Random Forest model on a simple dataset and uses the MLflow Tracking API to log the model, selected model parameters and evaluation metrics, and other artifacts.
This notebook does not use distributed processing, so you can use the R install.packages() function to install packages on the driver node only.
To take advantage of distributed processing, you must install packages on all nodes in the cluster by creating a cluster library. See Install a library on a cluster (AWS|Azure|GCP).
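For example, a minimal sketch of a driver-node-only install (the exact package list is an assumption: mlflow and carrier are named in this notebook, and randomForest is inferred from the Random Forest model it trains):

```r
# Install the packages this notebook uses on the driver node only.
# The package list below is an assumption, not taken verbatim from the notebook.
install.packages(c("mlflow", "carrier", "randomForest"))
```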
Import the required libraries.
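A minimal sketch of the imports (the exact set of libraries is an assumption):

```r
# Load the libraries used in this notebook.
library(mlflow)        # MLflow Tracking API
library(carrier)       # crate() for serializing the trained model's predict method
library(randomForest)  # assumed, since the notebook trains a Random Forest model
```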
This notebook uses the R library carrier to serialize the predict method of the trained model so that it can be loaded back into memory later. For more information, see the carrier GitHub repo.
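As a hedged sketch of that pattern (the variable names and the mlflow_log_model call are illustrative, not taken from the notebook):

```r
# Wrap the trained model's predict method in a self-contained function.
# carrier::crate() captures `model` explicitly, so the resulting function can be
# serialized, logged, and loaded back into memory later without the original session.
predictor <- carrier::crate(function(x) predict(model, x), model = rf)

# The crated function can then be logged as an MLflow model, for example:
# mlflow::mlflow_log_model(predictor, "model")
```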
To view the results, click the Experiments icon in the right sidebar. This sidebar displays the parameters and metrics for each run of this notebook.
Click the name of the run to open the Runs page in a new tab. This page shows all of the information that was logged from the run. Select the Artifacts tab to find the logged model and plot.
For more information, see "MLflow experiments" (AWS|Azure|GCP).
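The run information shown there is produced by Tracking API calls in the notebook. A minimal illustrative sketch of such calls (the parameter name, metric name, and file name below are assumptions):

```r
library(mlflow)

with(mlflow_start_run(), {
  # Log a hyperparameter (name and value are illustrative).
  mlflow_log_param("ntree", 100)

  # Log an evaluation metric (name and value are illustrative).
  mlflow_log_metric("rmse", 0.25)

  # Log an additional artifact, such as a plot saved to disk.
  # mlflow_log_artifact("importance-plot.png")
})
```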