%md ## Load data from one MLflow experiment
The example below looks up the experiment ID from the experiment name. Alternatively, copy the ID from the top-left corner of the MLflow UI.
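import org.mlflow.tracking.MlflowClient

// Look up the experiment by name and read its ID.
// Note: .get throws if no experiment with this name exists.
val mlflow = new MlflowClient()
val expId = mlflow.getExperimentByName("/Path/To/Experiment").get.getExperimentId

// Load the experiment's runs as a Spark DataFrame and show the first 10.
val df = spark.read.format("mlflow-experiment").load(expId)
display(df.limit(10))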
| run_id | experiment_id | metrics | params | tags | start_time | end_time | status | artifact_uri |
|---|---|---|---|---|---|---|---|---|
| c1c0e7d1bbc249c09835401155f4765f | 20140755 | {"loss": 0.6954526045029831, "test_error": 0.7800881082874644, "training_time": 7.561332101675567} | {"num_epochs": "86", "learning_rate": "0.20090270823367942", "conv_padding": "0", "momentum": "0.16225679924702008", "num_hidden": "20", "conv_channels": "2", "batch_size": "100"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:22.490+0000 | 2019-08-22T18:23:22.651+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/c1c0e7d1bbc249c09835401155f4765f/artifacts |
| 6791fc4412da4c20a5a3c9e183f8cfad | 20140755 | {"loss": 0.6366780869546103, "test_error": 0.8010344620776254, "training_time": 6.903556828202785} | {"num_epochs": "93", "learning_rate": "0.23238648558926767", "conv_padding": "2", "momentum": "0.20853432549128026", "num_hidden": "30", "conv_channels": "2", "batch_size": "10000"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:22.267+0000 | 2019-08-22T18:23:22.432+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/6791fc4412da4c20a5a3c9e183f8cfad/artifacts |
| bbdb5afb8cef44c786dac511304990d8 | 20140755 | {"loss": 1.0473492161968698, "test_error": 0.6082114972057182, "training_time": 11.584912050479517} | {"num_epochs": "51", "learning_rate": "0.14701821164614007", "conv_padding": "1", "momentum": "0.32928204989427123", "num_hidden": "1", "conv_channels": "3", "batch_size": "10000"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:22.051+0000 | 2019-08-22T18:23:22.215+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/bbdb5afb8cef44c786dac511304990d8/artifacts |
| f8c0d3e368284c4c86d2571a67a86af1 | 20140755 | {"loss": 0.9772858673761883, "test_error": 0.4047634614416513, "training_time": 13.206829052993097} | {"num_epochs": "50", "learning_rate": "0.17381878860827354", "conv_padding": "0", "momentum": "0.27182216848047513", "num_hidden": "20", "conv_channels": "2", "batch_size": "100"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:21.685+0000 | 2019-08-22T18:23:21.997+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/f8c0d3e368284c4c86d2571a67a86af1/artifacts |
| db23171bb2ca4de6bda6d29e2c51f200 | 20140755 | {"loss": 0.8065881821571024, "test_error": 0.5636053974983563, "training_time": 12.96408892700816} | {"num_epochs": "73", "learning_rate": "0.19520985500855376", "conv_padding": "1", "momentum": "0.30560359967617856", "num_hidden": "10", "conv_channels": "2", "batch_size": "1"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:21.465+0000 | 2019-08-22T18:23:21.632+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/db23171bb2ca4de6bda6d29e2c51f200/artifacts |
| 70e0ebc167e044fc86be33132255973d | 20140755 | {"loss": 1.0768625066529565, "test_error": 0.33629481893201657, "training_time": 13.053462462917436} | {"num_epochs": "49", "learning_rate": "0.3099356071899357", "conv_padding": "0", "momentum": "0.2563297563439655", "num_hidden": "30", "conv_channels": "2", "batch_size": "100"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:21.231+0000 | 2019-08-22T18:23:21.408+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/70e0ebc167e044fc86be33132255973d/artifacts |
| 02a8e08bd15d4f3ebb092c4d2cf374d0 | 20140755 | {"loss": 0.7416381919680847, "test_error": 0.46737939759620434, "training_time": 10.480126647592625} | {"num_epochs": "77", "learning_rate": "0.40534734198200545", "conv_padding": "2", "momentum": "0.286220326486637", "num_hidden": "5", "conv_channels": "3", "batch_size": "10000"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:21.014+0000 | 2019-08-22T18:23:21.181+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/02a8e08bd15d4f3ebb092c4d2cf374d0/artifacts |
| 951bc435e64a430386676deaa7959774 | 20140755 | {"loss": 2.2609429064642796, "test_error": 0.5676039482789375, "training_time": 12.223123654191227} | {"num_epochs": "15", "learning_rate": "0.29259139393996003", "conv_padding": "2", "momentum": "0.18125525309714324", "num_hidden": "20", "conv_channels": "3", "batch_size": "100"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:20.803+0000 | 2019-08-22T18:23:20.963+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/951bc435e64a430386676deaa7959774/artifacts |
| 7b490865ea0b4788a54c3d393c56babc | 20140755 | {"loss": 1.18167277597225, "test_error": 0.477520133333075, "training_time": 15.022051598654514} | {"num_epochs": "42", "learning_rate": "0.263710317384402", "conv_padding": "2", "momentum": "0.24384707391547691", "num_hidden": "30", "conv_channels": "1", "batch_size": "1"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:20.590+0000 | 2019-08-22T18:23:20.751+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/7b490865ea0b4788a54c3d393c56babc/artifacts |
| 3e906b3e1abc4bac97f6d425bef83caa | 20140755 | {"loss": 1.7239454714849651, "test_error": 0.12697243370781294, "training_time": 15.121500105742223} | {"num_epochs": "22", "learning_rate": "0.17395775389382317", "conv_padding": "0", "momentum": "0.44553782256021857", "num_hidden": "20", "conv_channels": "3", "batch_size": "100"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:20.370+0000 | 2019-08-22T18:23:20.540+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/3e906b3e1abc4bac97f6d425bef83caa/artifacts |
Showing all 10 rows.
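The name lookup above calls `.get` on the result, which fails with an exception when no experiment has that name. Below is a minimal sketch of a more defensive pattern, assuming (as the `.get` call suggests) that the lookup returns a `java.util.Optional`. The function `findExperimentId` is a hypothetical stand-in for the real client call, so the snippet runs without an MLflow server; the ID it returns is only a placeholder.

```scala
import java.util.Optional

// Hypothetical stand-in for the experiment lookup: returns an ID wrapped in
// an Optional when the name is known, and an empty Optional otherwise.
def findExperimentId(name: String): Optional[String] =
  if (name == "/Path/To/Experiment") Optional.of("20140755")
  else Optional.empty[String]()

// Check isPresent before calling get, and report a readable error
// instead of letting get throw on an empty Optional.
def experimentIdOrError(name: String): Either[String, String] = {
  val found = findExperimentId(name)
  if (found.isPresent) Right(found.get)
  else Left(s"No experiment named '$name'")
}

println(experimentIdOrError("/Path/To/Experiment"))
println(experimentIdOrError("/No/Such/Experiment"))
```

With the real client, the same `isPresent` check would go around the lookup result before the ID is passed to `load`.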
// Load runs from several experiments by passing a comma-separated list of experiment IDs.
val df = spark.read.format("mlflow-experiment").load("18747865,20140755")
display(df.limit(10))
| run_id | experiment_id | metrics | params | tags | start_time | end_time | status | artifact_uri |
|---|---|---|---|---|---|---|---|---|
| c1c0e7d1bbc249c09835401155f4765f | 20140755 | {"loss": 0.6954526045029831, "test_error": 0.7800881082874644, "training_time": 7.561332101675567} | {"num_epochs": "86", "learning_rate": "0.20090270823367942", "conv_padding": "0", "momentum": "0.16225679924702008", "num_hidden": "20", "conv_channels": "2", "batch_size": "100"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:22.490+0000 | 2019-08-22T18:23:22.651+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/c1c0e7d1bbc249c09835401155f4765f/artifacts |
| 6791fc4412da4c20a5a3c9e183f8cfad | 20140755 | {"loss": 0.6366780869546103, "test_error": 0.8010344620776254, "training_time": 6.903556828202785} | {"num_epochs": "93", "learning_rate": "0.23238648558926767", "conv_padding": "2", "momentum": "0.20853432549128026", "num_hidden": "30", "conv_channels": "2", "batch_size": "10000"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:22.267+0000 | 2019-08-22T18:23:22.432+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/6791fc4412da4c20a5a3c9e183f8cfad/artifacts |
| bbdb5afb8cef44c786dac511304990d8 | 20140755 | {"loss": 1.0473492161968698, "test_error": 0.6082114972057182, "training_time": 11.584912050479517} | {"num_epochs": "51", "learning_rate": "0.14701821164614007", "conv_padding": "1", "momentum": "0.32928204989427123", "num_hidden": "1", "conv_channels": "3", "batch_size": "10000"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:22.051+0000 | 2019-08-22T18:23:22.215+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/bbdb5afb8cef44c786dac511304990d8/artifacts |
| f8c0d3e368284c4c86d2571a67a86af1 | 20140755 | {"loss": 0.9772858673761883, "test_error": 0.4047634614416513, "training_time": 13.206829052993097} | {"num_epochs": "50", "learning_rate": "0.17381878860827354", "conv_padding": "0", "momentum": "0.27182216848047513", "num_hidden": "20", "conv_channels": "2", "batch_size": "100"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:21.685+0000 | 2019-08-22T18:23:21.997+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/f8c0d3e368284c4c86d2571a67a86af1/artifacts |
| db23171bb2ca4de6bda6d29e2c51f200 | 20140755 | {"loss": 0.8065881821571024, "test_error": 0.5636053974983563, "training_time": 12.96408892700816} | {"num_epochs": "73", "learning_rate": "0.19520985500855376", "conv_padding": "1", "momentum": "0.30560359967617856", "num_hidden": "10", "conv_channels": "2", "batch_size": "1"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:21.465+0000 | 2019-08-22T18:23:21.632+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/db23171bb2ca4de6bda6d29e2c51f200/artifacts |
| 70e0ebc167e044fc86be33132255973d | 20140755 | {"loss": 1.0768625066529565, "test_error": 0.33629481893201657, "training_time": 13.053462462917436} | {"num_epochs": "49", "learning_rate": "0.3099356071899357", "conv_padding": "0", "momentum": "0.2563297563439655", "num_hidden": "30", "conv_channels": "2", "batch_size": "100"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:21.231+0000 | 2019-08-22T18:23:21.408+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/70e0ebc167e044fc86be33132255973d/artifacts |
| 02a8e08bd15d4f3ebb092c4d2cf374d0 | 20140755 | {"loss": 0.7416381919680847, "test_error": 0.46737939759620434, "training_time": 10.480126647592625} | {"num_epochs": "77", "learning_rate": "0.40534734198200545", "conv_padding": "2", "momentum": "0.286220326486637", "num_hidden": "5", "conv_channels": "3", "batch_size": "10000"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:21.014+0000 | 2019-08-22T18:23:21.181+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/02a8e08bd15d4f3ebb092c4d2cf374d0/artifacts |
| 951bc435e64a430386676deaa7959774 | 20140755 | {"loss": 2.2609429064642796, "test_error": 0.5676039482789375, "training_time": 12.223123654191227} | {"num_epochs": "15", "learning_rate": "0.29259139393996003", "conv_padding": "2", "momentum": "0.18125525309714324", "num_hidden": "20", "conv_channels": "3", "batch_size": "100"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:20.803+0000 | 2019-08-22T18:23:20.963+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/951bc435e64a430386676deaa7959774/artifacts |
| 7b490865ea0b4788a54c3d393c56babc | 20140755 | {"loss": 1.18167277597225, "test_error": 0.477520133333075, "training_time": 15.022051598654514} | {"num_epochs": "42", "learning_rate": "0.263710317384402", "conv_padding": "2", "momentum": "0.24384707391547691", "num_hidden": "30", "conv_channels": "1", "batch_size": "1"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:20.590+0000 | 2019-08-22T18:23:20.751+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/7b490865ea0b4788a54c3d393c56babc/artifacts |
| 3e906b3e1abc4bac97f6d425bef83caa | 20140755 | {"loss": 1.7239454714849651, "test_error": 0.12697243370781294, "training_time": 15.121500105742223} | {"num_epochs": "22", "learning_rate": "0.17395775389382317", "conv_padding": "0", "momentum": "0.44553782256021857", "num_hidden": "20", "conv_channels": "3", "batch_size": "100"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:20.370+0000 | 2019-08-22T18:23:20.540+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/3e906b3e1abc4bac97f6d425bef83caa/artifacts |
Showing all 10 rows.
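The multi-experiment load above passes the experiment IDs as a single comma-separated string. When the IDs are held in a collection, that string can be built rather than typed by hand. The helper `experimentIdList` below is a hypothetical name used only for illustration; it is not part of MLflow or Spark.

```scala
// Hypothetical helper: joins experiment IDs into the comma-separated form
// used by the load call above. Blank entries and duplicates are dropped.
def experimentIdList(ids: Seq[String]): String =
  ids.map(_.trim).filter(_.nonEmpty).distinct.mkString(",")

val idList = experimentIdList(Seq("18747865", "20140755"))
println(idList) // 18747865,20140755

// In a notebook with a Spark session available, the result would be used as:
// val df = spark.read.format("mlflow-experiment").load(idList)
```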
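val df = spark.read.format("mlflow-experiment").load("20140755")
// Metric values are numeric, but param values are strings, so cast
// num_epochs to an integer before comparing it numerically.
val filtered_df = df.filter("metrics.loss < 1.85 AND CAST(params.num_epochs AS INT) > 30")
display(filtered_df.limit(10))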
| run_id | experiment_id | metrics | params | tags | start_time | end_time | status | artifact_uri |
|---|---|---|---|---|---|---|---|---|
| c1c0e7d1bbc249c09835401155f4765f | 20140755 | {"loss": 0.6954526045029831, "test_error": 0.7800881082874644, "training_time": 7.561332101675567} | {"num_epochs": "86", "learning_rate": "0.20090270823367942", "conv_padding": "0", "momentum": "0.16225679924702008", "num_hidden": "20", "conv_channels": "2", "batch_size": "100"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:22.490+0000 | 2019-08-22T18:23:22.651+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/c1c0e7d1bbc249c09835401155f4765f/artifacts |
| 6791fc4412da4c20a5a3c9e183f8cfad | 20140755 | {"loss": 0.6366780869546103, "test_error": 0.8010344620776254, "training_time": 6.903556828202785} | {"num_epochs": "93", "learning_rate": "0.23238648558926767", "conv_padding": "2", "momentum": "0.20853432549128026", "num_hidden": "30", "conv_channels": "2", "batch_size": "10000"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:22.267+0000 | 2019-08-22T18:23:22.432+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/6791fc4412da4c20a5a3c9e183f8cfad/artifacts |
| bbdb5afb8cef44c786dac511304990d8 | 20140755 | {"loss": 1.0473492161968698, "test_error": 0.6082114972057182, "training_time": 11.584912050479517} | {"num_epochs": "51", "learning_rate": "0.14701821164614007", "conv_padding": "1", "momentum": "0.32928204989427123", "num_hidden": "1", "conv_channels": "3", "batch_size": "10000"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:22.051+0000 | 2019-08-22T18:23:22.215+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/bbdb5afb8cef44c786dac511304990d8/artifacts |
| f8c0d3e368284c4c86d2571a67a86af1 | 20140755 | {"loss": 0.9772858673761883, "test_error": 0.4047634614416513, "training_time": 13.206829052993097} | {"num_epochs": "50", "learning_rate": "0.17381878860827354", "conv_padding": "0", "momentum": "0.27182216848047513", "num_hidden": "20", "conv_channels": "2", "batch_size": "100"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:21.685+0000 | 2019-08-22T18:23:21.997+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/f8c0d3e368284c4c86d2571a67a86af1/artifacts |
| db23171bb2ca4de6bda6d29e2c51f200 | 20140755 | {"loss": 0.8065881821571024, "test_error": 0.5636053974983563, "training_time": 12.96408892700816} | {"num_epochs": "73", "learning_rate": "0.19520985500855376", "conv_padding": "1", "momentum": "0.30560359967617856", "num_hidden": "10", "conv_channels": "2", "batch_size": "1"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:21.465+0000 | 2019-08-22T18:23:21.632+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/db23171bb2ca4de6bda6d29e2c51f200/artifacts |
| 70e0ebc167e044fc86be33132255973d | 20140755 | {"loss": 1.0768625066529565, "test_error": 0.33629481893201657, "training_time": 13.053462462917436} | {"num_epochs": "49", "learning_rate": "0.3099356071899357", "conv_padding": "0", "momentum": "0.2563297563439655", "num_hidden": "30", "conv_channels": "2", "batch_size": "100"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:21.231+0000 | 2019-08-22T18:23:21.408+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/70e0ebc167e044fc86be33132255973d/artifacts |
| 02a8e08bd15d4f3ebb092c4d2cf374d0 | 20140755 | {"loss": 0.7416381919680847, "test_error": 0.46737939759620434, "training_time": 10.480126647592625} | {"num_epochs": "77", "learning_rate": "0.40534734198200545", "conv_padding": "2", "momentum": "0.286220326486637", "num_hidden": "5", "conv_channels": "3", "batch_size": "10000"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:21.014+0000 | 2019-08-22T18:23:21.181+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/02a8e08bd15d4f3ebb092c4d2cf374d0/artifacts |
| 7b490865ea0b4788a54c3d393c56babc | 20140755 | {"loss": 1.18167277597225, "test_error": 0.477520133333075, "training_time": 15.022051598654514} | {"num_epochs": "42", "learning_rate": "0.263710317384402", "conv_padding": "2", "momentum": "0.24384707391547691", "num_hidden": "30", "conv_channels": "1", "batch_size": "1"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:20.590+0000 | 2019-08-22T18:23:20.751+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/7b490865ea0b4788a54c3d393c56babc/artifacts |
| f83979baa54e4b82a986db3d48a90164 | 20140755 | {"loss": 0.7286995736762022, "test_error": 1.8769657368556045, "training_time": 27.068653870516165} | {"num_epochs": "80", "learning_rate": "0.35057962220405914", "conv_padding": "0", "momentum": "0.4856573299997328", "num_hidden": "10", "conv_channels": "2", "batch_size": "1000"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:20.148+0000 | 2019-08-22T18:23:20.311+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/f83979baa54e4b82a986db3d48a90164/artifacts |
| 84e41eb1d04a43d38676877c9022a84d | 20140755 | {"loss": 0.6694215646095303, "test_error": 1.01865532175304, "training_time": 64.75969586188216} | {"num_epochs": "96", "learning_rate": "0.21796211702298549", "conv_padding": "0", "momentum": "0.4278636905195375", "num_hidden": "20", "conv_channels": "1", "batch_size": "1"} | {"mlflow.user": "max.allen@databricks.com"} | 2019-08-22T18:23:19.926+0000 | 2019-08-22T18:23:20.088+0000 | FINISHED | dbfs:/databricks/mlflow/20140755/84e41eb1d04a43d38676877c9022a84d/artifacts |
Showing all 10 rows.
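The `params` column stores every value as a string (for example `"num_epochs": "86"` in the output above), so a comparison against a quoted literal is lexicographic, while a numeric comparison needs a cast. The sketch below uses plain Scala strings with made-up values, not the experiment data, to show how the two can differ.

```scala
// Made-up parameter values, stored as strings like the entries in `params`.
val numEpochs = Seq("5", "30", "42", "100")

// String comparison, character by character: "5" sorts after "30",
// and "100" sorts before "30".
val byString = numEpochs.filter(_ > "30")

// Numeric comparison after converting each value to an Int.
val byNumber = numEpochs.filter(_.toInt > 30)

println(byString) // List(5, 42)
println(byNumber) // List(42, 100)
```

For values with the same number of digits the two comparisons agree, which is why a string comparison can appear to work on a sample of two-digit epoch counts.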
import org.apache.spark.sql.functions._

val df = spark.read.format("mlflow-experiment").load("20140755")
// For each num_epochs value, find the smallest loss across all runs.
val minLossPerEpoch = df.groupBy(col("params.num_epochs")).agg(min(col("metrics.loss")))
minLossPerEpoch.show()
+----------+------------------+
|num_epochs| min(metrics.loss)|
+----------+------------------+
|        51|1.0473492161968698|
|        15| 2.230135294657667|
|        54|0.9262751666347602|
|        29|1.4718298267827752|
|        42|1.1723375915001313|
|        73|0.8065881821571024|
|        87|0.6918733253361838|
|        30|1.4279705057530956|
|        34| 1.340595214397698|
|        22|1.7239454714849651|
|        35|1.2816841696707588|
|        98|0.6182645670274245|
|        96|0.6694215646095303|
|        18|1.9876813251266316|
|        70|0.8420887228892802|
|        61|0.8406349459880575|
|        27|1.5818347229694985|
|        75| 0.744592756836128|
|        17|2.0887763473927006|
|        26|1.6532247724800186|
+----------+------------------+
only showing top 20 rows
import org.apache.spark.sql.functions._
df: org.apache.spark.sql.DataFrame = [run_id: string, experiment_id: string ... 7 more fields]
minLossPerEpoch: org.apache.spark.sql.DataFrame = [num_epochs: string, min(metrics.loss): double]
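To make the aggregation above concrete, here is a minimal sketch of the same group-then-minimum logic on plain Scala collections, with no Spark session required. The loss values are taken from the outputs shown earlier; the names `runs` and `sortedByEpochs` are illustrative only. Because `num_epochs` is a string (see the schema line above), the sketch converts it to an integer before sorting.

```scala
// (num_epochs, loss) pairs; num_epochs is a string, as in the params column.
val runs = Seq(
  ("42", 1.18167277597225),
  ("42", 1.1723375915001313),
  ("15", 2.2609429064642796),
  ("15", 2.230135294657667),
  ("73", 0.8065881821571024)
)

// Group the runs by num_epochs and keep the smallest loss in each group.
val minLossPerEpoch: Map[String, Double] =
  runs.groupBy(_._1).map { case (epochs, group) =>
    epochs -> group.map(_._2).min
  }

// Sort by the numeric value of num_epochs for a readable listing.
val sortedByEpochs = minLossPerEpoch.toSeq.sortBy { case (epochs, _) => epochs.toInt }

sortedByEpochs.foreach { case (epochs, loss) => println(s"$epochs -> $loss") }
```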
%md ## Display Spark DataFrame in charts
Pass a DataFrame to the `display()` function to render its values as a table.
Then, use the charts button to choose the kind of chart that best visualizes your data.
For finer-grained control, click the **Plot Options...** button to select the data for each axis and any additional aggregation to apply.
%md ## Read from the MLflow experiment associated with the notebook
To check whether an experiment is associated with this notebook, click the **Runs** tab in the top-right corner of the notebook. If runs exist, an experiment is associated with the notebook.