MLflow system tables reference

Preview

The MLflow system tables are in Public Preview.

The mlflow system tables capture experiment metadata managed within the MLflow tracking service. These tables allow privileged users to take advantage of Databricks lakehouse tooling on their MLflow data across all workspaces within the region. You can use the tables to build custom AI/BI dashboards, set up SQL alerts, or perform large-scale analytical queries.

Through the mlflow system tables, users can answer questions like:

  • Which experiments have the lowest reliability?
  • What is the average GPU utilization across different experiments?
note

The mlflow system tables began recording MLflow data from all regions on September 2, 2025. Data from before that date may not be available.

Available tables

The mlflow schema includes the following tables:

  • system.mlflow.experiments_latest: Records experiment names and soft-deletion events. This data is similar to the experiments page in the MLflow UI.
  • system.mlflow.runs_latest: Records run-lifecycle information, the params and tags associated with each run, and aggregated stats of min, max, and latest values of all metrics. This data is similar to the runs search or runs detail page.
  • system.mlflow.run_metrics_history: Records the name, value, timestamp, and step of all metrics logged on runs, which can be used to plot detailed timeseries from runs. This data is similar to the metrics tab on the runs detail page.
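For example, the second question above could be answered from run_metrics_history, assuming GPU utilization is logged as a run metric (the metric name gpu_utilization below is illustrative, not a standard name):

SQL
SELECT
  experiment_id,
  AVG(metric_value) AS avg_gpu_utilization
FROM system.mlflow.run_metrics_history
WHERE metric_name = 'gpu_utilization'
GROUP BY experiment_id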

The following is an example of plotting run information using a dashboard:

run details dashboard

Table schemas

Below are the table schemas with descriptions and example data.

ER diagram

system.mlflow.experiments_latest

| Column name | Data type | Description | Example | Nullable |
| --- | --- | --- | --- | --- |
| account_id | string | The ID of the account containing the MLflow experiment | "bd59efba-4444-4444-443f-44444449203" | No |
| update_time | timestamp | The system time when the experiment was last updated | 2024-06-27T00:58:57.000+00:00 | No |
| delete_time | timestamp | The system time when the MLflow experiment was soft-deleted by the user | 2024-07-02T12:42:59.000+00:00 | Yes |
| experiment_id | string | The ID of the MLflow experiment | "2667956459304720" | No |
| workspace_id | string | The ID of the workspace containing the MLflow experiment | "6051921418418893" | No |
| name | string | User-provided name of the experiment | "/Users/first.last@databricks.com/myexperiment" | No |
| create_time | timestamp | The system time when the experiment was created | 2024-06-27T00:58:57.000+00:00 | No |

system.mlflow.runs_latest

| Column name | Data type | Description | Example | Nullable |
| --- | --- | --- | --- | --- |
| account_id | string | The ID of the account containing the MLflow run | "bd59efba-4444-4444-443f-44444449203" | No |
| update_time | timestamp | The system time when the run was last updated | 2024-06-27T00:58:57.000+00:00 | No |
| delete_time | timestamp | The system time when the MLflow run was soft-deleted by the user | 2024-07-02T12:42:59.000+00:00 | Yes |
| workspace_id | string | The ID of the workspace containing the MLflow run | "6051921418418893" | No |
| run_id | string | The ID of the MLflow run | "7716d750d279487c95f64a75bff2ad56" | No |
| experiment_id | string | The ID of the MLflow experiment containing the MLflow run | "2667956459304720" | No |
| created_by | string | The name of the Databricks principal or user that created the MLflow run | "<user>@<domain-name>" | Yes |
| start_time | timestamp | The user-specified time when the MLflow run started | 2024-06-27T00:58:57.000+00:00 | No |
| end_time | timestamp | The user-specified time when the MLflow run ended | 2024-07-02T12:42:59.000+00:00 | Yes |
| run_name | string | The name of the MLflow run | "wistful-deer-932", "my-xgboost-training-run" | No |
| status | string | The execution status of the MLflow run | "FINISHED" | No |
| params | map<string, string> | Key-value parameters of the MLflow run | {"n_layers": "5", "batch_size": "64", "optimizer": "Adam"} | No |
| tags | map<string, string> | Key-value tags set on the MLflow run | {"ready_for_review": "true"} | No |
| aggregated_metrics | list<struct<string, double, double, double>> | An aggregated view summarizing the metrics in run_metrics_history | [{"metric_name": "training_accuracy", "latest_value": 0.97, "min_value": 0.8, "max_value": 1.0}, ...] | No |
| aggregated_metrics.metric_name | string | The user-specified name of the metric | "training_accuracy" | No |
| aggregated_metrics.latest_value | double | The latest value of the metric_name in the time series of this (run, metric_name) combination in run_metrics_history | 0.97 | No |
| aggregated_metrics.max_value | double | The maximum value of the metric_name in the time series of this (run, metric_name) combination in run_metrics_history. If any NaN value was recorded for a metric, the value will be NaN | 1.0 | No |
| aggregated_metrics.min_value | double | The minimum value of the metric_name in the time series of this (run, metric_name) combination in run_metrics_history. If any NaN value was recorded for a metric, the value will be NaN | 0.8 | No |
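Because aggregated_metrics is an array of structs, you can flatten it to one row per metric before filtering or joining. A minimal sketch using Spark SQL's explode:

SQL
SELECT
  run_id,
  metric.metric_name,
  metric.latest_value
FROM system.mlflow.runs_latest
  LATERAL VIEW explode(aggregated_metrics) m AS metric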

system.mlflow.run_metrics_history

| Column name | Data type | Description | Example | Nullable |
| --- | --- | --- | --- | --- |
| account_id | string | The ID of the account containing the MLflow run to which the metric was logged | "bd59efba-4444-4444-443f-44444449203" | No |
| insert_time | timestamp | The system time when the metric was inserted | 2024-06-27T00:58:57.000+00:00 | No |
| record_id | string | A unique identifier of the metric to distinguish between identical values | "Ae1mDT5gFMSUwb+UUTuXMQ==" | No |
| workspace_id | string | The ID of the workspace containing the MLflow run to which the metric was logged | "6051921418418893" | No |
| experiment_id | string | The ID of the MLflow experiment containing the MLflow run to which the metric was logged | "2667956459304720" | No |
| run_id | string | The ID of the MLflow run to which the metric was logged | "7716d750d279487c95f64a75bff2ad56" | No |
| metric_name | string | The metric name | "training_accuracy" | No |
| metric_time | timestamp | The user-specified time when the metric was computed | 2024-06-27T00:55:54.1231+00:00 | No |
| metric_step | bigint | The step (for example, epoch) of model training or agent development at which the metric was logged | 10 | No |
| metric_value | double | The metric value | 0.97 | No |

Sharing access with users

By default, only account admins have access to system schemas. To give additional users access to the tables, an account admin must grant them the USE and SELECT permissions on the system.mlflow schema. See Unity Catalog privileges and securable objects.

Any user who has access to these tables can view metadata across all MLflow experiments for all workspaces in the account. To configure table access for a given group rather than individual users, see Unity Catalog best practices.

If you require finer-grained control than granting all users access to the table, you can use dynamic views with custom criteria to grant groups certain access. For example, you could create a view that only shows records from a particular set of experiment IDs. After configuring a custom view, give the name of the view to your users so that they can query the dynamic view rather than the system table directly.
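For example, a dynamic view along these lines could restrict a group to a specific set of experiments (the catalog, schema, view name, group name, and experiment ID below are placeholders):

SQL
CREATE VIEW main.default.mlflow_runs_restricted AS
SELECT *
FROM system.mlflow.runs_latest
WHERE
  is_account_group_member('ml-team')
  AND experiment_id IN ('2667956459304720');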

Example use cases of MLflow metadata

The following sections give examples of how you can use the MLflow system tables to answer questions about your MLflow experiments and runs.

Configure a SQL alert for low experiment reliability

Using Databricks SQL alerts (Beta), you can schedule a regularly recurring query and be notified if certain constraints are no longer met.

This example creates an alert that examines the most frequently run experiments within your workspace to determine whether they are experiencing low reliability and may need special attention. The query uses the runs_latest table to calculate the runs per experiment that are marked as finished, divided by the total number of runs.

note

The SQL alerts feature is currently in Beta. Legacy alerts remain available as well.

  1. Click Alerts in the sidebar and click Create Alert.

  2. Copy and paste the following query in the query editor.

    SQL
    SELECT
      experiment_id,
      AVG(CASE WHEN status = 'FINISHED' THEN 1.0 ELSE 0.0 END) AS success_ratio,
      COUNT(status) AS run_count
    FROM system.mlflow.runs_latest
    WHERE status IS NOT NULL
    GROUP BY experiment_id
    ORDER BY run_count DESC
    LIMIT 20;
  3. In the Condition field, set the conditions to MIN success_ratio < 0.9. This will trigger the alert if any of the top 20 experiments (by number of runs) has a success ratio less than 90%.

Additionally, you can test the condition, set a schedule, and configure notifications. For more information on configuring the alert, see setting up a SQL alert. Below is an example configuration using the query.

SQL alert configuration

Sample queries

You can use the following sample queries to get information about MLflow activity in your account using Databricks SQL. You can also leverage tools like Python notebooks with Spark.

Get run information from runs_latest

SQL
SELECT
  run_name,
  date(start_time) AS start_date,
  status,
  TIMESTAMPDIFF(MINUTE, start_time, end_time) AS run_length_minutes
FROM system.mlflow.runs_latest
WHERE
  experiment_id = :experiment_id
  AND run_id = :run_id
LIMIT 1

This returns information about the given run:

Query results run information

Get experiment and run information from experiments_latest and runs_latest

SQL
SELECT
  runs.run_name,
  experiments.name,
  date(runs.start_time) AS start_date,
  runs.status,
  TIMESTAMPDIFF(MINUTE, runs.start_time, runs.end_time) AS run_length_minutes
FROM system.mlflow.runs_latest runs
JOIN system.mlflow.experiments_latest experiments
  ON runs.experiment_id = experiments.experiment_id
WHERE
  runs.experiment_id = :experiment_id
  AND runs.run_id = :run_id
LIMIT 1

Get summary statistics for a given run from run_metrics_history

SQL
SELECT
  metric_name,
  count(metric_time) AS num_data_points,
  ROUND(avg(metric_value), 1) AS avg,
  ROUND(max(metric_value), 1) AS max,
  ROUND(min(metric_value), 1) AS min,
  ROUND(PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY metric_value), 1) AS pct_25,
  ROUND(PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY metric_value), 1) AS median,
  ROUND(PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY metric_value), 1) AS pct_75
FROM system.mlflow.run_metrics_history
WHERE run_id = :run_id
GROUP BY metric_name, run_id
LIMIT 100

This returns a summary of the metrics for the given run_id:

Query results run summary metrics
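To chart a metric's full time series for a run (the detailed plotting use case described for run_metrics_history earlier), a query of the following shape could feed a line chart; :metric_name is a query parameter like :run_id:

SQL
SELECT
  metric_step,
  metric_time,
  metric_value
FROM system.mlflow.run_metrics_history
WHERE
  run_id = :run_id
  AND metric_name = :metric_name
ORDER BY metric_step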

Dashboards for experiments and runs

You can build dashboards on top of MLflow system tables data to analyze your MLflow experiments and runs from the entire workspace.

For more details, see Build dashboards with MLflow metadata in system tables.