Query performance insights

Preview

This feature is in Private Preview. To try it, reach out to your Databricks contact.

When queries run, Databricks might return insights that identify opportunities to improve performance. This page lists the supported insights and explains what each one means.

For a broader overview of performance best practices, review the Comprehensive Guide to Optimize Databricks, Spark and Delta Lake Workloads.

CONCURRENT_WRITE

Concurrent writes to the table cause conflicts, which are either resolved automatically or cause a write to fail.
Recommendation: Review the Delta table history to identify concurrent writes, and consider rescheduling the writers to avoid conflicts.
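
One way to act on this recommendation is to look for commits from different writers that landed close together. The sketch below assumes the table history has already been exported into simple records, for example from DESCRIBE HISTORY. The Commit record layout, the find_close_writes helper, the 60-second window, and the sample rows are all illustrative assumptions, not part of the product.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Commit(NamedTuple):
    version: int
    timestamp: datetime
    writer: str  # hypothetical label for the job or user behind the commit


def find_close_writes(commits, window=timedelta(seconds=60)):
    """Return pairs of consecutive commits from different writers within `window`."""
    ordered = sorted(commits, key=lambda c: c.timestamp)
    pairs = []
    for earlier, later in zip(ordered, ordered[1:]):
        gap = later.timestamp - earlier.timestamp
        if earlier.writer != later.writer and gap <= window:
            pairs.append((earlier, later))
    return pairs


# Made-up sample history, for illustration only.
history = [
    Commit(10, datetime(2024, 1, 1, 2, 0, 0), "nightly_load"),
    Commit(11, datetime(2024, 1, 1, 2, 0, 20), "hourly_merge"),
    Commit(12, datetime(2024, 1, 1, 3, 0, 0), "hourly_merge"),
]
close = find_close_writes(history)
for earlier, later in close:
    print(f"v{earlier.version} ({earlier.writer}) and v{later.version} ({later.writer}) overlap")
```

Pairs reported this way are candidates for staggered schedules. Commit timestamps alone don't prove that two writes conflicted, so treat the output as a starting point.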

COVERAGE_FILTER_KEYS_CLUSTERING

The table is clustered by one or more keys that aren't used in filtering during the table scan.
Recommendation: Determine which data subset you need for the desired outcome, then add filters on matching clustering keys to reduce bytes read.
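
As a rough illustration of this check, the following sketch compares a table's clustering keys with the columns a query filters on and reports the keys that no filter uses. The uncovered_keys helper and the column names are hypothetical, and the case-insensitive comparison is an assumption.

```python
def uncovered_keys(layout_keys, filter_columns):
    """Return the layout keys that none of the query's filters reference."""
    used = {column.lower() for column in filter_columns}
    return [key for key in layout_keys if key.lower() not in used]


# Hypothetical table clustered by (event_date, region), queried with
# filters on region and user_id only.
missing = uncovered_keys(["event_date", "region"], ["Region", "user_id"])
print(missing)
```

In this example only event_date is reported, which suggests adding a filter on event_date if the query needs only a subset of dates.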

COVERAGE_FILTER_KEYS_PARTITIONING

The table is partitioned by one or more keys that aren't used in filtering during the table scan.
Recommendation: Determine which data subset you need for the desired outcome, then add filters on matching partitioning keys to reduce bytes read.
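
To see why filtering on a partitioning key reduces bytes read, consider this minimal sketch. It assumes made-up per-partition sizes and a hypothetical bytes_read helper. It models pruning as skipping whole partitions and does not reflect how Databricks measures scan volume.

```python
def bytes_read(partition_sizes, predicate=None):
    """Sum the sizes of the partitions kept by `predicate` (None means a full scan)."""
    return sum(
        size
        for key, size in partition_sizes.items()
        if predicate is None or predicate(key)
    )


# Made-up bytes per partition of a table partitioned by day.
sizes = {"2024-01-01": 400, "2024-01-02": 350, "2024-01-03": 250}

full_scan = bytes_read(sizes)
pruned = bytes_read(sizes, lambda day: day == "2024-01-03")
print(full_scan, pruned)
```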

COVERAGE_PHOTON

Photon couldn't accelerate the operation, so the standard runtime engine was used.
Recommendation: Review Photon limitations and consider adjusting the query to use a supported execution strategy so that it runs faster.

COVERAGE_STATS_DELTA

Delta data skipping statistics are missing or incomplete for the table scan file filters, so the query uses in-file filtering. The following statistics statuses are possible:
- Full: Statistics are available for all filters.
- Partial: Statistics are available on a subset of filters.
- Unavailable: Statistics are not available on any filter.
- Unused: Statistics could not be used on a filter that converts the data type.
Recommendation: Collect Delta statistics to reduce the number of bytes read.

COVERAGE_STATS_OPTIMIZER

Cost-based optimizer statistics are missing or incomplete, so standard heuristics were used to generate the query plan.
Recommendation: Collect statistics to enable the optimizer to produce a better plan.

DATA_SKEW

Data is distributed unevenly across the available compute resources, so some of them process far more data than others.
Recommendation: Review the distribution of the data, then salt keys or pre-aggregate the data.
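
The salting idea can be sketched in plain Python, without Spark. The example assumes a made-up key distribution in which one key ("hot") dominates, a hypothetical salt_key helper, and a salt derived from the row index so that the result is reproducible. A real job would more likely use a random salt.

```python
from collections import Counter


def salt_key(key, row_index, salt_buckets):
    """Pair the key with a salt so that one hot key becomes several smaller groups."""
    return (key, row_index % salt_buckets)


# Made-up data: 90 rows share one key, 10 rows are spread over five others.
rows = ["hot"] * 90 + ["a", "b", "c", "d", "e"] * 2

before = Counter(rows)
after = Counter(salt_key(key, i, 8) for i, key in enumerate(rows))

largest_before = max(before.values())
largest_after = max(after.values())

# Second stage: merge the partial counts back onto the original key.
final = Counter()
for (key, _salt), count in after.items():
    final[key] += count

print(largest_before, largest_after, final["hot"])
```

The largest group shrinks from 90 rows to 12, and the second stage recovers the original per-key totals. The cost is that extra aggregation step. For a salted join, the other side must also be replicated once per salt value.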

EXPLODING_JOIN

The join generates significantly more rows than it reads.
Recommendation: Determine which result subset is required, then update the join or reduce the number of input rows from both relations.
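
A quick way to reason about this insight is to count how many rows an inner equi-join produces from the key frequencies on each side. The sketch below uses made-up keys and a hypothetical join_output_rows helper. It is a counting model only, not a description of how the insight is computed.

```python
from collections import Counter


def join_output_rows(left_keys, right_keys):
    """Rows produced by an inner equi-join: sum over keys of left count * right count."""
    left, right = Counter(left_keys), Counter(right_keys)
    return sum(count * right[key] for key, count in left.items())


# Made-up inputs: key "x" is heavily duplicated on both sides.
left = ["x"] * 100 + ["y"] * 5
right = ["x"] * 50 + ["y"] * 1

rows_read = len(left) + len(right)
rows_out = join_output_rows(left, right)

# Deduplicating one side before the join removes the blow-up.
rows_out_dedup = join_output_rows(left, sorted(set(right)))

print(rows_read, rows_out, rows_out_dedup)
```

Here 156 input rows produce 5005 output rows, because the duplicated key multiplies on both sides. Reducing the right side to one row per key brings the output down to 105 rows.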

IO_THROTTLING

A cloud storage request was throttled by your cloud provider.
Recommendation: Contact your administrator to increase your cloud storage request limits with your cloud provider.
