Skip to main content

Power BI cheat sheet

This page provides clear and opinionated guidance for efficiently managing your data in Power BI and Databricks to optimize query performance and create efficient dashboards.

Connect Databricks and Power BI

Best practice

Impact

Docs

Use Power BI parameters when connecting to different Databricks environments

Allows flexibility when connecting to different Databricks workspaces or different Databricks SQL warehouses.

Use Databricks publish to Power BI service functionality

Enables seamless catalog integration and data model sync without leaving the Databricks UI.

Use Databricks Automatic Publishing to Power BI

Publish datasets from Unity Catalog to Power BI directly from data pipelines.

Choose the most appropriate storage mode

Best practice

Impact

Docs

Use DirectQuery for Fact tables and Dual for Dimension tables (not Import)

Generate more efficient SQL queries by using the most suitable storage mode.

Prefer DirectQuery over Import whenever possible

Allows you to maintain governance and audibility.

Use composite models for mixed storage modes

Allows mixed usage of DirectQuery, Dual, Import mode tables, and Aggregation and Hybrid tables.

Use hybrid tables for aggregated historical data with real-time data

Enables efficient in-memory queries.

Optimize data access

Best practice

Impact

Docs

Use user-defined aggregations

Improves query performance over large DirectQuery semantic models by caching pre-aggregated data.

Use automatic aggregations

Continuously optimizes DirectQuery semantic models by building aggregations based on Query History for maximum report performance.

Use table partitioning or incremental refresh

Allows importing data faster and managing larger datasets, especially for very small, static and performance sensitive (less than 2 seconds) reports.

Add Apply all slicers and Clear all slicers buttons

Prevents unnecessary queries by leveraging query reduction settings when users interact with report filters.

Use Assume referential integrity when defining table relations if referential integrity has been validated in the upstream ingestion

Enables more efficient join strategies in SQL queries.

Fine-tune your data model

Best practice

Impact

Docs

"Move left" transformations

SQL views leverage the power of the Databricks SQL engine for more efficient report execution compared to PowerQuery transformations and DAX formulas.

If you must use DAX formulas, optimize DAX formulas and avoid large result sets.

Prevents inefficient calculations that lead to deteriorated performance

Avoid DAX calculated columns and calculated tables in semantic models and define this data directly in your Gold tables

Precomputed measures perform best in the Gold layer

For DirectQuery, check for query parallelization configuration settings

Improves query parallelization and maximizes utilization of SQL warehouse to improve overall performance.

For DirectQuery, check how many queries Power BI can send in parallel to Databricks

Ensures the Databricks SQL warehouse is sized accordingly to handle the required level of parallelism to avoid queries being queued, resulting in a slow report.

Monitor performance and metrics

Best practice

Impact

Docs

Use Power BI Performance Analyzer to examine report element performance

Identifies the visualization that takes the most time to load and where the bottleneck is.

Evaluate the following properties of Power BI semantic models:

  • Maximum connections per data source
  • Maximum number of simultaneous evaluations
  • Maximum number of concurrent jobs
  • MaxParallelismPerQuery

Fine-tunes the performance of the model.