Power BI cheat sheet
This page provides clear, opinionated guidance for managing your data in Power BI and Databricks so that queries run fast and dashboards stay efficient.
Connect Databricks and Power BI
Best practice | Impact | Docs |
---|---|---|
Use Power BI parameters when connecting to different Databricks environments | Allows flexibility when connecting to different Databricks workspaces or different Databricks SQL warehouses. | |
Use the Databricks Publish to Power BI service feature | Enables seamless catalog integration and data model sync without leaving the Databricks UI. | |
Use Databricks Automatic Publishing to Power BI | Publishes datasets from Unity Catalog to Power BI directly from data pipelines. | |
Choose the most appropriate storage mode
Best practice | Impact | Docs |
---|---|---|
Use DirectQuery for Fact tables and Dual for Dimension tables (not Import) | Generate more efficient SQL queries by using the most suitable storage mode. | |
Prefer DirectQuery over Import whenever possible | Allows you to maintain governance and auditability. | |
Use composite models for mixed storage modes | Lets a single model combine DirectQuery, Dual, and Import mode tables with aggregation and hybrid tables. | |
Use hybrid tables to combine aggregated historical data with real-time data | Enables efficient in-memory queries. | |
Optimize data access
Best practice | Impact | Docs |
---|---|---|
Use user-defined aggregations | Improves query performance over large DirectQuery semantic models by caching pre-aggregated data. | |
Use automatic aggregations | Continuously optimizes DirectQuery semantic models by building aggregations based on Query History for maximum report performance. | |
Use table partitioning or incremental refresh | Imports data faster and makes larger datasets manageable, especially for very small, static, performance-sensitive reports (response times under 2 seconds). | |
Add Apply all slicers and Clear all slicers buttons | Prevents unnecessary queries by leveraging query reduction settings when users interact with report filters. | |
Use Assume referential integrity when defining table relationships, provided referential integrity has been validated in the upstream ingestion | Enables more efficient join strategies in SQL queries. | |
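The referential integrity validation mentioned in the last row can be sketched in a few lines. This is a minimal, illustrative check in plain Python, not a Databricks or Power BI API: the function name `check_referential_integrity` and the sample `sales` and `customers` rows are hypothetical. It assumes the condition to verify is that the fact-side key is never null and always has a match in the dimension table. In practice the same check would more likely run as a query in the ingestion pipeline.

```python
from typing import Any, Iterable, Mapping


def check_referential_integrity(
    fact_rows: Iterable[Mapping[str, Any]],
    dim_rows: Iterable[Mapping[str, Any]],
    fact_key: str,
    dim_key: str,
) -> dict:
    """Count null fact keys and list fact keys missing from the dimension."""
    dim_keys = {row[dim_key] for row in dim_rows}
    null_count = 0
    orphans = set()
    for row in fact_rows:
        value = row[fact_key]
        if value is None:
            null_count += 1        # null keys break the integrity assumption
        elif value not in dim_keys:
            orphans.add(value)     # fact key with no matching dimension row
    return {"null_count": null_count, "orphan_keys": sorted(orphans)}


# Hypothetical sample data: one orphan key (99) and one null key.
sales = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": 11},
    {"order_id": 3, "customer_id": 99},
    {"order_id": 4, "customer_id": None},
]
customers = [{"customer_id": 10}, {"customer_id": 11}]

result = check_referential_integrity(sales, customers, "customer_id", "customer_id")
print(result)  # {'null_count': 1, 'orphan_keys': [99]}
```

Only when both `null_count` is zero and `orphan_keys` is empty would it be reasonable to enable Assume referential integrity on the relationship.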
Fine-tune your data model
Best practice | Impact | Docs |
---|---|---|
"Move left" transformations by defining them in SQL views | SQL views leverage the power of the Databricks SQL engine for more efficient report execution compared to Power Query transformations and DAX formulas. | |
If you must use DAX formulas, optimize them and avoid large result sets | Prevents inefficient calculations that degrade performance. | |
Avoid DAX calculated columns and calculated tables in semantic models; define this data directly in your Gold tables instead | Precomputed data performs best when it is defined in the Gold layer. | |
For DirectQuery, check for query parallelization configuration settings | Improves query parallelization and maximizes utilization of SQL warehouse to improve overall performance. | |
For DirectQuery, check how many queries Power BI can send in parallel to Databricks | Ensures the Databricks SQL warehouse is sized to handle the required level of parallelism, so that queries are not queued and the report does not slow down. | |
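The sizing reasoning in the last two rows comes down to simple arithmetic. The sketch below is illustrative only: `size_warehouse` is a hypothetical helper, and all three inputs (the number of parallel queries Power BI sends, the number of queries one cluster handles concurrently, and the cluster limit) are assumptions you would supply from your own settings and monitoring, not values read from Power BI or Databricks.

```python
import math


def size_warehouse(parallel_queries: int, queries_per_cluster: int, max_clusters: int) -> dict:
    """Estimate clusters needed and how many queries would be queued.

    Every input is an assumption supplied by the caller.
    """
    if parallel_queries < 0 or queries_per_cluster <= 0 or max_clusters <= 0:
        raise ValueError("inputs must be positive (parallel_queries may be zero)")
    # Clusters required to run every parallel query without queuing.
    clusters_needed = math.ceil(parallel_queries / queries_per_cluster)
    # Queries that exceed what the configured cluster limit can run at once.
    capacity = max_clusters * queries_per_cluster
    queued = max(0, parallel_queries - capacity)
    return {"clusters_needed": clusters_needed, "queued_queries": queued}


# Hypothetical numbers: 30 parallel queries, 10 per cluster, limit of 2 clusters.
estimate = size_warehouse(parallel_queries=30, queries_per_cluster=10, max_clusters=2)
print(estimate)  # {'clusters_needed': 3, 'queued_queries': 10}
```

With these example inputs, the warehouse would need a third cluster to avoid queuing. Otherwise, either the cluster limit or the Power BI parallelism setting has to change.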
Monitor performance and metrics
Best practice | Impact | Docs |
---|---|---|
Use Power BI Performance Analyzer to examine report element performance | Identifies the visualization that takes the most time to load and where the bottleneck is. | |
Evaluate the following properties of Power BI semantic models: | Fine-tunes the performance of the model. | |