Skip to main content

Power BI cheat sheet

This page provides clear and opinionated guidance for efficiently managing your data in Power BI and SAP Databricks to optimize query performance and create efficient dashboards.

For a set of practical quickstarts demonstrating reference implementations of some of the best practices for using Power BI on SAP Databricks, see this repository.

Connect SAP Databricks and Power BI

Best practice

Impact

Docs

Use Power BI parameters when connecting to different SAP Databricks environments

Allows flexibility when connecting to different SAP Databricks workspaces or different SAP Databricks SQL warehouses.

Use SAP Databricks publish to Power BI service functionality

Enables seamless catalog integration and data model sync without leaving the SAP Databricks UI.

Choose the most appropriate storage mode

Best practice

Impact

Docs

Use DirectQuery for Fact tables and Dual for Dimension tables (not Import)

Generate more efficient SQL queries by using the most suitable storage mode.

Prefer DirectQuery over Import whenever possible

Allows you to maintain governance and audibility.

Use composite models for mixed storage modes

Allows mixed usage of DirectQuery, Dual, Import mode tables, and Aggregation and Hybrid tables.

Use hybrid tables for aggregated historical data with real-time data

Enables efficient in-memory queries.

Optimize data access

Best practice

Impact

Docs

Use user-defined aggregations

Improves query performance over large DirectQuery semantic models by caching pre-aggregated data.

Use automatic aggregations

Continuously optimizes DirectQuery semantic models by building aggregations based on Query History for maximum report performance.

Use table partitioning or incremental refresh

Allows importing data faster and managing larger datasets, especially for very small, static, and performance-sensitive (less than 2 seconds) reports.

Add Apply all slicers and Clear all slicers buttons

Prevents unnecessary queries by leveraging query reduction settings when users interact with report filters.

Use Assume referential integrity when defining table relations if referential integrity has been validated in the upstream ingestion

Enables more efficient join strategies in SQL queries.

For DirectQuery, check for query parallelization configuration settings and the following properties of Power BI semantic models:

  • Maximum connections per data source
  • Maximum number of simultaneous evaluations
  • Maximum number of concurrent jobs
  • MaxParallelismPerQuery

Improves query parallelization and maximizes utilization of SQL warehouse to improve overall performance.

Fine-tune your data model

Best practice

Impact

Docs

"Move left" transformations

Push core business logic closer to data sources so data is higher quality, faster, and cheaper to use. SQL views leverage the power of the Databricks SQL engine for more efficient report execution compared to PowerQuery transformations and DAX formulas.

If you must use DAX formulas, optimize DAX formulas and avoid large result sets.

Prevents inefficient calculations that lead to deteriorated performance

Leverage calendar-based time intelligence

DirectQuery semantic models can execute time intelligence calculations far more efficiently, unlocking faster and more scalable reporting

Monitor performance and metrics

Best practice

Impact

Docs

Use Power BI Performance Analyzer to examine report element performance

Identifies the visualization that takes the most time to load and where the bottleneck is.