Skip to main content

Data warehousing on Databricks

Databricks SQL is a cloud data warehouse built on lakehouse architecture. It runs directly on your data lake, supports ANSI SQL with Delta Lake extensions, and provides the tools to build highly performant, cost-effective data warehouses without moving your data.

Interfaces and tools

Databricks SQL runs on SQL warehouses and is accessible from multiple interfaces for querying, visualization, pipeline management, and automation.

    • SQL editor
    • Write and run SQL queries with integrated AI assistance, code comments, and version history.
    • Notebooks
    • Run SQL alongside Python, Scala, or R by attaching a notebook to a SQL warehouse.
    • AI/BI
    • Create AI-powered dashboards and Genie spaces for self-service data analysis and conversational data exploration.
    • Metric views
    • Define reusable business metrics with consistent calculations using a semantic layer.
    • Alerts
    • Monitor query results, evaluate conditions, and deliver notifications automatically.
    • Jobs
    • Schedule SQL queries for automated data processing and reporting workflows.
    • ETL
    • Define and refresh streaming tables and materialized views directly in Databricks SQL for incremental ETL pipelines.
    • REST API
    • Automate and manage Databricks SQL objects programmatically.

Monitor and optimize

    • Query history
    • Review past query runs, execution times, and resource usage across your warehouse.
    • Query profile
    • Inspect the execution plan for a query to identify bottlenecks and optimization opportunities.

Get started

If you're new to Databricks SQL, start with the concepts and then follow a hands-on walkthrough.

Reference