Databricks data engineering

Databricks data engineering features provide a robust environment for collaboration among data scientists, data engineers, and data analysts. Data engineering tasks are also the backbone of Databricks machine learning solutions.

Note

If you are a data analyst who works primarily with SQL queries and BI tools, you might prefer Databricks SQL.

The data engineering documentation provides how-to guidance to help you get the most out of the Databricks collaborative analytics platform. For getting-started tutorials and introductory information, see Get started: Account and workspace setup and What is Databricks?.

  • Delta Live Tables

    Learn how to build data pipelines for ingestion and transformation with Databricks Delta Live Tables.

  • Structured Streaming

    Learn about streaming, incremental, and real-time workloads powered by Structured Streaming on Databricks.

  • Apache Spark

    Learn how Apache Spark works on Databricks and the Databricks Lakehouse Platform.

  • Runtimes

    Learn about the types of Databricks runtimes and runtime contents.

  • Clusters

    Learn about Databricks clusters and how to create and manage them.

  • Notebooks

    Learn what a Databricks notebook is, and how to use and manage notebooks to process, analyze, and visualize your data.

  • Workflows

    Learn how to orchestrate data processing, machine learning, and data analysis workflows on the Databricks Lakehouse Platform.

  • Storage

    Learn how Databricks uses cloud object storage and block storage volumes for persistent and ephemeral data storage.

  • Libraries

    Learn how to make third-party or custom code available in Databricks using libraries. Learn about the different modes for installing libraries on Databricks.

  • Repos

    Learn how to use Git to version control your notebooks and other files for development in Databricks.

  • DBFS

    Learn about Databricks File System (DBFS), a distributed file system mounted into a Databricks workspace and available on Databricks clusters.

  • Files

    Learn about options for working with files on Databricks.

  • Migration

    Learn how to migrate data applications such as ETL jobs, enterprise data warehouses, ML, data science, and analytics to Databricks.

  • Optimization & performance

    Learn about optimizations and performance recommendations on Databricks.