This documentation site provides how-to guidance and reference information for Databricks and Apache Spark.
This section shows how to get started with Databricks.
This section provides information about the languages supported in Databricks notebooks and jobs: Python, R, Scala, and SQL.
This section provides an overview of the available Databricks runtimes.
This section shows how to use the Databricks workspace.
This section shows how to create and manage Databricks clusters.
This section shows how to use Databricks notebooks.
This section shows how to use Databricks jobs.
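A job is typically defined by a JSON settings object submitted to the Jobs API (`POST /api/2.0/jobs/create`). The sketch below builds such a payload with only the Python standard library; the job name, cluster sizing, runtime version, and notebook path are illustrative assumptions, not values prescribed by this guide.

```python
import json

# Illustrative job settings for the Databricks Jobs API (POST /api/2.0/jobs/create).
# All concrete values here (name, runtime version, node type, notebook path)
# are assumptions chosen for the example.
job_settings = {
    "name": "nightly-etl",                      # hypothetical job name
    "new_cluster": {
        "spark_version": "7.3.x-scala2.12",     # assumed Databricks Runtime version
        "node_type_id": "i3.xlarge",            # assumed AWS instance type
        "num_workers": 2,
    },
    "notebook_task": {
        "notebook_path": "/Users/someone@example.com/etl",  # hypothetical path
    },
}

# Serialize the settings for submission to the Jobs API.
payload = json.dumps(job_settings)
```

The serialized payload would be POSTed to your workspace's `/api/2.0/jobs/create` endpoint using a personal access token; the response contains the new job's `job_id`.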
This section shows how to make third-party or custom libraries available to notebooks and jobs running on your clusters.
This section shows how to work with data in Databricks.
This section shows how to connect third-party tools, such as business intelligence (BI) tools and partner data sources, to Databricks.
This section shows how to use Delta Lake on Databricks.
- Delta Lake
- Delta Lake quickstart
- Introductory notebooks
- Ingest data into Delta Lake
- Table batch reads and writes
- Table streaming reads and writes
- Table deletes, updates, and merges
- Table utility commands
- Table versioning
- API reference
- Concurrency control
- Migration guide
- Best practices
- Frequently asked questions (FAQ)
This section shows how Delta Engine optimizations make Delta Lake operations highly performant.
These sections provide information about Apache Spark, illustrated with Databricks examples.
- DataFrames and Datasets
- Structured Streaming
These sections provide information about machine learning features supported by Databricks.
- Machine learning and deep learning
- Graph analysis
This section provides information about genomics application support in Databricks.
Developer tools help you develop Databricks applications using the Databricks REST API, Databricks Utilities, the Databricks CLI, or tools outside the Databricks environment.
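Most of these developer tools ultimately call the Databricks REST API with a personal access token. A minimal sketch of constructing such a call using only the standard library (the workspace URL and token are placeholders; the request is built but deliberately not sent):

```python
import urllib.request

# Placeholder values -- substitute your own workspace URL and personal access token.
HOST = "https://example.cloud.databricks.com"
TOKEN = "dapiXXXXXXXXXXXX"  # hypothetical personal access token

# The Clusters API list endpoint (GET /api/2.0/clusters/list), with the token
# supplied as a bearer credential in the Authorization header.
req = urllib.request.Request(
    f"{HOST}/api/2.0/clusters/list",
    headers={"Authorization": f"Bearer {TOKEN}"},
)

# urllib.request.urlopen(req) would return a JSON body describing the
# workspace's clusters; here the request is only constructed, not sent.
```

The Databricks CLI and Databricks Utilities wrap the same endpoints, so this pattern is what those tools perform on your behalf.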
This section provides information about migrating workloads to Databricks.
These articles provide information about securing your Databricks infrastructure and data, and about ensuring that privacy requirements are met.
- Security and privacy
- Access control
- Secret management
- Credential passthrough
- Credential redaction
- Secure cluster connectivity
- Encrypt traffic between cluster worker nodes
- IP access lists
- Configure domain name firewall rules
- Best practices: GDPR and CCPA compliance using Delta Lake
- HIPAA-compliant deployment
- Best practices: Data governance on Databricks
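IP access lists, for example, admit a request only when the caller's address falls inside an approved CIDR range. The membership check itself is standard CIDR matching, sketched here with the standard library (the ranges are made-up examples, not Databricks defaults):

```python
import ipaddress

# Hypothetical allow list of office and VPN CIDR ranges -- example values only.
ALLOWED_CIDRS = ["203.0.113.0/24", "198.51.100.0/28"]

def is_allowed(client_ip: str) -> bool:
    """Return True if client_ip falls inside any allowed CIDR range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in ALLOWED_CIDRS)
```

With this allow list, `is_allowed("203.0.113.7")` returns `True`, while an address outside both ranges is rejected.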
This guide shows how to manage your Databricks account.
- Manage your Databricks account
- Create a new workspace using the Account API
- Use automation templates to create a new workspace using the Account API
- Access the Admin Console
- Manage users and groups
- Enable access control
- Manage workspace objects and behavior
- Manage cluster configuration options
- Manage AWS Infrastructure
- Plan capacity and control cost
This section provides information about new Databricks functionality.
This guide shows how to submit and manage support tickets and how to manage your Databricks Support contract.
- Log in to the Databricks Help Center
- Create a support case
- Update or respond to a support case
- Close a support case
- Update or respond to support cases opened by others
- Escalate a support case
- Update your profile
- The admin console
- Enable and create contacts (admin only)
- Set read and write access (admin only)
- Update your organization’s preferred timezone (admin only)
This guide provides information about submitting feature requests and other feedback using the Ideas Portal.
- Ideas Portal