Phase 7: Plan Infrastructure as Code approach
In this phase, you design your Infrastructure as Code (IaC) strategy to automate deployment and management of Databricks resources.
Infrastructure as Code tools are the recommended way of launching and managing Databricks workspaces. Each cloud has its own IaC tool, and there are also several cloud-agnostic tools.
Choose Infrastructure as Code tools
Terraform (recommended for lakehouse infrastructure)
Terraform is a third-party IaC tool with broad adoption across industries. It is the most popular third-party IaC tool and has an officially supported Terraform provider from Databricks. The Terraform provider documentation at terraform.io provides examples and guidance to help you get started quickly.
Terraform capabilities
- Manage Databricks account resources (for example, workspaces, networks, storage configurations).
- Manage workspace resources (for example, clusters, jobs, notebooks, Unity Catalog resources).
- Cloud-agnostic: Works across AWS, Azure, and GCP with the same language.
- Large ecosystem with modules and best practices.
- Official support from Databricks.
Terraform deployment patterns
- Account-level infrastructure: Workspaces, networks, storage credentials, metastores.
- Workspace-level configuration: Cluster policies, instance pools, workspace settings.
- Unity Catalog resources: Catalogs, schemas, external locations, storage credentials.
- Notebooks and repos: Deploy notebooks and Git repositories to workspaces.
- Secrets: Manage secrets in Databricks secret scopes.
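As an illustration of these patterns, a minimal sketch using the Databricks Terraform provider might look like the following; the host, policy name, and scope name are placeholders, not values from this guide:

```hcl
terraform {
  required_providers {
    databricks = {
      source = "databricks/databricks"
    }
  }
}

# Workspace-level provider configuration; the host is a placeholder.
provider "databricks" {
  host = "https://example.cloud.databricks.com"
}

# A cluster policy that caps autoscaling for a team's clusters.
resource "databricks_cluster_policy" "small_clusters" {
  name = "small-clusters"
  definition = jsonencode({
    "autoscale.max_workers" : { "type" : "range", "maxValue" : 4 }
  })
}

# A secret scope managed as code.
resource "databricks_secret_scope" "app" {
  name = "app-secrets"
}
```

The same provider also exposes account-level resources (workspaces, networks, storage configurations) when configured against the account console instead of a workspace.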
Best practices for Terraform
- Use Terraform modules to create reusable patterns for your organization.
- Store Terraform state in remote backends (for example, S3, Azure Blob Storage, GCS) with state locking.
- Use Terraform workspaces or separate state files to isolate environments (for example, dev, staging, prod).
- Use Terraform variables and tfvars files to parameterize configurations.
- Enable plan review in CI/CD pipelines before applying changes.
- Tag all resources consistently for cost attribution and governance.
- Document modules and maintain a module registry.
- Use the official Databricks Terraform provider examples as starting points.
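A minimal sketch of the remote-state and variable practices above, assuming an S3 backend; the bucket, key, and table names are placeholders:

```hcl
terraform {
  # Remote state in S3 with DynamoDB-based state locking.
  backend "s3" {
    bucket         = "my-org-terraform-state"
    key            = "databricks/prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-lock"
    encrypt        = true
  }
}

# Parameterize per environment via variables and tfvars files.
variable "environment" {
  type    = string
  default = "dev"
}
```

With a layout like this, the environment is selected at plan time, for example `terraform plan -var-file=prod.tfvars`.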
For Terraform provider documentation and examples, see the Databricks Terraform Registry.
Declarative Automation Bundles (recommended for data and AI resources)
Declarative Automation Bundles is a first-party infrastructure-as-code tool from Databricks that enables consistent packaging and deployment of data and AI resources. It is officially supported by Databricks and integrates natively with Databricks workspaces.
Declarative Automation Bundles capabilities
- Deploy jobs, pipelines, and notebooks.
- Deploy and manage ML models.
- Manage deployment across environments (for example, dev, staging, prod).
- Source control integration with Git.
- CI/CD integration with GitHub Actions, Azure DevOps, GitLab CI.
- Templating for common patterns.
Declarative Automation Bundles use cases
- Deploy data engineering workflows and Lakeflow Spark Declarative Pipelines.
- Deploy machine learning workflows and model serving endpoints.
- Deploy notebooks and SQL queries.
- Manage environment-specific configurations.
- Automate promotion from dev to staging to production.
Best practices for Declarative Automation Bundles
- Use Declarative Automation Bundles for data and AI workloads (for example, jobs, pipelines, models).
- Use Terraform for infrastructure resources (for example, workspaces, networks, Unity Catalog).
- Use Declarative Automation Bundles templates for common use cases.
- Integrate with CI/CD pipelines for automated deployments.
- Use environment-specific configurations for dev/staging/prod.
- Version control bundle definitions in Git.
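The practices above can be sketched as a single bundle definition; the bundle name, job, notebook path, and workspace host below are illustrative assumptions, not values from this guide:

```yaml
# databricks.yml - names and hosts are placeholders.
bundle:
  name: etl_pipelines

resources:
  jobs:
    nightly_etl:
      name: nightly-etl
      tasks:
        - task_key: ingest
          notebook_task:
            notebook_path: ./notebooks/ingest.py

targets:
  dev:
    mode: development
    default: true
  prod:
    mode: production
    workspace:
      host: https://prod.cloud.databricks.com
```

Deploying to a given environment then selects the matching target, for example `databricks bundle deploy -t prod`.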
For Declarative Automation Bundles documentation, see What are Declarative Automation Bundles?.
Cloud-native Infrastructure as Code tools
Each cloud provider offers native IaC tools for managing cloud resources. These tools can complement Terraform for cloud-specific infrastructure.
AWS CloudFormation
AWS uses AWS CloudFormation templates to automate the creation, update, and deletion of cloud resources such as S3 buckets, IAM roles, and VPCs.
CloudFormation capabilities
- Manage AWS infrastructure (for example, VPCs, S3 buckets, IAM roles).
- Define custom resources using AWS Lambda functions.
- Previously used to wrap Databricks Account API calls.
CloudFormation limitations
- CloudFormation was previously used to launch workspaces from the Databricks account console, but it is no longer the preferred method for launching a Databricks workspace.
- Use Terraform instead for Databricks workspace management.
Best practice: Use CloudFormation for AWS infrastructure (for example, VPCs, S3, IAM), but use Terraform for Databricks resources.
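As a sketch of this division of labor, a CloudFormation template for an AWS-side resource might look like the following; the bucket name is a placeholder:

```yaml
# Minimal CloudFormation template for AWS-side infrastructure only;
# Databricks resources themselves are managed with Terraform.
AWSTemplateFormatVersion: "2010-09-09"
Description: S3 root bucket for a Databricks workspace
Resources:
  WorkspaceRootBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: my-org-databricks-root
      VersioningConfiguration:
        Status: Enabled
```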
Design deployment approach
Subscription and account setup
Before you run any automation or manually create a workspace, subscribe to Databricks on your cloud provider.
AWS subscription
There are two ways to get started with Databricks on AWS:
You don't have an AWS account
Express setup lets you start using Databricks without prior cloud provider access. Sign up and start using Databricks right away. Express setup gives you a serverless workspace and free trial credits you can use to start exploring the Lakehouse platform. After your free trial credits are used up or expire, you cannot use Databricks until you upgrade by adding a payment method. You can then create additional workspaces in your account.
You want to use your existing AWS account
You can sign up for a trial of Databricks in two ways (where you sign up determines how billing is handled after the free trial ends):
- Sign up through Databricks: After your free trial credits are used up or expire, you must enter a payment method to continue using Databricks.
- Sign up through AWS Marketplace: After your free trial credits are used up or expire, you are billed by AWS and manage billing in your AWS console.
As part of onboarding after you subscribe, a first workspace is created for the trial.
Production deployments
After upgrading your subscription, you can use the Databricks account console to set up your Databricks account and create a workspace. For production setups, use automated installations such as Terraform.
Bootstrap with administrative workspace
Before executing any further automation or manually creating additional workspaces, build one workspace in each required region that is used solely for administrative purposes (no data platform user should have access to it).
Main reasons for building an administrative workspace
- Unity Catalog APIs: These are primarily workspace-level APIs; creating catalogs, external locations, and so on requires a workspace.
- Administrative tasks: Used for running dashboards built on system tables, running the Security Analysis Tool (SAT), and other administrative operations.
- Automation hub: Central location for running Terraform, Declarative Automation Bundles, and other automation tools.
Best practices for administrative workspaces
- Create one administrative workspace per region.
- Restrict access to platform administrators only.
- Use this workspace for Unity Catalog management and system table queries.
- Deploy automation tools (for example, Declarative Automation Bundles, Terraform runners) in this workspace.
- Document the purpose and access restrictions.
Design deployment patterns
Pattern 1: Workspace deployment pattern
Create Terraform modules for common workspace deployment patterns mapped to personas and use cases.
Example workspace patterns
- Data engineering workspace: Classic compute, customer-managed VPC, Lakeflow Spark Declarative Pipelines enabled.
- Analytics workspace: Serverless compute, SQL warehouses, BI integrations.
- ML workspace: GPU clusters, ML runtime, MLflow enabled.
- Production workspace: High availability, Private Link, customer-managed keys.
Benefits
- Consistent deployments across environments.
- Reduced configuration errors.
- Faster provisioning of new workspaces.
- Clear separation of concerns by persona.
Pattern 2: Environment promotion pattern
Create patterns for promoting workloads from development to staging to production.
Environment promotion workflow
- Development: Deploy to the dev workspace using Declarative Automation Bundles with dev configuration.
- Staging: Promote to the staging workspace and run integration tests.
- Production: Promote to the production workspace after approval.
Best practices
- Use Declarative Automation Bundles for environment-specific configurations.
- Automate promotion through CI/CD pipelines.
- Require approval gates for production deployments.
- Test in staging before production promotion.
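One way to wire this promotion into CI/CD is a pipeline that validates and deploys a bundle to the staging target on merge. The workflow below is a hypothetical GitHub Actions sketch; the secret name, branch, and target names are assumptions:

```yaml
# .github/workflows/promote.yml - hypothetical promotion workflow.
name: promote-to-staging
on:
  push:
    branches: [main]
jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Official action that installs the Databricks CLI.
      - uses: databricks/setup-cli@main
      # Validate the bundle, then deploy it to the staging target.
      - run: databricks bundle validate -t staging
      - run: databricks bundle deploy -t staging
        env:
          DATABRICKS_TOKEN: ${{ secrets.STAGING_TOKEN }}
```

A production job would mirror this with a `prod` target behind an approval gate.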
Pattern 3: Reusable modules pattern
Build a library of Terraform modules as building blocks for common patterns.
Example modules
- Workspace module: Deploys workspace with network, storage, and Unity Catalog attachment.
- Unity Catalog catalog module: Creates catalog with schemas and grants.
- Cluster policy module: Defines cluster policies for different teams or use cases.
- Network module: Creates VPC/VNet with subnets, NAT, and firewall rules.
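Consuming such modules then reduces each deployment to a short, declarative call. The module path, inputs, and tags below illustrate one possible interface rather than a prescribed one:

```hcl
# One possible interface for a reusable workspace module;
# the source path and input names are illustrative.
module "analytics_workspace" {
  source      = "./modules/workspace"
  name        = "analytics-prod"
  region      = "us-east-1"
  environment = "prod"

  # Consistent tags support cost attribution and governance.
  tags = {
    cost_center = "analytics"
    owner       = "data-platform"
  }
}
```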
Benefits
- Reduces code duplication.
- Establishes organizational standards.
- Simplifies maintenance and updates.
- Enables self-service for teams.
Infrastructure as Code recommendations
Recommended
- Use an IaC tool to launch workspaces and infrastructure wherever possible.
- Use Terraform for infrastructure resources (for example, workspaces, networks, Unity Catalog, storage).
- Use Declarative Automation Bundles for data and AI workloads (for example, jobs, pipelines, notebooks, models).
- Use existing patterns available for each cloud (Terraform provider examples).
- Build one administrative workspace in each required region.
- Create reusable Terraform modules mapped to personas and use cases.
- Store IaC state in remote backends with state locking.
- Integrate IaC deployments with CI/CD pipelines.
- Document deployment patterns and module usage.
- Use consistent tagging across all resources.
Avoid these patterns
- Do not manually create workspaces in production (use IaC for repeatability).
- Do not store Terraform state locally (use remote backends).
- Do not deploy without plan review (enable plan approval in CI/CD).
- Do not mix IaC and manual configuration (choose one approach).
- Avoid creating one-off "snowflake" workspaces with unique, undocumented configurations.
Phase 7 outcomes
After completing Phase 7, you should have:
- IaC tool strategy selected (Terraform for infrastructure, Declarative Automation Bundles for workloads).
- Cloud-native IaC tool usage defined (for example, CloudFormation, ARM/Bicep).
- Subscription and account setup approach planned.
- Administrative workspace design defined (one per region).
- Deployment patterns designed (for example, workspace patterns, environment promotion, reusable modules).
- Terraform module library planned (for example, workspace, Unity Catalog, network, cluster policy modules).
- CI/CD integration strategy defined for IaC deployments.
- Remote state management strategy designed (for example, S3, Azure Blob, GCS).
- Tagging and naming standards defined for IaC-managed resources.
Next phase: Phase 8: Design compute configuration
Implementation guidance: For step-by-step instructions to implement your IaC strategy, see Databricks Terraform provider and What are Declarative Automation Bundles?.