Migrate to the direct deployment engine
This feature is Experimental.
Databricks Asset Bundles was originally built on top of the Databricks Terraform provider to manage deployments. However, in an effort to move away from this dependency, Databricks CLI version 0.279.0 and above supports two different deployment engines: terraform and direct. The direct deployment engine does not depend on Terraform and will soon become the default. The Terraform deployment engine will eventually be deprecated.
What are the advantages of direct deployment?
The new direct deployment engine uses the Databricks Go SDK and provides the following benefits:
- No requirement to download Terraform and terraform-provider-databricks before deployment
- Avoids issues with firewalls, proxies, and custom provider registries
- Detailed diffs of changes available using bundle plan -o json
- Faster deployment
- Reduced time to release new bundle resources, because there is no need to align with the Terraform provider release
How do I start using direct deployment?
To start using the new direct deployment engine:
- For existing bundles, migrate them using databricks bundle deployment migrate.
- For new bundles, deploy them with the DATABRICKS_BUNDLE_ENGINE environment variable set to direct.
Migrate an existing bundle
The direct deployment engine uses its own JSON state file, whose schema differs from that of the Terraform JSON state file. The bundle deployment migrate command converts the Terraform state file (terraform.tfstate) to the direct deployment state file (resources.json). The command reads IDs from the existing deployment.
- Perform a full deployment with Terraform:

    databricks bundle deploy -t my_target

- Migrate the deployment:

    databricks bundle deployment migrate -t my_target

- Verify that the migration was successful. The databricks bundle plan command should succeed and it should show no changes.

    databricks bundle plan -t my_target

  - If the verification failed, remove the new state file:

      rm .databricks/bundle/my_target/resources.json

  - If the verification succeeded, deploy the bundle to synchronize the state file to the workspace:

      databricks bundle deploy -t my_target
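The migration steps above can be wrapped in a small shell function. This is a minimal sketch, not part of the CLI: the function name migrate_bundle is hypothetical, and the sketch checks only the exit status of databricks bundle plan. It does not parse the plan output, so you still need to confirm by eye that the plan reports no changes before running the final deploy.

```shell
# Sketch: run the migration steps for one target.
# Assumption: "migrate_bundle" is a hypothetical helper name, not a CLI feature.
migrate_bundle() {
  target="$1"
  state_file=".databricks/bundle/${target}/resources.json"

  # Step 1: full deployment with the Terraform engine.
  databricks bundle deploy -t "$target" || return 1

  # Step 2: convert terraform.tfstate to resources.json.
  databricks bundle deployment migrate -t "$target" || return 1

  # Step 3: verify. Only the exit status is checked here; the plan output is
  # printed so that you can confirm that it shows no changes.
  if databricks bundle plan -t "$target"; then
    echo "Plan succeeded for ${target}. If it shows no changes, run:"
    echo "  databricks bundle deploy -t ${target}"
  else
    echo "Plan failed for ${target}; removing ${state_file}" >&2
    rm -f "$state_file"
    return 1
  fi
}
```

Calling migrate_bundle my_target from the bundle root runs the three steps in order. The final deploy is left as a manual step so that a plan that succeeds but still lists changes is not deployed automatically.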
Direct deploy a new bundle
The bundle deployment migrate command does not work on bundles that have never been deployed because there is no state file to convert. Instead, set the DATABRICKS_BUNDLE_ENGINE environment variable to direct and deploy:
DATABRICKS_BUNDLE_ENGINE=direct databricks bundle deploy -t my_target
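The command above sets the variable for a single invocation. As an alternative, the variable can be exported once so that later commands in the same shell session inherit it. This sketch uses only standard shell behavior and assumes that every bundle you deploy from this session is meant to use the direct engine.

```shell
# Select the direct engine for all bundle commands started from this shell.
export DATABRICKS_BUNDLE_ENGINE=direct

# Commands run afterwards inherit the variable, for example:
#   databricks bundle deploy -t my_target
```

Run unset DATABRICKS_BUNDLE_ENGINE to clear the variable again.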
What are the changes in the direct deployment engine?
The new direct deployment engine mostly behaves the same as the Terraform deployment engine, but there are some differences.
Resource state diff calculation
Unlike Terraform, which maintains a single resource state (a mix of local configuration and remote state), the new engine keeps these separate and records only local configuration in its state file.
The resource state diff calculation is done in two steps:
- The local bundle configuration is compared to the snapshot configuration used for the most recent deployment. The remote state plays no role.
- The remote state is compared to the snapshot configuration used for the most recent deployment.
The result is that:
- databricks.yml resource changes are never ignored and will always trigger an update.
- Resource fields not handled by the implementation do not trigger an inconsistent result error. These resources are deployed successfully by the direct engine, but this can result in drift. The deployed resources are updated during the next plan or deploy.
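The two comparisons can be pictured with a toy model. The sketch below is an illustration only and is not how the engine stores or compares state: it assumes each of the three inputs (local configuration, last-deployment snapshot, remote state) is a plain text file, and the function name classify_change is hypothetical.

```shell
# Toy model of the two-step diff. Each argument is a file that holds one
# resource's configuration as text.
classify_change() {
  local_cfg="$1"   # what databricks.yml currently says
  snapshot="$2"    # configuration recorded at the most recent deployment
  remote="$3"      # what the workspace currently reports

  # Step 1: local configuration vs. snapshot. Remote state is not consulted.
  if ! cmp -s "$local_cfg" "$snapshot"; then
    echo "update: local configuration changed"
    return 0
  fi

  # Step 2: remote state vs. snapshot.
  if ! cmp -s "$remote" "$snapshot"; then
    echo "update: remote drift detected"
    return 0
  fi

  echo "no changes"
}
```

Because step 1 never reads the remote file, a local edit is reported as an update even when the workspace already happens to match it, which mirrors the first bullet above.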
$resources substitution lookup
The most common use of $resources is resolving substitution IDs (for example, $resources.jobs.my_job.id), which behaves identically in the Terraform and direct deployment engines.
However, the direct deployment engine resolves other $resources substitutions (for example, $resources.pipelines.my_pipeline.name) in two steps:
- References pointing to fields that are present in the local config are resolved to the value provided in the local config.
- References that are not present in the local config are resolved from the remote state. This is the state fetched using the appropriate GET request for a given resource.
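The lookup order described above can be sketched as a toy model. Nothing here reflects the engine's internals: the field=value file format and the function name resolve_field are made up for illustration, with one file standing in for the local config and one for the state returned by the GET request.

```shell
# Toy model of the lookup order. Each file holds "field=value" lines for one
# resource. The file format and function name are assumptions of this sketch.
resolve_field() {
  field="$1"
  local_cfg="$2"     # fields set in the local bundle configuration
  remote_state="$3"  # fields returned by the resource's GET request

  # Step 1: prefer the value from the local configuration.
  # Step 2: otherwise fall back to the remote state.
  for source_file in "$local_cfg" "$remote_state"; do
    while IFS='=' read -r key value; do
      if [ "$key" = "$field" ]; then
        printf '%s\n' "$value"
        return 0
      fi
    done < "$source_file"
  done

  echo "unresolved: $field" >&2
  return 1
}
```

A field present in both files resolves to the local value, so the remote copy is read only when the local config does not define the field.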
The schema that is used for $resources resolution is available in the file out.fields.txt. The fields marked as ALL or STATE can be used for local resolution. The fields marked as ALL or REMOTE can be used for remote resolution.