
Configure environment versions for pipelines

Beta

Environment versions for Lakeflow Spark Declarative Pipelines (SDP) are in Beta.

An environment version pins the Python language version and the set of preinstalled Python libraries available to your pipeline's Python code. Any external dependencies you add to the pipeline are layered on top of this base.

Environment versions decouple your pipeline's Python runtime from the Databricks Runtime version your pipeline runs on. While an environment version is set, Databricks Runtime upgrades don't change your Python language version or preinstalled library versions. The Python runtime is also consistent with serverless Jobs and notebooks that use the same environment version. To find the current Databricks Runtime version for Lakeflow Spark Declarative Pipelines, see Lakeflow Spark Declarative Pipelines release notes and the release upgrade process.
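
To see what a given environment version pins, you can print the interpreter version and the installed versions of a few libraries, for example from a serverless notebook that uses the same environment version. The following is a minimal sketch using only the Python standard library. The helper name `describe_python_runtime` and the package names passed to it are illustrative assumptions, not part of any Databricks API.

```python
import sys
from importlib import metadata


def describe_python_runtime(packages):
    """Return the interpreter version and installed versions of named packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "packages": versions,
    }


# Package names here are examples only; substitute the libraries you rely on.
print(describe_python_runtime(["pip", "simplejson"]))
```

Comparing this output before and after a Databricks Runtime upgrade is one way to confirm that the Python runtime did not change.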

Important

Pipelines with an environment version run Python code through Spark Connect. Spark Connect changes the behavior of pipeline code. Before enabling an environment version on an existing pipeline, see Environment version compatibility for limitations, behavior changes, the compatibility scan, and the migration workflow.

Requirements

Environment versions have the following requirements:

  • The pipeline must use Unity Catalog. Hive metastore pipelines are not supported.

Supported environment versions

SDP supports environment versions 3 and 4 on both serverless and classic compute. For the Python language version and the full list of preinstalled Python libraries available in each version, see the environment version reference.

Enable an environment version on a pipeline

You can configure an environment version through the pipeline editor UI, the Pipelines REST API, or Declarative Automation Bundles.

Remember to check compatibility with Spark Connect before enabling an environment version on a pipeline.

Enable through the UI

  1. From the pipeline editor, click Settings.
  2. Under Pipeline Environment, click the pencil icon (Edit environment).
  3. Select an environment version from the dropdown list.
  4. Save the pipeline settings.

External dependencies added in the Pipeline Environment section are layered on top of the libraries included with the selected environment version. See Manage Python dependencies for pipelines.

Enable through the API

The Pipelines REST API accepts an environment block on pipeline create and update. Personal Access Token authentication must be enabled for the workspace.

To create a pipeline with an environment version:

Shell
curl --request POST \
  --url 'https://<workspace-host>/api/2.0/pipelines' \
  --header 'Authorization: Bearer <personal-access-token>' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "name": "<pipeline-name>",
    "catalog": "<catalog>",
    "schema": "<schema>",
    "channel": "CURRENT",
    "environment": {
      "environment_version": "4",
      "dependencies": [
        "simplejson==3.19.*"
      ]
    }
  }'

To set the environment version on an existing pipeline, send the same environment block with PUT /api/2.0/pipelines/<pipeline-id>.
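
The update call can be sketched in Python with the standard library alone. This sketch assumes, and this page does not state it, that the update request should carry the pipeline's full settings and not just the environment block, so it merges the block into a copy of settings you retrieved beforehand. Check the Pipelines API reference before relying on that. The helper names `with_environment` and `put_pipeline` are hypothetical, and the placeholder values mirror the create example.

```python
import copy
import json
import urllib.request


def with_environment(spec, version, dependencies=None):
    """Return a copy of a pipeline spec with the environment block set."""
    updated = copy.deepcopy(spec)
    env = dict(updated.get("environment") or {})
    env["environment_version"] = str(version)  # the create example passes a string
    if dependencies is not None:
        env["dependencies"] = list(dependencies)
    updated["environment"] = env
    return updated


def put_pipeline(host, token, pipeline_id, spec):
    """Send the spec with PUT /api/2.0/pipelines/<pipeline-id> (not called here)."""
    request = urllib.request.Request(
        url=f"https://{host}/api/2.0/pipelines/{pipeline_id}",
        data=json.dumps(spec).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Placeholder settings; building the payload sends no request.
current = {"name": "<pipeline-name>", "catalog": "<catalog>", "schema": "<schema>"}
payload = with_environment(current, 4, ["simplejson==3.19.*"])
print(json.dumps(payload["environment"]))
```

Keeping the merge separate from the HTTP call lets you inspect the payload before sending it with `put_pipeline`.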

Enable through Declarative Automation Bundles

When you create a pipeline using Declarative Automation Bundles, you can set an environment version in the YAML definition of the pipeline.

  1. Make sure the Databricks CLI is version v0.294.0 or later. If it isn't, upgrade by following the installation guide.
  2. Set up a bundle by following the pipelines bundle tutorial.
  3. Locate the pipeline YAML in your bundle, typically <bundle-folder>/resources/<pipeline_name>_pipeline.yml.
  4. Set the environment_version and dependencies fields in the pipeline YAML:
YAML
resources:
  pipelines:
    my_pipeline:
      name: my_pipeline
      catalog: ${var.catalog}
      schema: ${var.schema}
      root_path: '../src/my_pipeline'
      libraries:
        - glob:
            include: ../src/my_pipeline/transformations/**
      environment:
        environment_version: 4
        dependencies:
          - --editable ${workspace.file_path}

Check the environment version on a pipeline

To check whether an environment version is configured on a pipeline:

  • UI: Open the pipeline settings and check the Pipeline Environment section, or inspect the JSON panel for the environment.environment_version field.
  • API: Call GET /api/2.0/pipelines/<pipeline-id> and look for environment.environment_version in the response.
  • Event log: Inspect the create_update event for the environment_version field.
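
For the API check, a small helper can pull the field out of the parsed response body. The sketch below is an illustration under assumptions: the function name `environment_version` is hypothetical, the sample body is made up for this example and not captured from a real workspace, and the lookup under a `spec` key is a guess at where the settings may be nested.

```python
import json


def environment_version(response):
    """Return environment.environment_version from a parsed response, or None."""
    # Assumption: the block may sit at the top level or under a "spec" key.
    for container in (response, response.get("spec") or {}):
        env = container.get("environment") or {}
        if "environment_version" in env:
            return env["environment_version"]
    return None


# Illustrative response body, not real API output.
sample = json.loads(
    '{"pipeline_id": "<pipeline-id>",'
    ' "spec": {"environment": {"environment_version": "4"}}}'
)
print(environment_version(sample))
print(environment_version({"pipeline_id": "<pipeline-id>"}))
```

A result of `None` means no environment version was found in the response.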

Disable the environment version on a pipeline

Remove the environment version through the Pipeline Environment section in pipeline settings, or remove the environment_version field from the environment block in the API or bundle definition.

When the environment version is removed, the pipeline returns to its previous Python runtime configuration.
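
When you disable the version through the API, removing the field from a settings payload can be sketched as follows. The helper name `without_environment_version` is hypothetical. The sketch only edits a dictionary and sends nothing. It leaves any `dependencies` entry in place, because the step above removes only the `environment_version` field.

```python
import copy


def without_environment_version(spec):
    """Return a copy of the spec with environment.environment_version removed."""
    updated = copy.deepcopy(spec)
    env = updated.get("environment")
    if isinstance(env, dict):
        env.pop("environment_version", None)  # no error if already absent
    return updated


# Placeholder settings mirroring the create example.
configured = {
    "name": "<pipeline-name>",
    "environment": {
        "environment_version": "4",
        "dependencies": ["simplejson==3.19.*"],
    },
}
print(without_environment_version(configured)["environment"])
```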
