Build a Python wheel file using Databricks Asset Bundles
This article describes how to build, deploy, and run a Python wheel file as part of a Databricks Asset Bundle project. See What are Databricks Asset Bundles?.
For an example configuration that builds a JAR and uploads it to Unity Catalog, see Bundle that uploads a JAR file to Unity Catalog.
Requirements
- Databricks CLI version 0.218.0 or above is installed, and authentication is configured. To check your installed version of the Databricks CLI, run the command `databricks -v`. To install the Databricks CLI, see Install or update the Databricks CLI. To configure authentication, see Configure access to your workspace.
- The remote workspace must have workspace files enabled. See What are workspace files?.
Create the bundle using a template
In these steps, you create the bundle using the Databricks default bundle template for Python. This bundle consists of files to build into a Python wheel file and the definition of a Databricks job to build this Python wheel file. You then validate, deploy, and build the deployed files into a Python wheel file from the Python wheel job within your Databricks workspace.
The Databricks default bundle template for Python uses `uv` to build the Python wheel file. To install `uv`, see Installing uv.
If you want to create a bundle from scratch, see Create a bundle manually.
Step 1: Create the bundle
A bundle contains the artifacts you want to deploy and the settings for the workflows you want to run.
- Use your terminal or command prompt to switch to a directory on your local development machine that will contain the template's generated bundle.

- Use the Databricks CLI to run the `bundle init` command:

  ```bash
  databricks bundle init
  ```

- For `Template to use`, leave the default value of `default-python` by pressing `Enter`.

- For `Unique name for this project`, leave the default value of `my_project`, or type a different value, and then press `Enter`. This determines the name of the root directory for this bundle. This root directory is created within your current working directory.

- For `Include a stub (sample) notebook`, select `no` and press `Enter`. This instructs the Databricks CLI to not add a sample notebook to your bundle.

- For `Include a stub (sample) Delta Live Tables pipeline`, select `no` and press `Enter`. This instructs the Databricks CLI to not define a sample pipeline in your bundle.

- For `Include a stub (sample) Python package`, leave the default value of `yes` by pressing `Enter`. This instructs the Databricks CLI to add sample Python wheel package files and related build instructions to your bundle.

- For `Use serverless`, select `yes` and press `Enter`. This instructs the Databricks CLI to configure your bundle to run on serverless compute.
Step 2: Explore the bundle
To view the files that the template generated, switch to the root directory of your newly created bundle and open this directory with your preferred IDE. Files of particular interest include the following:
- `databricks.yml`: This file specifies the bundle's name, specifies `whl` build settings, includes a reference to the job configuration file, and defines settings for target workspaces.
- `resources/<project-name>_job.yml`: This file specifies the Python wheel job's settings.
- `src/<project-name>`: This directory includes the files that the Python wheel job uses to build the Python wheel file.
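To see how these pieces fit together, here is a rough sketch of a `databricks.yml` with the elements described above. This is illustrative only; the names, build command, and settings in the file that the template actually generates differ:

```yaml
# Illustrative sketch only -- not the template's literal output.
bundle:
  name: my_project              # the bundle's name

artifacts:
  python_artifact:              # whl build settings
    type: whl
    build: uv build --wheel

include:
  - resources/*.yml             # pulls in resources/<project-name>_job.yml

targets:
  dev:                          # settings for a target workspace
    mode: development
    default: true
    workspace:
      host: https://<workspace-url>
```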
If you want to install the Python wheel file on a cluster with Databricks Runtime 12.2 LTS or below, you must add the following top-level mapping to the `databricks.yml` file:

```yaml
# Applies to all tasks of type python_wheel_task.
experimental:
  python_wheel_wrapper: true
```
Step 3: Validate the project's bundle configuration file
In this step, you check whether the bundle configuration is valid.
- From the root directory, use the Databricks CLI to run the `bundle validate` command, as follows:

  ```bash
  databricks bundle validate
  ```

- If a summary of the bundle configuration is returned, then the validation succeeded. If any errors are returned, fix the errors, and then repeat this step.
If you make any changes to your bundle after this step, you should repeat this step to check whether your bundle configuration is still valid.
Step 4: Build the Python wheel file and deploy the local project to the remote workspace
In this step, the Python wheel file is built and deployed to your remote Databricks workspace, and a Databricks job is created within your workspace.
- Use the Databricks CLI to run the `bundle deploy` command as follows:

  ```bash
  databricks bundle deploy -t dev
  ```

- To check whether the locally built Python wheel file was deployed:

  - In your Databricks workspace's sidebar, click Workspace.
  - Click into the following folder: Workspace > Users > `<your-username>` > .bundle > `<project-name>` > dev > artifacts > .internal > `<random-guid>`.

  The Python wheel file should be in this folder.

- To check whether the job was created:

  - In your Databricks workspace's sidebar, click Jobs & Pipelines.
  - Optionally, select the Jobs and Owned by me filters.
  - Click [dev `<your-username>`] `<project-name>`_job.
  - Click the Tasks tab.

  There should be one task: main_task.
If you make any changes to your bundle after this step, repeat steps 3-4 to check whether your bundle configuration is still valid and then redeploy the project.
Step 5: Run the deployed project
In this step, you run the Databricks job in your workspace.
- From the root directory, use the Databricks CLI to run the `bundle run` command, as follows, replacing `<project-name>` with the name of your project from Step 1:

  ```bash
  databricks bundle run -t dev <project-name>_job
  ```

- Copy the value of `Run URL` that appears in your terminal and paste this value into your web browser to open your Databricks workspace.

- In your Databricks workspace, after the task completes successfully and shows a green title bar, click the main_task task to see the results.
Build the whl using Poetry or setuptools
When you use `databricks bundle init` with the default-python template, the generated bundle shows how to build a Python wheel using `uv` and `pyproject.toml`. However, you may want to build the wheel with Poetry or `setuptools` instead.
Install Poetry or setuptools
- Install Poetry or `setuptools`:

  Poetry:

  - Install Poetry, version 1.6 or above, if it is not already installed. To check your installed version of Poetry, run the command `poetry -V` or `poetry --version`.
  - Make sure you have Python version 3.10 or above installed. To check your version of Python, run the command `python -V` or `python --version`.

  Setuptools:

  - Install the `wheel` and `setuptools` packages if they are not already installed, by running the following command:

    ```bash
    pip3 install --upgrade wheel setuptools
    ```
- If you intend to store this bundle with a Git provider, add a `.gitignore` file in the project's root, and add the following entries to this file:

  Poetry:

  ```
  .databricks
  dist
  ```

  Setuptools:

  ```
  .databricks
  build
  dist
  src/my_package/my_package.egg-info
  ```
Add build files
- In your bundle's root, create the following folders and files, depending on whether you use Poetry or `setuptools` for building Python wheel files:

  Poetry:

  ```
  ├── src
  │   └── my_package
  │       ├── __init__.py
  │       ├── main.py
  │       └── my_module.py
  └── pyproject.toml
  ```

  Setuptools:

  ```
  ├── src
  │   └── my_package
  │       ├── __init__.py
  │       ├── main.py
  │       └── my_module.py
  └── setup.py
  ```
- Add the following code to the `pyproject.toml` or `setup.py` file:

  Poetry (`pyproject.toml`):

  ```toml
  [tool.poetry]
  name = "my_package"
  version = "0.0.1"
  description = "<my-package-description>"
  authors = ["my-author-name <my-author-name>@<my-organization>"]

  [tool.poetry.dependencies]
  python = "^3.10"

  [build-system]
  requires = ["poetry-core"]
  build-backend = "poetry.core.masonry.api"

  [tool.poetry.scripts]
  main = "my_package.main:main"
  ```

  - Replace `my-author-name` with your organization's primary contact name.
  - Replace `my-author-name>@<my-organization` with your organization's primary email contact address.
  - Replace `<my-package-description>` with a display description for your Python wheel file.
  Setuptools (`setup.py`):

  ```python
  from setuptools import setup, find_packages

  setup(
      name="my_package",
      version="0.0.1",
      author="<my-author-name>",
      url="https://<my-url>",
      author_email="<my-author-name>@<my-organization>",
      description="<my-package-description>",
      packages=find_packages(where="./src"),
      package_dir={"": "src"},
      # Register the entry point under console_scripts so that the job's
      # entry_point setting can resolve "main" within my_package.
      entry_points={
          "console_scripts": [
              "main=my_package.main:main",
          ],
      },
      install_requires=[
          "setuptools",
      ],
  )
  ```

  - Replace `https://<my-url>` with your organization's URL.
  - Replace `<my-author-name>` with your organization's primary contact name.
  - Replace `<my-author-name>@<my-organization>` with your organization's primary email contact address.
  - Replace `<my-package-description>` with a display description for your Python wheel file.
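Both configurations register a `main` entry point that resolves to `my_package.main:main`, but neither shows that module. A minimal sketch of what `src/my_package/main.py` might contain follows; the function body is illustrative, so replace it with your package's real logic:

```python
# Hypothetical src/my_package/main.py -- the function that the
# "main = my_package.main:main" entry point resolves to.
def main():
    # Illustrative placeholder; put your package's real work here.
    message = "Hello from my_package"
    print(message)
    return message


if __name__ == "__main__":
    main()
```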
Add artifacts bundle configuration
- Add the `artifacts` mapping configuration to your `databricks.yml` to build the `whl` artifact:

  Poetry:

  This configuration runs the `poetry build` command and indicates that the path to the `pyproject.toml` file is in the same directory as the `databricks.yml` file.

  Note: If you have already built a Python wheel file and just want to deploy it, modify the following bundle configuration file by omitting the `artifacts` mapping. The Databricks CLI will then assume that the Python wheel file is already built and will automatically deploy the files that are specified in the `libraries` array's `whl` entries.

  ```yaml
  bundle:
    name: my-wheel-bundle

  artifacts:
    default:
      type: whl
      build: poetry build
      path: .

  resources:
    jobs:
      wheel-job:
        name: wheel-job
        tasks:
          - task_key: wheel-task
            new_cluster:
              spark_version: 13.3.x-scala2.12
              node_type_id: i3.xlarge
              data_security_mode: USER_ISOLATION
              num_workers: 1
            python_wheel_task:
              entry_point: main
              package_name: my_package
            libraries:
              - whl: ./dist/*.whl

  targets:
    dev:
      workspace:
        host: <workspace-url>
  ```

  Setuptools:

  This configuration runs `setup.py` with `setuptools` and indicates that the path to the `setup.py` file is in the same directory as the `databricks.yml` file.

  ```yaml
  bundle:
    name: my-wheel-bundle

  artifacts:
    default:
      type: whl
      build: python3 setup.py bdist_wheel
      path: .

  resources:
    jobs:
      wheel-job:
        name: wheel-job
        tasks:
          - task_key: wheel-task
            new_cluster:
              spark_version: 13.3.x-scala2.12
              node_type_id: i3.xlarge
              data_security_mode: USER_ISOLATION
              num_workers: 1
            python_wheel_task:
              entry_point: main
              package_name: my_package
            libraries:
              - whl: ./dist/*.whl

  targets:
    dev:
      workspace:
        host: <workspace-url>
  ```