This guide describes how to set up version control for notebooks with Bitbucket Cloud through the UI. You can also set up the integration through the Databricks CLI or the Workspace API.
- Configure version control
- Work with notebook revisions
- Best practice for code reviews
- Bitbucket Server
Configuring version control involves creating access credentials in your version control provider and adding those credentials to Databricks.
- Go to Bitbucket Cloud and create an app password that allows access to your repositories. See the Bitbucket Cloud documentation.
- Record the password. You enter this password in Databricks in the next step.
Click the User icon at the top right of your screen and select User Settings.
Click the Git Integration tab.
If you have previously entered credentials, click the Change token or app password button.
In the Git provider drop-down, select Bitbucket Cloud.
Paste your password and username into the respective fields and click Save.
You work with notebook revisions in the History panel. Open the History panel by clicking Revision history at the top right of the notebook.
You cannot modify a notebook while the History panel is open.
Changes that you make to your notebook are saved automatically to the Databricks revision history, but they do not automatically persist to Bitbucket Cloud.
Once you link a notebook, Databricks syncs your history with Git every time you re-open the History panel. Versions that sync to Git have commit hashes as part of the entry.
Open the History panel.
Choose an entry in the History panel. Databricks displays that version.
Click Restore this version.
Click Confirm to restore that version.
You can work on any branch of your repository and create new branches inside Databricks.
Open the History panel.
Click the Git status bar to open the Git panel.
Click the Branch dropdown.
Enter a branch name.
Select the Create Branch option at the bottom of the dropdown. The parent branch is indicated. You always branch from your currently selected branch.
Databricks supports Git branching.
- You can link a notebook to your own fork and choose a branch.
- We recommend using a separate branch for each notebook.
- Once you are happy with your changes, you can use the Create PR link in the Git Preferences dialog to take you to Bitbucket Cloud’s pull request page.
- The Create PR link displays only if you’re not working on the default branch of the parent repository.
Bitbucket Server integration is not supported. However, you can use the Workspace API to programmatically create notebooks and manage the code base in Bitbucket Server.
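As a rough illustration of the Workspace API route, the sketch below builds a request for the `POST /api/2.0/workspace/import` endpoint, which uploads one source file as a notebook. The function name `build_import_request`, the host, the token, and the paths are hypothetical placeholders. The sketch assumes you have already checked the file out of Bitbucket Server with your usual Git tooling and that you authenticate with a Databricks personal access token.

```python
import base64
import json
import urllib.request


def build_import_request(host, token, workspace_path, source_code, language="PYTHON"):
    """Build (but do not send) a Workspace API import request for one notebook.

    host, token, and workspace_path are placeholders supplied by the caller.
    """
    payload = {
        "path": workspace_path,      # target notebook path in the workspace
        "format": "SOURCE",          # upload plain source code
        "language": language,        # notebook language for SOURCE format
        # The API expects the file content base64-encoded.
        "content": base64.b64encode(source_code.encode("utf-8")).decode("ascii"),
        "overwrite": True,           # replace an existing notebook at that path
    }
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/workspace/import",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen`. The matching `GET /api/2.0/workspace/export` endpoint can be used in the other direction to pull notebook source out of the workspace before committing it to Bitbucket Server.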
If you receive errors related to Bitbucket Cloud history sync, verify the following:
- You have initialized the repository on Bitbucket Cloud, and it isn’t empty. Open the URL that you entered and verify that it leads to your Bitbucket Cloud repository.
- Your app password is active and your username is correct.
- If the repository is private, you have read and write access (through Bitbucket Cloud) to the repository.
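Before opening the repository URL in a browser, a quick local check can catch typos in it. The sketch below is only an illustration: the helper `parse_bitbucket_cloud_url` is hypothetical, and it assumes the link uses the usual `https://bitbucket.org/<workspace>/<repository>` form. It does not contact Bitbucket Cloud, so it cannot confirm that the repository exists, is non-empty, or that your app password works.

```python
from urllib.parse import urlparse


def parse_bitbucket_cloud_url(url):
    """Return (workspace, repository) for a Bitbucket Cloud URL, or raise ValueError."""
    parsed = urlparse(url.strip())
    # Assumption: the link is an HTTPS URL on bitbucket.org.
    if parsed.scheme != "https" or parsed.hostname != "bitbucket.org":
        raise ValueError(f"not an https://bitbucket.org URL: {url!r}")
    # Drop empty segments produced by leading or trailing slashes.
    parts = [segment for segment in parsed.path.split("/") if segment]
    if len(parts) < 2:
        raise ValueError(f"expected /<workspace>/<repository> in: {url!r}")
    workspace, repository = parts[0], parts[1]
    # Clone URLs often end in ".git"; strip it to get the repository slug.
    if repository.endswith(".git"):
        repository = repository[: -len(".git")]
    return workspace, repository
```

For example, `parse_bitbucket_cloud_url("https://bitbucket.org/acme/notebooks.git")` returns `("acme", "notebooks")`, while a URL on another host raises `ValueError`.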