This guide describes how to set up version control for notebooks using Bitbucket Cloud and Bitbucket Server through the UI.
Databricks recommends that you use Git integration with Databricks Repos to sync your work in Databricks with a remote Git repository.
Databricks Repos supports Bitbucket Server integration if the server is accessible from the internet.
To integrate with a Bitbucket Server instance that is not accessible from the internet, contact your Databricks representative.
By default, version control is enabled. To toggle this setting, see Manage the ability to version notebooks in Git. If Git versioning is disabled, the Git Integration tab is not visible in the User Settings screen.
Configuring version control involves creating access credentials in your version control provider and adding those credentials to Databricks.
Go to Bitbucket Cloud and create an app password that allows access to your repositories. See the Bitbucket Cloud documentation.
Record the password. You enter this password in Databricks in the next step.
Click Settings at the lower left of your screen and select User Settings.
Click the Git Integration tab.
If you have previously entered credentials, click the Change settings button.
In the Git provider drop-down, select Bitbucket Cloud.
Paste your app password into the App password field.
Enter your username into the Git provider username field and click Save.
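The same credentials can, in principle, be registered programmatically instead of through the UI. The following is a minimal sketch, not part of the procedure above. It assumes that the Databricks Git Credentials REST API (`POST /api/2.0/git-credentials`) is available in your workspace, that it accepts the provider value `bitbucketCloud`, and that the app password is passed in the `personal_access_token` field. The helper name, workspace URL, and token are hypothetical placeholders. The function only builds the request; it does not send it.

```python
import json
import urllib.request


def build_git_credential_request(workspace_url, databricks_token, username, app_password):
    """Build (but do not send) a request that registers Bitbucket Cloud credentials.

    Assumption: the endpoint and field names follow the Databricks
    Git Credentials API; verify them against your workspace's API reference.
    """
    payload = {
        "git_provider": "bitbucketCloud",        # assumed provider identifier
        "git_username": username,                # Bitbucket Cloud username
        "personal_access_token": app_password,   # the app password created earlier
    }
    return urllib.request.Request(
        url=workspace_url.rstrip("/") + "/api/2.0/git-credentials",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + databricks_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To submit the request, pass the returned object to `urllib.request.urlopen`. Keep the app password out of source control, for example by reading it from an environment variable.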
You work with notebook revisions in the History panel. Open the History panel by clicking Revision history at the top right of the notebook.
You cannot modify a notebook while the History panel is open.
Changes that you make to a notebook are saved automatically to the Databricks revision history, but they are not automatically persisted to Bitbucket Cloud.
Open the History panel.
Click Save Now to save your notebook to Bitbucket Cloud. The Save Notebook Revision dialog appears.
Optionally, enter a message to describe your change.
Make sure that Also commit to Git is selected.
Once you link a notebook, Databricks syncs your history with Git every time you re-open the History panel. Versions that sync to Git have commit hashes as part of the entry.
Open the History panel.
Choose an entry in the History panel. Databricks displays that version.
Click Restore this version.
Click Confirm to restore that version.
Databricks supports Git branching.
You can link a notebook to your own fork and choose a branch.
Databricks recommends using a separate branch for each notebook.
When your changes are ready, use the Create PR link in the Git Preferences dialog to open Bitbucket Cloud’s pull request page.
The Create PR link displays only if you’re not working on the default branch of the parent repository.
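The rule for when the link appears can be illustrated with a short sketch. This is not how Databricks builds the link. The function name and the `default_branch` parameter are hypothetical, and the URL pattern (`/pull-requests/new?source=<branch>`) is an assumption based on the link that Bitbucket Cloud commonly shows after a push.

```python
from urllib.parse import quote


def create_pr_url(workspace, repo_slug, current_branch, default_branch="main"):
    """Return a Bitbucket Cloud 'new pull request' URL, or None on the default branch.

    Assumption: the URL pattern below matches Bitbucket Cloud's
    pull request creation page.
    """
    # No pull request link is offered when working on the default branch.
    if current_branch == default_branch:
        return None
    return (
        "https://bitbucket.org/"
        + quote(workspace, safe="") + "/" + quote(repo_slug, safe="")
        + "/pull-requests/new?source=" + quote(current_branch, safe="")
    )
```

For example, `create_pr_url("acme", "notebooks", "feature/etl")` yields a URL whose `source` parameter is the percent-encoded branch name, while the same call with `"main"` returns `None`.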
If you receive errors related to Bitbucket Cloud history sync, verify the following:
You have initialized the repository on Bitbucket Cloud, and it isn’t empty. Try the URL that you entered and verify that it forwards to your Bitbucket Cloud repository.
Your app password is active and your username is correct.
If the repository is private, you must have read and write access to it, granted in Bitbucket Cloud.
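Some of these checks can be scripted outside Databricks. The sketch below assumes the Bitbucket Cloud REST API 2.0 repository endpoint (`https://api.bitbucket.org/2.0/repositories/{workspace}/{repo_slug}`) and HTTP Basic authentication with your username and app password. The helper names are hypothetical. The code parses the repository URL and builds the request without sending it.

```python
import base64
import urllib.request
from urllib.parse import urlparse


def parse_bitbucket_repo_url(repo_url):
    """Split a Bitbucket Cloud repository URL into (workspace, repo_slug)."""
    parsed = urlparse(repo_url)
    if parsed.hostname != "bitbucket.org":
        raise ValueError("not a bitbucket.org URL: " + repo_url)
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError("expected /<workspace>/<repository> in: " + repo_url)
    slug = parts[1]
    if slug.endswith(".git"):
        slug = slug[:-4]  # drop the clone-URL suffix
    return parts[0], slug


def build_repo_check_request(repo_url, username, app_password):
    """Build (but do not send) an authenticated GET for the repository's metadata."""
    workspace, slug = parse_bitbucket_repo_url(repo_url)
    token = base64.b64encode((username + ":" + app_password).encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        "https://api.bitbucket.org/2.0/repositories/" + workspace + "/" + slug,
        headers={"Authorization": "Basic " + token},
    )
```

Sending the request with `urllib.request.urlopen` and receiving a successful response suggests that the URL, username, and app password are valid and that you have at least read access. An authentication or not-found error points to the credentials, the permissions, or the URL. This check does not confirm write access or that the repository is non-empty.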