sync command group

Note

This information applies to Databricks CLI versions 0.200 and above, which are in Public Preview. To find your version of the Databricks CLI, run databricks -v.

Also, note that the sync command group can synchronize file changes from a local development machine only to workspace user (/Users) files or to Databricks Repos (/Repos) in your Databricks workspace. It cannot synchronize to DBFS (dbfs:/) files. To synchronize file changes from a local development machine to DBFS (dbfs:/) in your Databricks workspace, use the dbx sync utility.

The sync command group within the Databricks CLI enables one-way synchronization of file changes from a local filesystem directory to a directory within a remote Databricks workspace.

Note

sync commands cannot synchronize file changes from a directory within a remote Databricks workspace back to a directory within a local filesystem.

You run sync commands by appending them to databricks sync. To display help for the sync command, run databricks sync -h.

Important

Before you use the Databricks CLI, be sure to set up the Databricks CLI and set up authentication for the Databricks CLI.

Incrementally sync local file changes to a remote directory

To perform a single, incremental, one-way synchronization of file changes from a local filesystem directory to a directory within a remote Databricks workspace, run the sync command as follows:

databricks sync <local-directory-path> <remote-directory-path>

For example, to do a one-time, one-way, incremental synchronization of all file changes in the folder named my-folder within the local current working directory, to a specific path within the remote workspace, run the following command:

databricks sync ./my-folder/ /Users/someone@example.com/

In this example, only file changes since the last run of the sync command are synchronized to /Users/someone@example.com/. By default, the workspace URL within the caller’s DEFAULT profile is used to determine the remote workspace to sync to.
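Because sync accepts only workspace user (/Users) and Databricks Repos (/Repos) targets, it can help to check the arguments before calling the CLI. The following is a minimal sketch, not part of the Databricks CLI: the function name checked_sync and the DATABRICKS_CLI override variable are hypothetical names introduced here for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight wrapper around `databricks sync`.
# checked_sync and DATABRICKS_CLI are illustrative names, not CLI features.
checked_sync() {
  local local_dir="$1" remote_dir="$2"

  # The local source must be an existing directory.
  if [ ! -d "$local_dir" ]; then
    echo "error: local directory not found: $local_dir" >&2
    return 1
  fi

  # Only /Users and /Repos targets are supported; DBFS paths are rejected.
  case "$remote_dir" in
    /Users/*|/Repos/*) ;;
    *)
      echo "error: remote path must be under /Users/ or /Repos/: $remote_dir" >&2
      return 1
      ;;
  esac

  # Any remaining arguments (for example --full) are passed through unchanged.
  shift 2
  "${DATABRICKS_CLI:-databricks}" sync "$local_dir" "$remote_dir" "$@"
}
```

With this function loaded, checked_sync ./my-folder/ /Users/someone@example.com/ behaves like the plain command above, while a dbfs:/ target or a missing local folder fails early with an error message instead of reaching the CLI.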

Fully sync local file changes to a remote directory

To perform a single, full, one-way synchronization of file changes within a local filesystem directory to a directory within a remote Databricks workspace, regardless of when the last sync command was run, use the --full option, for example:

databricks sync ./my-folder/ /Users/someone@example.com/ --full

Continuously sync local file changes to a remote directory

To turn on continuous, one-way synchronization of file changes from a local filesystem directory to a directory within a remote Databricks workspace, use the --watch option, for example:

databricks sync ./my-folder/ /Users/someone@example.com/ --watch

One-way synchronization continues until the command is stopped from the terminal, typically by pressing Ctrl + c. Note that in most Unix-like shells, Ctrl + z only suspends the command rather than ending it.

Polling for possible synchronization events happens once per second by default. To change this interval, use the --interval option with the number of seconds between polls, followed by the character s. For example, to poll every five seconds:

databricks sync ./my-folder/ /Users/someone@example.com/ --watch --interval 5s
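When the polling interval comes from a script variable, a small helper can build the value in the form described above (a number of seconds followed by s). This is a sketch under stated assumptions: interval_arg is a hypothetical name, and limiting the input to whole seconds is a choice made for this example, not a documented restriction of the CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a whole number of seconds into the
# "<seconds>s" form, rejecting anything that is not all digits.
interval_arg() {
  case "$1" in
    ''|*[!0-9]*)
      echo "error: interval must be a whole number of seconds: $1" >&2
      return 1
      ;;
  esac
  printf '%ss\n' "$1"
}

# Example (not run here): poll every 5 seconds.
# databricks sync ./my-folder/ /Users/someone@example.com/ --watch --interval "$(interval_arg 5)"
```

Validating the value up front means a typo such as an empty or non-numeric variable is reported by the script itself rather than passed on to the sync command.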

Change the sync progress output format

Sync progress information is output to the terminal in text format by default. To change the format, use the --output option with either text (the default) or json, for example:

databricks sync ./my-folder/ /Users/someone@example.com/ --output json
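JSON output is mainly useful when another tool reads the progress information. One possible way to capture it is to append it to a log file, as in the sketch below. It assumes that progress is written to standard output. The names sync_with_json_log and DATABRICKS_CLI are hypothetical and introduced only for this example, and nothing here depends on the exact structure of the JSON records.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run one sync and append its JSON progress to a log file.
# sync_with_json_log and DATABRICKS_CLI are illustrative names, not CLI features.
sync_with_json_log() {
  local log_file="$1"
  shift

  # Remaining arguments are the usual sync arguments (paths and options).
  # Only standard output is redirected; error messages still reach the terminal.
  "${DATABRICKS_CLI:-databricks}" sync "$@" --output json >> "$log_file"
}

# Example (not run here):
# sync_with_json_log ./sync-progress.log ./my-folder/ /Users/someone@example.com/
```

Appending (>>) rather than overwriting keeps the progress from earlier runs in the same file.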