sync command group
Note
This information applies to Databricks CLI versions 0.205 and above, which are in Public Preview. To find your version of the Databricks CLI, run databricks -v.
Also, note that the sync command group can synchronize file changes from a local development machine only to workspace user (/Users) files or to Databricks Repos (/Repos) in your Databricks workspace. It cannot synchronize to DBFS (dbfs:/) files. To synchronize file changes from a local development machine to DBFS (dbfs:/) in your Databricks workspace, use the dbx sync utility.
The sync command group within the Databricks CLI enables one-way synchronization of file changes from a local filesystem directory to a directory within a remote Databricks workspace.
Note
sync commands cannot synchronize file changes from a directory within a remote Databricks workspace back to a directory within a local filesystem.
You run sync commands by appending them to databricks sync. To display help for the sync command, run databricks sync -h.
Important
Before you use the Databricks CLI, be sure to set up the Databricks CLI and set up authentication for the Databricks CLI.
Incrementally sync local file changes to a remote directory
To perform a single, incremental, one-way synchronization of file changes from a local filesystem directory to a directory within a remote Databricks workspace, run the sync command as follows:
databricks sync <local-directory-path> <remote-directory-path>
For example, to do a one-time, one-way, incremental synchronization of all file changes in the folder named my-folder within the local current working directory to a specific path within the remote workspace, run the following command:
databricks sync ./my-folder/ /Users/someone@example.com/
In this example, only file changes since the last run of the sync command are synchronized to /Users/someone@example.com/. By default, the workspace URL within the caller’s DEFAULT profile is used to determine the remote workspace to sync to.
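Because sync can target only /Users and /Repos paths, it can help to reject other remote paths before calling the CLI. The following is a minimal sketch, not part of the Databricks CLI: the function name safe_sync and the DATABRICKS_BIN override variable are hypothetical and exist only for illustration.

```shell
# Hypothetical wrapper (bash): validate the remote path, then run a one-time sync.
# DATABRICKS_BIN is an assumed override; set it to "echo" to preview the command.
safe_sync() {
  local local_dir="$1" remote_dir="$2"
  shift 2
  case "$remote_dir" in
    /Users/*|/Repos/*) ;;  # the only remote locations sync supports
    *)
      echo "error: remote path must start with /Users/ or /Repos/: $remote_dir" >&2
      return 2
      ;;
  esac
  # Any extra arguments (for example --full) are passed through unchanged.
  "${DATABRICKS_BIN:-databricks}" sync "$local_dir" "$remote_dir" "$@"
}
```

For example, safe_sync ./my-folder/ /Users/someone@example.com/ runs the same command as above, while a dbfs:/ target fails fast with an error instead of reaching the CLI.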
Fully sync local file changes to a remote directory
To perform a single, full, one-way synchronization of all files in a local filesystem directory to a directory within a remote Databricks workspace, regardless of when the sync command was last run, use the --full option, for example:
databricks sync ./my-folder/ /Users/someone@example.com/ --full
Continuously sync local file changes to a remote directory
To turn on continuous, one-way synchronization of file changes from a local filesystem directory to a directory within a remote Databricks workspace, use the --watch option, for example:
databricks sync ./my-folder/ /Users/someone@example.com/ --watch
One-way synchronization continues until the command is stopped from the terminal, typically by pressing Ctrl + c. (In most Unix shells, Ctrl + z only suspends the process rather than ending it.)
Polling for possible synchronization events happens once per second by default. To change this interval, use the --interval option with the number of seconds followed by the character s. For example, to poll every five seconds:
databricks sync ./my-folder/ /Users/someone@example.com/ --watch --interval 5s
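The interval format described above (a number of seconds followed by s) can be assembled from a plain integer. The sketch below assumes a hypothetical helper named watch_sync and a hypothetical DATABRICKS_BIN override; neither is provided by the Databricks CLI.

```shell
# Hypothetical helper (bash): start a continuous sync with a polling interval
# given as a whole number of seconds. Defaults to 1, matching the CLI default.
# DATABRICKS_BIN is an assumed override; set it to "echo" to preview the command.
watch_sync() {
  local local_dir="$1" remote_dir="$2" seconds="${3:-1}"
  case "$seconds" in
    *[!0-9]*)
      echo "error: interval must be a whole number of seconds: $seconds" >&2
      return 2
      ;;
  esac
  # Append the "s" suffix that the --interval option expects.
  "${DATABRICKS_BIN:-databricks}" sync "$local_dir" "$remote_dir" --watch --interval "${seconds}s"
}
```

Calling watch_sync ./my-folder/ /Users/someone@example.com/ 5 is equivalent to the --interval 5s command shown above.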
Change the sync progress output format
Sync progress information is output to the terminal in text format by default. To specify the sync progress output format, use the --output option with either text (the default) or json, for example:
databricks sync ./my-folder/ /Users/someone@example.com/ --output json
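JSON output is mainly useful when another tool reads the progress information. As a minimal sketch, the following hypothetical sync_json_log function shows the output on the terminal and also appends it to a log file. The function name, the log file argument, and the DATABRICKS_BIN override are assumptions for illustration; the structure of the JSON itself is not described on this page, so the sketch does not parse it.

```shell
# Hypothetical helper (bash): run a sync with JSON progress output and keep a copy.
# DATABRICKS_BIN is an assumed override; set it to "echo" to preview the command.
sync_json_log() {
  local local_dir="$1" remote_dir="$2" log_file="$3"
  # tee -a prints each line to the terminal and appends it to the log file.
  "${DATABRICKS_BIN:-databricks}" sync "$local_dir" "$remote_dir" --output json \
    | tee -a "$log_file"
}
```

For example, sync_json_log ./my-folder/ /Users/someone@example.com/ sync.log leaves the progress records in sync.log for later inspection. Note that the function returns the exit status of tee, not of the sync command, unless the shell's pipefail option is enabled.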