sync command group
Note
This information applies to Databricks CLI versions 0.205 and above, which are in Public Preview. To find your version of the Databricks CLI, run databricks -v.
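The version requirement above can be checked from a script. The following is a minimal sketch, not an official check: the helper name cli_version_ok is hypothetical, and it assumes that databricks -v prints a string containing a version number such as "Databricks CLI v0.205.0".

```shell
#!/bin/sh
# Sketch: check that the installed Databricks CLI is version 0.205 or above.
# Assumption: `databricks -v` prints text that contains MAJOR.MINOR.PATCH,
# for example "Databricks CLI v0.205.0".

# cli_version_ok VERSION_STRING (hypothetical helper)
# Succeeds if the first MAJOR.MINOR found in the string is >= 0.205.
cli_version_ok() {
  # Pull the first "digits.digits" run out of the string.
  _version=$(printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+' | head -n 1)
  [ -n "$_version" ] || return 1
  _major=${_version%%.*}
  _minor=${_version#*.}
  # Any 1.x or later release passes; within 0.x the minor must be >= 205.
  [ "$_major" -gt 0 ] || [ "$_minor" -ge 205 ]
}

# Only query the CLI when it is actually installed.
if command -v databricks >/dev/null 2>&1; then
  if cli_version_ok "$(databricks -v)"; then
    echo "Databricks CLI is version 0.205 or above"
  else
    echo "Databricks CLI is older than 0.205" >&2
  fi
fi
```

The comparison looks only at the major and minor components, which is enough to distinguish the 0.205 threshold.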
Also, note that the sync command group can synchronize file changes from a local development machine only to workspace user (/Users) files in your Databricks workspace. It cannot synchronize to DBFS (dbfs:/) files. To synchronize file changes from a local development machine to DBFS (dbfs:/) in your Databricks workspace, use the dbx sync utility.
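Because only workspace user (/Users) paths are valid targets according to the note above, a script can reject other targets before calling the CLI. The following is a minimal sketch under that assumption. The function names is_workspace_user_path and safe_sync are hypothetical and are not part of the Databricks CLI.

```shell
#!/bin/sh
# Sketch: refuse to run `databricks sync` against a target outside /Users/.

# is_workspace_user_path PATH (hypothetical helper)
# Succeeds only for paths of the form /Users/<something>.
is_workspace_user_path() {
  case "$1" in
    /Users/?*) return 0 ;;
    *) return 1 ;;
  esac
}

# safe_sync LOCAL_DIR REMOTE_DIR (hypothetical wrapper)
# Calls `databricks sync` only when REMOTE_DIR is under /Users/.
safe_sync() {
  if ! is_workspace_user_path "$2"; then
    echo "refusing to sync: '$2' is not under /Users/" >&2
    return 1
  fi
  databricks sync "$1" "$2"
}

# Usage (not executed here):
#   safe_sync ./my-folder/ /Users/someone@example.com/
```

A dbfs:/ target fails the check immediately, before any call to the workspace is made.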
The sync command group within the Databricks CLI enables one-way synchronization of file changes from a local filesystem directory to a directory within a remote Databricks workspace.
Note
sync commands cannot synchronize file changes from a directory within a remote Databricks workspace back to a directory within a local filesystem.
You run sync commands by appending them to databricks sync. To display help for the sync command, run databricks sync -h.
Important
To install the Databricks CLI, see Install or update the Databricks CLI. To configure authentication for the Databricks CLI, see Authentication for the Databricks CLI.
Incrementally sync local file changes to a remote directory
To perform a single, incremental, one-way synchronization of file changes from a local filesystem directory to a directory within a remote Databricks workspace, run the sync command as follows:
databricks sync <local-directory-path> <remote-directory-path>
For example, to do a one-time, one-way, incremental synchronization of all file changes in the folder named my-folder within the local current working directory to a specific path within the remote workspace, run the following command:
databricks sync ./my-folder/ /Users/someone@example.com/
In this example, only file changes since the last run of the sync command are synchronized to /Users/someone@example.com/. By default, the workspace URL within the caller’s DEFAULT profile is used to determine the remote workspace to sync to.
Fully sync local file changes to a remote directory
To perform a single, full, one-way synchronization of the files in a local filesystem directory to a directory within a remote Databricks workspace, regardless of when the sync command was last run, use the --full option, for example:
databricks sync ./my-folder/ /Users/someone@example.com/ --full
Continuously sync local file changes to a remote directory
To turn on continuous, one-way synchronization of file changes from a local filesystem directory to a directory within a remote Databricks workspace, use the --watch option, for example:
databricks sync ./my-folder/ /Users/someone@example.com/ --watch
One-way synchronization continues until the command is stopped from the terminal, typically by pressing Ctrl + c. Note that in most Unix-like shells, Ctrl + z only suspends the command rather than ending it.
Polling for possible synchronization events happens once per second by default. To change this interval, use the --interval option with the number of seconds between polls followed by the character s. For example, to poll every five seconds:
databricks sync ./my-folder/ /Users/someone@example.com/ --watch --interval 5s
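When the interval comes from a variable or a script argument, it is easy to forget the trailing s. The following is a minimal sketch of a helper that appends it. The name interval_arg is hypothetical, and the restriction to whole numbers of seconds is a choice made for this sketch, not a documented limit of the CLI.

```shell
#!/bin/sh
# Sketch: turn a whole number of seconds into the "<n>s" form used with --interval.

# interval_arg SECONDS (hypothetical helper)
# Prints SECONDS followed by "s", or fails if SECONDS is not made up of digits only.
interval_arg() {
  case "$1" in
    ''|*[!0-9]*)
      echo "interval must be a whole number of seconds, got '$1'" >&2
      return 1
      ;;
  esac
  printf '%ss\n' "$1"
}

# Usage (not executed here):
#   databricks sync ./my-folder/ /Users/someone@example.com/ --watch --interval "$(interval_arg 5)"
```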
Change the sync progress output format
Sync progress information is output to the terminal in text format by default. To specify the sync progress output format, use the --output option, specifying either text (the default, if --output is not otherwise specified) or json, for example:
databricks sync ./my-folder/ /Users/someone@example.com/ --output json