sync command

note

This information applies to Databricks CLI versions 0.205 and above. The Databricks CLI is in Public Preview.

Databricks CLI use is subject to the Databricks License and Databricks Privacy Notice, including any Usage Data provisions.

The sync command group within the Databricks CLI enables one-way synchronization of local code and file changes in a directory on your local development machine to a folder in your remote Databricks workspace.

note
  • sync cannot synchronize file changes from a folder in a remote Databricks workspace back to a directory on your local development machine.
  • sync can synchronize file changes from a local development machine only to workspace user (/Users) files in your Databricks workspace. It cannot synchronize to DBFS (dbfs:/) files. To synchronize file changes from a local development machine to DBFS (dbfs:/) in your Databricks workspace, use the dbx sync utility.

databricks sync

Synchronize a local directory to a workspace directory.

databricks sync [flags] SRC DST

Arguments

SRC

    The source directory path

DST

    The destination directory path

Options

--dry-run

    Simulate sync execution without making actual changes

--exclude strings

    Patterns to exclude from sync (can be specified multiple times)

--exclude-from string

    File containing patterns to exclude from sync (one pattern per line)

--full

    Perform full synchronization (default is incremental)

--include strings

    Patterns to include in sync (can be specified multiple times)

--include-from string

    File containing patterns to include in sync (one pattern per line)

--interval duration

    File system polling interval (for --watch) (default 1s)

--watch

    Watch local file system for changes

The flags listed in the Global flags section also apply to this command.

Examples

The following sections show how to use the sync command.

Incrementally sync local file changes to a remote directory

To perform a single, incremental, one-way synchronization of file changes within a local directory to a folder in a remote Databricks workspace, run the sync command, as follows:

Bash
databricks sync <local-directory-path> <remote-directory-path>

For example, to do a one-time, one-way, incremental synchronization of all file changes in the folder named my-local-folder in the local current working directory, to the folder my-workspace-folder in the remote workspace, run the following command:

Bash
databricks sync ./my-local-folder /Users/someone@example.com/my-workspace-folder

In this example, only file changes since the last run of the sync command are synchronized to /Users/someone@example.com/my-workspace-folder. By default, the workspace URL within the caller's DEFAULT profile is used to determine the remote workspace to sync to.
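
To sync to a workspace other than the one in the DEFAULT profile, the global -p / --profile flag listed under Global flags can be passed. The following is a minimal sketch, not an official recipe: the function name sync_with_profile is hypothetical, and it assumes the profile you pass already exists in your ~/.databrickscfg file.

```shell
# Hypothetical helper: run a one-time incremental sync against a named profile.
# Usage: sync_with_profile <profile> <local-dir> <remote-dir>
sync_with_profile() {
  local profile="$1" src="$2" dst="$3"
  if [ -z "$profile" ] || [ -z "$src" ] || [ -z "$dst" ]; then
    echo "usage: sync_with_profile <profile> <local-dir> <remote-dir>" >&2
    return 2
  fi
  # --profile is the global flag documented under "Global flags".
  databricks sync "$src" "$dst" --profile "$profile"
}
```

For example, `sync_with_profile my-dev-profile ./my-local-folder /Users/someone@example.com/my-workspace-folder`, where my-dev-profile is a placeholder profile name.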

Only sync specific files

To include or exclude specific files from the sync based on patterns, use the --include, --include-from, --exclude, or --exclude-from options. For example, to exclude the patterns listed in a .gitignore file:

Bash
databricks sync ./my-local-folder /Users/someone@example.com/my-workspace-folder --exclude-from .gitignore

note

If you want to sync files in a bundle, use the sync configuration mapping instead. See the sync mapping in the bundle configuration documentation.

The following example excludes certain file patterns from the sync:

Bash
databricks sync --exclude "*.pyc" --exclude "__pycache__" ./my-local-folder /Users/someone@example.com/my-workspace-folder
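
When the list of patterns grows, it may be easier to keep them in a file and pass that file with --exclude-from, one pattern per line. The sketch below shows one way to generate such a file. The helper name write_sync_excludes and the file name .syncignore are assumptions made for illustration; any file path works.

```shell
# Hypothetical helper: write one exclude pattern per line to a file.
# Usage: write_sync_excludes <file> <pattern>...
write_sync_excludes() {
  local file="$1"
  shift
  if [ "$#" -eq 0 ]; then
    # No patterns given: create (or truncate to) an empty file.
    : > "$file"
    return 0
  fi
  # printf reuses the format for each remaining argument, so each
  # pattern lands on its own line.
  printf '%s\n' "$@" > "$file"
}

# Example use (commented out so that sourcing this snippet has no side effects):
# write_sync_excludes .syncignore '*.pyc' '__pycache__'
# databricks sync ./my-local-folder /Users/someone@example.com/my-workspace-folder --exclude-from .syncignore
```

Quoting each pattern keeps the shell from expanding wildcards such as *.pyc before they reach the file.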

Fully sync local file changes to a remote directory

To perform a single, full, one-way synchronization of file changes within a local directory to a folder in a remote Databricks workspace, regardless of when the last sync command was run, use the --full option, for example:

Bash
databricks sync ./my-local-folder /Users/someone@example.com/my-workspace-folder --full

Continuously sync local file changes to a remote directory

To turn on continuous, one-way synchronization of file changes within a local directory to a folder in a remote Databricks workspace, use the --watch option, for example:

Bash
databricks sync ./my-local-folder /Users/someone@example.com/my-workspace-folder --watch

One-way synchronization continues until the command is stopped from the terminal, typically by pressing Ctrl + C. In most Unix shells, Ctrl + Z only suspends the command rather than ending it.

Polling for possible synchronization events happens once per second by default. To change this interval, use the --interval option with a duration, such as a number of seconds followed by the character s. For example, to poll every five seconds:

Bash
databricks sync ./my-local-folder /Users/someone@example.com/my-workspace-folder --watch --interval 5s
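
If the interval comes from a script variable, a small wrapper can append the s suffix and reject invalid values before the CLI is called. This is a sketch under assumptions: watch_sync is a hypothetical function name, and it only accepts a positive whole number of seconds.

```shell
# Hypothetical helper: start a continuous sync, polling every <seconds> seconds.
# Usage: watch_sync <seconds> <local-dir> <remote-dir>
watch_sync() {
  local seconds="$1" src="$2" dst="$3"
  # Map empty or non-numeric input to 0 so that one check below rejects it.
  case "$seconds" in
    ''|*[!0-9]*) seconds=0 ;;
  esac
  if [ "$seconds" -eq 0 ]; then
    echo "watch_sync: interval must be a positive whole number of seconds" >&2
    return 2
  fi
  # --interval takes a duration, so append the "s" unit suffix (for example, 5s).
  databricks sync "$src" "$dst" --watch --interval "${seconds}s"
}
```

For example, `watch_sync 5 ./my-local-folder /Users/someone@example.com/my-workspace-folder` runs the same command as the five-second example above.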

Change the sync progress output format

Sync progress information is written to the terminal in text format by default. To change the format, use the --output option with either text (the default) or json, for example:

Bash
databricks sync ./my-local-folder /Users/someone@example.com/my-workspace-folder --output json

Preview file operations for a sync

To preview a sync without actually performing the file sync operations, use the --dry-run option, for example:

Bash
databricks sync ./my-local-folder /Users/someone@example.com/my-workspace-folder --dry-run

Output
Warn: Running in dry-run mode. No actual changes will be made.
Action: PUT: test.txt
Uploaded test.txt
Initial Sync Complete
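
A preview is often followed by the real sync once the listed operations look right. The sketch below chains the two steps behind a confirmation prompt. The function name preview_then_sync and the prompt text are assumptions made for illustration; only the --dry-run flag comes from the CLI.

```shell
# Hypothetical helper: preview a sync with --dry-run, then run it after confirmation.
# Usage: preview_then_sync <local-dir> <remote-dir>
preview_then_sync() {
  local src="$1" dst="$2" answer
  # Stop here if the preview itself fails (for example, on an authentication error).
  databricks sync "$src" "$dst" --dry-run || return 1
  # Prompt on stderr so that stdout only carries the CLI output.
  printf 'Apply these changes? [y/N] ' >&2
  # Treat end-of-input as "no".
  read -r answer || answer=""
  case "$answer" in
    y|Y) databricks sync "$src" "$dst" ;;
    *)   echo "Skipped." ;;
  esac
}
```

Anything other than y or Y leaves the remote folder untouched.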

Global flags

--debug

    Whether to enable debug logging.

-h, --help

    Display help for the Databricks CLI or the related command group or the related command.

--log-file string

    A string representing the file to write output logs to. If this flag is not specified, output logs are written to stderr.

--log-format format

    The log format type, text or json. The default value is text.

--log-level string

    A string representing the log level. If this flag is not specified, logging is disabled.

-o, --output type

    The command output type, text or json. The default value is text.

-p, --profile string

    The name of the profile in the ~/.databrickscfg file to use to run the command. If this flag is not specified, the profile named DEFAULT is used, if it exists.

--progress-format format

    The format to display progress logs: default, append, inplace, or json.

-t, --target string

    If applicable, the bundle target to use.