Delta Live Tables API guide

The Delta Live Tables API allows you to create, edit, delete, start, and view details about pipelines.

Important

To access Databricks REST APIs, you must authenticate.

Create a pipeline

Endpoint HTTP Method
2.0/pipelines POST

Creates a new Delta Live Tables pipeline.

Example

This example creates a new triggered pipeline.

Request

curl --netrc --request POST \
https://<databricks-instance>/api/2.0/pipelines \
--data @pipeline-settings.json

pipeline-settings.json:

{
  "name": "Wikipedia pipeline (SQL)",
  "storage": "/Users/username/data",
  "clusters": [
    {
      "label": "default",
      "autoscale": {
        "min_workers": 1,
        "max_workers": 5
      }
    }
  ],
  "libraries": [
    {
      "notebook": {
        "path": "/Users/username/DLT Notebooks/Delta Live Tables quickstart (SQL)"
      }
    }
  ],
  "continuous": false
}

Replace:

  • <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.

This example uses a .netrc file.
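When curl is invoked with --netrc (or -n), it reads credentials from a .netrc file in your home directory. For token-based authentication, an entry has the following shape; <databricks-instance> and <personal-access-token> are placeholders for your workspace instance and personal access token:

```
machine <databricks-instance>
login token
password <personal-access-token>
```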

Response

{
  "pipeline_id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5"
}
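The same request can be made from Python. The following is a minimal sketch using only the standard library, not an official client; create_pipeline is a hypothetical helper, and the settings dictionary mirrors pipeline-settings.json above. The network call at the end is commented out because it requires a real workspace and token.

```python
import json
import urllib.request

def create_pipeline(host, token, settings):
    """Sketch of POST /api/2.0/pipelines; returns the parsed JSON response."""
    req = urllib.request.Request(
        f"{host}/api/2.0/pipelines",
        data=json.dumps(settings).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # On success, the body contains the new pipeline_id.
        return json.load(resp)

# Settings equivalent to pipeline-settings.json above.
settings = {
    "name": "Wikipedia pipeline (SQL)",
    "storage": "/Users/username/data",
    "clusters": [
        {"label": "default", "autoscale": {"min_workers": 1, "max_workers": 5}}
    ],
    "libraries": [
        {
            "notebook": {
                "path": "/Users/username/DLT Notebooks/Delta Live Tables quickstart (SQL)"
            }
        }
    ],
    "continuous": False,
}

# create_pipeline("https://<databricks-instance>", "<token>", settings)["pipeline_id"]
```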

Request structure

See PipelineSettings.

Response structure

Field Name Type Description
pipeline_id STRING The unique identifier for the newly created pipeline.

Edit a pipeline

Endpoint HTTP Method
2.0/pipelines/{pipeline_id} PUT

Updates the settings for an existing pipeline.

Example

This example adds a target parameter to the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5:

Request

curl --netrc --request PUT \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5 \
--data @pipeline-settings.json

pipeline-settings.json:

{
  "id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5",
  "name": "Wikipedia pipeline (SQL)",
  "storage": "/Users/username/data",
  "clusters": [
    {
      "label": "default",
      "autoscale": {
        "min_workers": 1,
        "max_workers": 5
      }
    }
  ],
  "libraries": [
    {
      "notebook": {
        "path": "/Users/username/DLT Notebooks/Delta Live Tables quickstart (SQL)"
      }
    }
  ],
  "target": "wikipedia_quickstart_data",
  "continuous": false
}

Replace:

  • <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.

This example uses a .netrc file.

Request structure

See PipelineSettings.

Delete a pipeline

Endpoint HTTP Method
2.0/pipelines/{pipeline_id} DELETE

Deletes a pipeline from the Delta Live Tables system.

Example

This example deletes the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5:

Request

curl --netrc --request DELETE \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5

Replace:

  • <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.

This example uses a .netrc file.

Start a pipeline update

Endpoint HTTP Method
2.0/pipelines/{pipeline_id}/updates POST

Starts an update for a pipeline.

Example

This example starts an update with full refresh for the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5:

Request

curl --netrc --request POST \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/updates \
--data '{ "full_refresh": true }'

Replace:

  • <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.

This example uses a .netrc file.

Response

{
  "update_id": "a1b23c4d-5e6f-78gh-91i2-3j4k5lm67no8"
}

Request structure

Field Name Type Description
full_refresh BOOLEAN

Whether to reprocess all data. If true, the Delta Live Tables system will reset all tables before running the pipeline.

This field is optional.

The default value is false.

Response structure

Field Name Type Description
update_id STRING The unique identifier of the newly created update.

Stop any active pipeline update

Endpoint HTTP Method
2.0/pipelines/{pipeline_id}/stop POST

Stops any active pipeline update. If no update is running, this request is a no-op.

Example

This example stops an update for the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5:

Request

curl --netrc --request POST \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/stop

Replace:

  • <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.

This example uses a .netrc file.

List pipeline events

Endpoint HTTP Method
2.0/pipelines/{pipeline_id}/events GET

Retrieves events for a pipeline.

Example

This example retrieves a maximum of 5 events for the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5.

Request

curl --netrc --request GET \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/events \
--data '{"max_results": 5}'

Replace:

  • <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.

This example uses a .netrc file.

Request structure

Field Name Type Description
page_token STRING

Page token returned by previous call. This field is mutually exclusive with all fields in this request except max_results. An error is returned if any fields other than max_results are set when this field is set.

This field is optional.

max_results INT32

The maximum number of entries to return in a single page. The system may return fewer than max_results events in a response, even if there are more events available.

This field is optional.

The default value is 25.

The maximum value is 100. An error is returned if the value of max_results is greater than 100.

order_by STRING

A string indicating a sort order by timestamp for the results, for example, ["timestamp asc"].

The sort order can be ascending or descending. By default, events are returned in descending order by timestamp.

This field is optional.

filter STRING

Criteria to select a subset of results, expressed using a SQL-like syntax. The supported filters are:

  • level='INFO' (or WARN or ERROR)
  • level in ('INFO', 'WARN')
  • id='[event-id]'
  • timestamp > 'TIMESTAMP' (or >=, <, <=, =)

Composite expressions are supported, for example: level in ('ERROR', 'WARN') AND timestamp > '2021-07-22T06:37:33.083Z'

This field is optional.

Response structure

Field Name Type Description
events An array of pipeline events The list of events matching the request criteria.
next_page_token STRING If present, a token to fetch the next page of events.
prev_page_token STRING If present, a token to fetch the previous page of events.
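Because event listings are paginated with next_page_token, callers typically loop until the token is absent. The following sketch abstracts the HTTP call behind a fetch_page callable so the paging logic is independent of any particular client; the stub at the end stands in for real GET /api/2.0/pipelines/{pipeline_id}/events responses.

```python
def iter_events(fetch_page, max_results=25):
    """Yield events across all pages; fetch_page(params) returns one response dict."""
    params = {"max_results": max_results}
    while True:
        page = fetch_page(params)
        yield from page.get("events", [])
        token = page.get("next_page_token")
        if not token:
            return
        # Per the request structure, page_token is mutually exclusive with
        # every field except max_results.
        params = {"max_results": max_results, "page_token": token}

# Stub standing in for the real endpoint: two pages of events.
pages = {
    None: {"events": [{"id": "e1"}, {"id": "e2"}], "next_page_token": "t1"},
    "t1": {"events": [{"id": "e3"}]},
}
events = list(iter_events(lambda p: pages[p.get("page_token")]))
```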

Get pipeline details

Endpoint HTTP Method
2.0/pipelines/{pipeline_id} GET

Gets details about a pipeline, including the pipeline settings and recent updates.

Example

This example gets details for the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5:

Request

curl --netrc --request GET \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5

Replace:

  • <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.

This example uses a .netrc file.

Response

{
  "pipeline_id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5",
  "spec": {
    "id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5",
    "name": "Wikipedia pipeline (SQL)",
    "storage": "/Users/username/data",
    "clusters": [
      {
        "label": "default",
        "autoscale": {
          "min_workers": 1,
          "max_workers": 5
        }
      }
    ],
    "libraries": [
      {
        "notebook": {
          "path": "/Users/username/DLT Notebooks/Delta Live Tables quickstart (SQL)"
        }
      }
    ],
    "target": "wikipedia_quickstart_data",
    "continuous": false
  },
  "state": "IDLE",
  "cluster_id": "1234-567891-abcde123",
  "name": "Wikipedia pipeline (SQL)",
  "creator_user_name": "username",
  "latest_updates": [
    {
      "update_id": "8a0b6d02-fbd0-11eb-9a03-0242ac130003",
      "state": "COMPLETED",
      "creation_time": "2021-08-13T00:37:30.279Z"
    },
    {
      "update_id": "a72c08ba-fbd0-11eb-9a03-0242ac130003",
      "state": "CANCELED",
      "creation_time": "2021-08-13T00:35:51.902Z"
    },
    {
      "update_id": "ac37d924-fbd0-11eb-9a03-0242ac130003",
      "state": "FAILED",
      "creation_time": "2021-08-13T00:33:38.565Z"
    }
  ]
}

Response structure

Field Name Type Description
pipeline_id STRING The unique identifier of the pipeline.
spec PipelineSettings The pipeline settings.
state STRING

The state of the pipeline. One of IDLE or RUNNING.

If state = RUNNING, then there is at least one active update.

cluster_id STRING The identifier of the cluster running the pipeline.
name STRING The user-friendly name for this pipeline.
creator_user_name STRING The username of the pipeline creator.
latest_updates An array of UpdateStateInfo Status of the most recent updates for the pipeline, ordered with the newest update first.

Get update details

Endpoint HTTP Method
2.0/pipelines/{pipeline_id}/updates/{update_id} GET

Gets details for a pipeline update.

Example

This example gets details for update 9a84f906-fc51-11eb-9a03-0242ac130003 for the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5:

Request

curl --netrc --request GET \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/updates/9a84f906-fc51-11eb-9a03-0242ac130003

Replace:

  • <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.

This example uses a .netrc file.

Response

{
  "update": {
    "pipeline_id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5",
    "update_id": "9a84f906-fc51-11eb-9a03-0242ac130003",
    "config": {
      "id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5",
      "name": "Wikipedia pipeline (SQL)",
      "storage": "/Users/username/data",
      "configuration": {
        "pipelines.numStreamRetryAttempts": "5"
      },
      "clusters": [
        {
          "label": "default",
          "autoscale": {
            "min_workers": 1,
            "max_workers": 5
          }
        }
      ],
      "libraries": [
        {
          "notebook": {
            "path": "/Users/username/DLT Notebooks/Delta Live Tables quickstart (SQL)"
          }
        }
      ],
      "target": "wikipedia_quickstart_data",
      "filters": {},
      "email_notifications": {},
      "continuous": false,
      "development": false
    },
    "cause": "API_CALL",
    "state": "COMPLETED",
    "creation_time": 1628815050279,
    "full_refresh": true
  }
}

Response structure

Field Name Type Description
pipeline_id STRING The unique identifier of the pipeline.
update_id STRING The unique identifier of this update.
config PipelineSettings The pipeline settings.
cause STRING The trigger for the update. One of API_CALL, RETRY_ON_FAILURE, SERVICE_UPGRADE.
state STRING The state of the update. One of QUEUED, CREATED, WAITING_FOR_RESOURCES, INITIALIZING, RESETTING, SETTING_UP_TABLES, RUNNING, STOPPING, COMPLETED, FAILED, or CANCELED.
cluster_id STRING The identifier of the cluster running the pipeline.
creation_time INT64 The timestamp when the update was created.
full_refresh BOOLEAN Whether the update was triggered to perform a full refresh. If true, all pipeline tables were reset before running the update.
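When polling this endpoint after starting an update, the state field tells you whether to keep waiting. A small helper, assuming the state list above is exhaustive and that COMPLETED, FAILED, and CANCELED are the only terminal states:

```python
# Terminal states per the update state list above; any other state means
# the update is still queued, initializing, or running.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELED"}

def is_update_finished(state: str) -> bool:
    """Return True when an update has reached a terminal state."""
    return state in TERMINAL_STATES
```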

List pipelines

Endpoint HTTP Method
2.0/pipelines GET

Lists pipelines defined in the Delta Live Tables system.

Example

This example retrieves details for up to two pipelines, starting from a specified page_token:

Request

curl --netrc --request GET https://<databricks-instance>/api/2.0/pipelines \
--data '{ "page_token": "eyJ...==",  "max_results": 2 }'

Replace:

  • <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.

This example uses a .netrc file.

Response

{
  "statuses": [
    {
      "pipeline_id": "e0f01758-fc61-11eb-9a03-0242ac130003",
      "state": "IDLE",
      "name": "dlt-pipeline-python",
      "latest_updates": [
        {
          "update_id": "ee9ae73e-fc61-11eb-9a03-0242ac130003",
          "state": "COMPLETED",
          "creation_time": "2021-08-13T00:34:21.871Z"
        }
      ],
      "creator_user_name": "username"
    },
    {
      "pipeline_id": "f4c82f5e-fc61-11eb-9a03-0242ac130003",
      "state": "IDLE",
      "name": "dlt-pipeline-python",
      "creator_user_name": "username"
    }
  ],
  "next_page_token": "eyJ...==",
  "prev_page_token": "eyJ..x9"
}

Request structure

Field Name Type Description
page_token STRING

Page token returned by previous call.

This field is optional.

max_results INT32

The maximum number of entries to return in a single page. The system may return fewer than max_results pipelines in a response, even if more pipelines are available.

This field is optional.

The default value is 25.

The maximum value is 100. An error is returned if the value of max_results is greater than 100.

order_by An array of STRING

A list of strings specifying the order of results, for example, ["name asc"]. Supported order_by fields are id and name. The default is id asc.

This field is optional.

filter STRING

Select a subset of results based on the specified criteria.

The supported filters are:

  • notebook='<path>' to select pipelines that reference the provided notebook path.
  • name LIKE '[pattern]' to select pipelines with a name that matches pattern. Wildcards are supported, for example: name LIKE '%shopping%'

Composite filters are not supported.

This field is optional.

Response structure

Field Name Type Description
statuses An array of PipelineStateInfo The list of pipelines matching the request criteria.
next_page_token STRING If present, a token to fetch the next page of pipelines.
prev_page_token STRING If present, a token to fetch the previous page of pipelines.

Data structures

NotebookLibrary

Field Name Type Description
path STRING

The absolute path to the notebook.

This field is required.

PipelineLibrary

Field Name Type Description
notebook NotebookLibrary The path to a notebook defining Delta Live Tables datasets. The path must be in the Databricks workspace, for example: { "notebook" : { "path" : "/my-pipeline-notebook-path" } }.

PipelineSettings

Specification for a pipeline deployment.

Field Name Type Description
id STRING

The unique identifier for this pipeline.

The identifier is created by the Delta Live Tables system, and must not be provided when creating a pipeline.

name STRING

A user-friendly name for this pipeline.

This field is optional.

By default, the pipeline name must be unique. To use a duplicate name, set allow_duplicate_names to true in the pipeline configuration.

storage STRING

A path to a DBFS directory for storing checkpoints and tables created by the pipeline.

This field is optional.

The system uses a default location if this field is empty.

configuration A map of STRING:STRING

A list of key-value pairs to add to the Spark configuration of the cluster that will run the pipeline.

This field is optional.

Elements must be formatted as key:value pairs.

clusters An array of PipelinesNewCluster

An array of specifications for the clusters to run the pipeline.

This field is optional.

If this is not specified, the system will select a default cluster configuration for the pipeline.

libraries An array of PipelineLibrary The notebooks containing the pipeline code and any dependencies required to run the pipeline.
target STRING

A database name for persisting pipeline output data.

See Publish tables for more information.

continuous BOOLEAN

Whether this is a continuous pipeline.

This field is optional.

The default value is false.

development BOOLEAN

Whether to run the pipeline in development mode.

This field is optional.

The default value is false.

PipelineStateInfo

Field Name Type Description
state STRING The state of the pipeline. One of IDLE or RUNNING.
pipeline_id STRING The unique identifier of the pipeline.
cluster_id STRING The unique identifier of the cluster running the pipeline.
name STRING The user-friendly name of the pipeline.
latest_updates An array of UpdateStateInfo Status of the most recent updates for the pipeline, ordered with the newest update first.
creator_user_name STRING The username of the pipeline creator.

PipelinesNewCluster

A pipeline cluster specification.

Field Name Type Description
label STRING

A label for the cluster specification, either default to configure the default cluster, or maintenance to configure the maintenance cluster.

This field is optional. The default value is default.

attrs AwsAttributes

Optional attributes to set during cluster creation.

These attributes cannot be changed over the lifetime of a cluster.

The Delta Live Tables system sets the following attributes. These attributes cannot be configured by users:

  • spark_version

size ClusterSize

Optional cluster size specification.

Size can either be a constant number of workers or autoscaling parameters.

apply_policy_default_values BOOLEAN Whether to use policy default values for missing cluster attributes.
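As a sketch of the two ClusterSize forms, a cluster entry can either pin a fixed worker count or autoscale between bounds. The autoscale shape follows the examples earlier in this guide; num_workers as the fixed-size field is an assumption based on standard Databricks cluster specifications:

```json
{
  "clusters": [
    {
      "label": "default",
      "num_workers": 3
    },
    {
      "label": "maintenance",
      "autoscale": { "min_workers": 1, "max_workers": 5 }
    }
  ]
}
```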

UpdateStateInfo

Field Name Type Description
update_id STRING The unique identifier for this update.
state STRING The state of the update. One of QUEUED, CREATED, WAITING_FOR_RESOURCES, INITIALIZING, RESETTING, SETTING_UP_TABLES, RUNNING, STOPPING, COMPLETED, FAILED, or CANCELED.
creation_time STRING Timestamp when this update was created.