Delta Live Tables API guide
Important
This article’s content has been retired and might not be updated. See Delta Live Tables in the Databricks REST API Reference.
The Delta Live Tables API allows you to create, edit, delete, start, and view details about pipelines.
Important
To access Databricks REST APIs, you must authenticate.
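The examples in this article authenticate with a .netrc file (the --netrc flag passed to curl). As a sketch, a .netrc entry for a workspace that authenticates with a personal access token looks like the following; the token value is a placeholder:
machine <databricks-instance>
login token
password <personal-access-token>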
Create a pipeline
Endpoint | HTTP Method
---|---
2.0/pipelines | POST
Creates a new Delta Live Tables pipeline.
Example
This example creates a new triggered pipeline.
Request
curl --netrc -X POST \
https://<databricks-instance>/api/2.0/pipelines \
--data @pipeline-settings.json
pipeline-settings.json:
{
"name": "Wikipedia pipeline (SQL)",
"storage": "/Users/username/data",
"clusters": [
{
"label": "default",
"autoscale": {
"min_workers": 1,
"max_workers": 5,
"mode": "ENHANCED"
}
}
],
"libraries": [
{
"notebook": {
"path": "/Users/username/DLT Notebooks/Delta Live Tables quickstart (SQL)"
}
}
],
"continuous": false
}
Replace <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.
This example uses a .netrc file.
Request structure
See PipelineSettings.
Edit a pipeline
Endpoint | HTTP Method
---|---
2.0/pipelines/{pipeline_id} | PUT
Updates the settings for an existing pipeline.
Example
This example adds a target parameter to the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5:
Request
curl --netrc -X PUT \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5 \
--data @pipeline-settings.json
pipeline-settings.json:
{
"id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5",
"name": "Wikipedia pipeline (SQL)",
"storage": "/Users/username/data",
"clusters": [
{
"label": "default",
"autoscale": {
"min_workers": 1,
"max_workers": 5,
"mode": "ENHANCED"
}
}
],
"libraries": [
{
"notebook": {
"path": "/Users/username/DLT Notebooks/Delta Live Tables quickstart (SQL)"
}
}
],
"target": "wikipedia_quickstart_data",
"continuous": false
}
Replace <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.
This example uses a .netrc file.
Request structure
See PipelineSettings.
Delete a pipeline
Endpoint | HTTP Method
---|---
2.0/pipelines/{pipeline_id} | DELETE
Deletes a pipeline from the Delta Live Tables system.
Example
This example deletes the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5:
Request
curl --netrc -X DELETE \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5
Replace <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.
This example uses a .netrc file.
Start a pipeline update
Endpoint | HTTP Method
---|---
2.0/pipelines/{pipeline_id}/updates | POST
Starts an update for a pipeline. You can start an update for the entire pipeline graph, or a selective update of specific tables.
Examples
Start a full refresh
This example starts an update with full refresh for the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5:
Request
curl --netrc -X POST \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/updates \
--data '{ "full_refresh": "true" }'
Replace <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.
This example uses a .netrc file.
Start an update of selected tables
This example starts an update that refreshes the sales_orders_cleaned and sales_order_in_chicago tables in the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5:
Request
curl --netrc -X POST \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/updates \
--data '{ "refresh_selection": ["sales_orders_cleaned", "sales_order_in_chicago"] }'
Replace <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.
This example uses a .netrc file.
Start a full update of selected tables
This example starts an update of the sales_orders_cleaned and sales_order_in_chicago tables, and an update with full refresh of the customers and sales_orders_raw tables in the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5:
Request
curl --netrc -X POST \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/updates \
--data '{ "refresh_selection": ["sales_orders_cleaned", "sales_order_in_chicago"], "full_refresh_selection": ["customers", "sales_orders_raw"] }'
Replace <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.
This example uses a .netrc file.
Request structure
Field Name | Type | Description
---|---|---
full_refresh | BOOLEAN | Whether to reprocess all data. If true, the Delta Live Tables system resets all tables before running the pipeline. This field is optional. The default value is false. An error is returned if full_refresh is true and either refresh_selection or full_refresh_selection is set.
refresh_selection | An array of STRING | A list of tables to update. Use refresh_selection to start a refresh of a selected set of tables. This field is optional. If both refresh_selection and full_refresh_selection are empty, the entire pipeline graph is refreshed. An error is returned if: full_refresh is true, or one or more of the specified tables does not exist in the pipeline graph.
full_refresh_selection | An array of STRING | A list of tables to update with full refresh. Use full_refresh_selection to start an update of a selected set of tables. This field is optional. If both refresh_selection and full_refresh_selection are empty, the entire pipeline graph is refreshed. An error is returned if: full_refresh is true, or one or more of the specified tables does not exist in the pipeline graph, or one or more of the specified tables cannot be fully refreshed.
Get the status of a pipeline update request
Endpoint | HTTP Method
---|---
2.0/pipelines/{pipeline_id}/requests/{request_id} | GET
Gets the status and information for the pipeline update associated with request_id, where request_id is a unique identifier for the request initiating the pipeline update. If the update is retried or restarted, the new update inherits the request_id.
Example
For the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5, this example returns status and information for the update associated with request ID a83d9f7c-d798-4fd5-aa39-301b6e6f4429:
Request
curl --netrc -X GET \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/requests/a83d9f7c-d798-4fd5-aa39-301b6e6f4429
Replace <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.
This example uses a .netrc file.
Response
{
"status": "TERMINATED",
"latest_update":{
"pipeline_id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5",
"update_id": "90da8183-89de-4715-b5a9-c243e67f0093",
"config":{
"id": "aae89b88-e97e-40c4-8e1a-1b7ac76657e8",
"name": "Retail sales (SQL)",
"storage": "/Users/username/data",
"configuration":{
"pipelines.numStreamRetryAttempts": "5"
},
"clusters":[
{
"label": "default",
"autoscale":{
"min_workers": 1,
"max_workers": 5,
"mode": "ENHANCED"
}
}
],
"libraries":[
{
"notebook":{
"path": "/Users/username/DLT Notebooks/Delta Live Tables quickstart (SQL)"
}
}
],
"continuous": false,
"development": true,
"photon": true,
"edition": "advanced",
"channel": "CURRENT"
},
"cause": "API_CALL",
"state": "COMPLETED",
"cluster_id": "1234-567891-abcde123",
"creation_time": 1664304117145,
"full_refresh": false,
"request_id": "a83d9f7c-d798-4fd5-aa39-301b6e6f4429"
}
}
Response structure
Field Name | Type | Description
---|---|---
status | STRING | The status of the pipeline update request, for example, TERMINATED.
pipeline_id | STRING | The unique identifier of the pipeline.
update_id | STRING | The unique identifier of the update.
config | PipelineSettings | The pipeline settings.
cause | STRING | The trigger for the update, for example, API_CALL.
state | STRING | The state of the update, for example, COMPLETED.
cluster_id | STRING | The identifier of the cluster running the update.
creation_time | INT64 | The timestamp when the update was created.
full_refresh | BOOLEAN | Whether this update resets all tables before running.
refresh_selection | An array of STRING | A list of tables to update without full refresh.
full_refresh_selection | An array of STRING | A list of tables to update with full refresh.
request_id | STRING | The unique identifier of the request that started the update. This is the value returned by the update request. If the update is retried or restarted, the new update inherits the request_id. However, the update_id is different.
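Because the request_id is stable across retries, it can be used to poll an update until it finishes. The following is a minimal sketch, assuming the jq utility is installed; it repeats the GET request above until the request status is TERMINATED:
while true; do
  status=$(curl --netrc -s -X GET \
    https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/requests/a83d9f7c-d798-4fd5-aa39-301b6e6f4429 \
    | jq -r '.status')
  echo "update request status: $status"
  if [ "$status" = "TERMINATED" ]; then break; fi
  sleep 30
done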
Stop any active pipeline update
Endpoint | HTTP Method
---|---
2.0/pipelines/{pipeline_id}/stop | POST
Stops any active pipeline update. If no update is running, this request is a no-op.
For a continuous pipeline, the pipeline execution is paused. Tables currently processing finish refreshing, but downstream tables are not refreshed. On the next pipeline update, Delta Live Tables performs a selected refresh of tables that did not complete processing, and resumes processing of the remaining pipeline DAG.
For a triggered pipeline, the pipeline execution is stopped. Tables currently processing finish refreshing, but downstream tables are not refreshed. On the next pipeline update, Delta Live Tables refreshes all tables.
Example
This example stops an update for the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5:
Request
curl --netrc -X POST \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/stop
Replace <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.
This example uses a .netrc file.
List pipeline events
Endpoint | HTTP Method
---|---
2.0/pipelines/{pipeline_id}/events | GET
Retrieves events for a pipeline.
Example
This example retrieves a maximum of 5 events for the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5.
Request
curl --netrc -X GET \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/events?max_results=5
Replace <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.
This example uses a .netrc file.
Request structure
Field Name | Type | Description
---|---|---
page_token | STRING | Page token returned by previous call. This field is mutually exclusive with all fields in this request except max_results. An error is returned if any fields other than max_results are set when this field is set. This field is optional.
max_results | INT32 | The maximum number of entries to return in a single page. The system may return fewer than max_results events in a response, even if more events are available. This field is optional. The default value is 25. The maximum value is 100. An error is returned if the value of max_results is greater than 100.
order_by | STRING | A string indicating a sort order by timestamp for the results, for example, ["timestamp asc"]. The sort order can be ascending or descending. By default, events are returned in descending order by timestamp. This field is optional.
filter | STRING | Criteria to select a subset of results, expressed using a SQL-like syntax, for example, a filter on the event level or timestamp. Composite expressions are supported, for example: level in ('ERROR', 'WARN') AND timestamp > '2021-07-22T06:37:33.083Z'. This field is optional. See the example request after this table.
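As a sketch of how the filter parameter can be used (the level='ERROR' expression is an assumed example; adjust it to the event levels you need), the following request retrieves error-level events and lets curl URL encode the filter expression:
curl --netrc -G -X GET \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/events \
--data-urlencode "max_results=5" \
--data-urlencode "filter=level='ERROR'"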
Get pipeline details
Endpoint | HTTP Method
---|---
2.0/pipelines/{pipeline_id} | GET
Gets details about a pipeline, including the pipeline settings and recent updates.
Example
This example gets details for the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5:
Request
curl --netrc -X GET \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5
Replace <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.
This example uses a .netrc file.
Response
{
"pipeline_id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5",
"spec": {
"id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5",
"name": "Wikipedia pipeline (SQL)",
"storage": "/Users/username/data",
"clusters": [
{
"label": "default",
"autoscale": {
"min_workers": 1,
"max_workers": 5,
"mode": "ENHANCED"
}
}
],
"libraries": [
{
"notebook": {
"path": "/Users/username/DLT Notebooks/Delta Live Tables quickstart (SQL)"
}
}
],
"target": "wikipedia_quickstart_data",
"continuous": false
},
"state": "IDLE",
"cluster_id": "1234-567891-abcde123",
"name": "Wikipedia pipeline (SQL)",
"creator_user_name": "username",
"latest_updates": [
{
"update_id": "8a0b6d02-fbd0-11eb-9a03-0242ac130003",
"state": "COMPLETED",
"creation_time": "2021-08-13T00:37:30.279Z"
},
{
"update_id": "a72c08ba-fbd0-11eb-9a03-0242ac130003",
"state": "CANCELED",
"creation_time": "2021-08-13T00:35:51.902Z"
},
{
"update_id": "ac37d924-fbd0-11eb-9a03-0242ac130003",
"state": "FAILED",
"creation_time": "2021-08-13T00:33:38.565Z"
}
],
"run_as_user_name": "username"
}
Response structure
Field Name | Type | Description
---|---|---
pipeline_id | STRING | The unique identifier of the pipeline.
spec | PipelineSettings | The pipeline settings.
state | STRING | The state of the pipeline, for example, IDLE or RUNNING.
cluster_id | STRING | The identifier of the cluster running the pipeline.
name | STRING | The user-friendly name for this pipeline.
creator_user_name | STRING | The username of the pipeline creator.
latest_updates | An array of UpdateStateInfo | Status of the most recent updates for the pipeline, ordered with the newest update first.
run_as_user_name | STRING | The username that the pipeline runs as.
Get update details
Endpoint | HTTP Method
---|---
2.0/pipelines/{pipeline_id}/updates/{update_id} | GET
Gets details for a pipeline update.
Example
This example gets details for update 9a84f906-fc51-11eb-9a03-0242ac130003 for the pipeline with ID a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5:
Request
curl --netrc -X GET \
https://<databricks-instance>/api/2.0/pipelines/a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5/updates/9a84f906-fc51-11eb-9a03-0242ac130003
Replace <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.
This example uses a .netrc file.
Response
{
"update": {
"pipeline_id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5",
"update_id": "9a84f906-fc51-11eb-9a03-0242ac130003",
"config": {
"id": "a12cd3e4-0ab1-1abc-1a2b-1a2bcd3e4fg5",
"name": "Wikipedia pipeline (SQL)",
"storage": "/Users/username/data",
"configuration": {
"pipelines.numStreamRetryAttempts": "5"
},
"clusters": [
{
"label": "default",
"autoscale": {
"min_workers": 1,
"max_workers": 5,
"mode": "ENHANCED"
}
}
],
"libraries": [
{
"notebook": {
"path": "/Users/username/DLT Notebooks/Delta Live Tables quickstart (SQL)"
}
}
],
"target": "wikipedia_quickstart_data",
"continuous": false,
"development": false
},
"cause": "API_CALL",
"state": "COMPLETED",
"creation_time": 1628815050279,
"full_refresh": true,
"request_id": "a83d9f7c-d798-4fd5-aa39-301b6e6f4429"
}
}
Response structure
Field Name | Type | Description
---|---|---
pipeline_id | STRING | The unique identifier of the pipeline.
update_id | STRING | The unique identifier of this update.
config | PipelineSettings | The pipeline settings.
cause | STRING | The trigger for the update, for example, API_CALL.
state | STRING | The state of the update, for example, COMPLETED.
cluster_id | STRING | The identifier of the cluster running the pipeline.
creation_time | INT64 | The timestamp when the update was created.
full_refresh | BOOLEAN | Whether this was a full refresh. If true, all pipeline tables were reset before running the update.
List pipelines
Endpoint | HTTP Method
---|---
2.0/pipelines | GET
Lists pipelines defined in the Delta Live Tables system.
Example
This example retrieves details for pipelines where the name contains quickstart:
Request
curl --netrc -X GET \
https://<databricks-instance>/api/2.0/pipelines?filter=name%20LIKE%20%27%25quickstart%25%27
Replace <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.
This example uses a .netrc file.
Response
{
"statuses": [
{
"pipeline_id": "e0f01758-fc61-11eb-9a03-0242ac130003",
"state": "IDLE",
"name": "DLT quickstart (Python)",
"latest_updates": [
{
"update_id": "ee9ae73e-fc61-11eb-9a03-0242ac130003",
"state": "COMPLETED",
"creation_time": "2021-08-13T00:34:21.871Z"
}
],
"creator_user_name": "username"
},
{
"pipeline_id": "f4c82f5e-fc61-11eb-9a03-0242ac130003",
"state": "IDLE",
"name": "My DLT quickstart example",
"creator_user_name": "username"
}
],
"next_page_token": "eyJ...==",
"prev_page_token": "eyJ..x9"
}
Request structure
Field Name | Type | Description
---|---|---
page_token | STRING | Page token returned by previous call. This field is optional.
max_results | INT32 | The maximum number of entries to return in a single page. The system may return fewer than max_results entries in a response, even if more results are available. This field is optional. The default value is 25. The maximum value is 100. An error is returned if the value of max_results is greater than 100.
order_by | An array of STRING | A list of strings specifying the order of results, for example, ["name asc"]. This field is optional.
filter | STRING | Select a subset of results based on the specified criteria. The supported filters are: notebook='<path>' to select pipelines that reference the provided notebook path, and name LIKE '[pattern]' to select pipelines with a name matching pattern. Composite filters are not supported. This field is optional.
Response structure
Field Name | Type | Description
---|---|---
statuses | An array of PipelineStateInfo | The list of pipelines matching the request criteria.
next_page_token | STRING | If present, a token to fetch the next page of results.
prev_page_token | STRING | If present, a token to fetch the previous page of results.
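To page through all pipelines, follow next_page_token until it is no longer returned. The following is a minimal sketch, assuming the jq utility is installed:
token=""
while :; do
  if [ -n "$token" ]; then
    resp=$(curl --netrc -s -G https://<databricks-instance>/api/2.0/pipelines --data-urlencode "page_token=$token")
  else
    resp=$(curl --netrc -s -G https://<databricks-instance>/api/2.0/pipelines)
  fi
  echo "$resp" | jq -r '.statuses[]? | "\(.pipeline_id)\t\(.name)"'
  token=$(echo "$resp" | jq -r '.next_page_token // empty')
  if [ -z "$token" ]; then break; fi
done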
Data structures
AwsAttributes
Attributes set during cluster creation related to Amazon Web Services.
Field Name | Type | Description
---|---|---
first_on_demand | INT32 | The first first_on_demand nodes of the cluster will be placed on on-demand instances. If this value is greater than 0, the cluster driver node will be placed on an on-demand instance. If this value is greater than or equal to the current cluster size, all nodes will be placed on on-demand instances. If this value is less than the current cluster size, first_on_demand nodes will be placed on on-demand instances and the remainder will be placed on availability instances.
availability | AwsAvailability | Availability type used for all subsequent nodes past the first_on_demand ones. Note: If first_on_demand is zero, this availability type will be used for the entire cluster.
zone_id | STRING | Identifier for the availability zone (AZ) in which the cluster resides. By default, the setting has a value of auto, otherwise known as Auto-AZ. With Auto-AZ, Databricks selects the AZ based on available IPs in the workspace subnets and retries in other availability zones if AWS returns insufficient capacity errors. If you want, you can also specify an availability zone to use. This benefits accounts that have reserved instances in a specific AZ. Specify the AZ as a string (for example, "us-west-2a"). The list of available zones as well as the default value can be found by using the GET /api/2.0/clusters/list-zones call.
instance_profile_arn | STRING | Nodes for this cluster will only be placed on AWS instances with this instance profile. If omitted, nodes will be placed on instances without an instance profile. The instance profile must have previously been added to the Databricks environment by an account administrator. This feature may only be available to certain customer plans.
spot_bid_price_percent | INT32 | The max price for AWS spot instances, as a percentage of the corresponding instance type's on-demand price. For example, if this field is set to 50 and the cluster needs a new spot instance, the max price is half the on-demand price for that instance type. If not specified, the default value is 100.
ebs_volume_type | EbsVolumeType | The type of EBS volumes that will be launched with this cluster.
ebs_volume_count | INT32 | The number of volumes launched for each instance. You can choose up to 10 volumes. This feature is only enabled for supported node types. Legacy node types cannot specify custom EBS volumes. For node types with no instance store, at least one EBS volume needs to be specified; otherwise, cluster creation will fail. These EBS volumes will be mounted at /ebs0, /ebs1, and so on. If EBS volumes are attached, Databricks will configure Spark to use only the EBS volumes for scratch storage because heterogeneously sized scratch devices can lead to inefficient disk utilization. If no EBS volumes are attached, Databricks will configure Spark to use instance store volumes. If EBS volumes are specified, then the Spark configuration spark.local.dir will be overridden.
ebs_volume_size | INT32 | The size of each EBS volume (in GiB) launched for each instance. For general purpose SSD, this value must be within the range 100 - 4096. For throughput optimized HDD, this value must be within the range 500 - 4096. Custom EBS volumes cannot be specified for the legacy node types (memory-optimized and compute-optimized).
ebs_volume_iops | INT32 | The number of IOPS per EBS gp3 volume. This value must be between 3000 and 16000. The value of IOPS and throughput is calculated based on AWS documentation to match the maximum performance of a gp2 volume with the same volume size. For more information, see the EBS volume limit calculator.
ebs_volume_throughput | INT32 | The throughput per EBS gp3 volume, in MiB per second. This value must be between 125 and 1000.
If neither ebs_volume_iops nor ebs_volume_throughput is specified, the values are inferred from the disk size:

Disk size | IOPS | Throughput
---|---|---
Greater than 1000 | 3 times the disk size, up to 16000 | 250
Between 170 and 1000 | 3000 | 250
Below 170 | 3000 | 125
AwsAvailability
The set of AWS availability types supported when setting up nodes for a cluster.
Type | Description
---|---
SPOT | Use spot instances.
ON_DEMAND | Use on-demand instances.
SPOT_WITH_FALLBACK | Preferably use spot instances, but fall back to on-demand instances if spot instances cannot be acquired (for example, if AWS spot prices are too high).
ClusterLogConf
Path to cluster log.
Field Name | Type | Description
---|---|---
dbfs OR s3 | DbfsStorageInfo OR S3StorageInfo | DBFS location of cluster log. Destination must be provided. For example, { "dbfs" : { "destination" : "dbfs:/home/cluster_log" } } OR S3 location of cluster log. Destination and either region or warehouse must be provided. For example, { "s3": { "destination" : "s3://cluster_log_bucket/prefix", "region" : "us-west-2" } }
DbfsStorageInfo
DBFS storage information.
Field Name | Type | Description
---|---|---
destination | STRING | DBFS destination. Example: dbfs:/my/path
EbsVolumeType
Databricks supports gp2 and gp3 EBS volume types. Follow the instructions at Manage SSD storage to select gp2 or gp3 for your workspace.
Type | Description
---|---
GENERAL_PURPOSE_SSD | Provision extra storage using AWS EBS volumes.
THROUGHPUT_OPTIMIZED_HDD | Provision extra storage using AWS st1 volumes.
FileStorageInfo
File storage information.
Note
This location type is only available for clusters set up using Databricks Container Services.
Field Name | Type | Description
---|---|---
destination | STRING | File destination. Example: file:/my/file.sh
InitScriptInfo
Path to an init script.
For instructions on using init scripts with Databricks Container Services, see Use an init script.
Note
The file storage type (field name: file
) is only available for clusters set up using Databricks Container Services. See FileStorageInfo.
Important
Init scripts are not supported with Unity Catalog-enabled pipelines. See Limitations.
Field Name | Type | Description
---|---|---
workspace OR dbfs (deprecated) OR s3 | WorkspaceStorageInfo OR DbfsStorageInfo (deprecated) OR S3StorageInfo | Workspace location of init script. Destination must be provided. For example, { "workspace" : { "destination" : "/Users/someone@domain.com/init_script.sh" } } OR (Deprecated) DBFS location of init script. Destination must be provided. For example, { "dbfs" : { "destination" : "dbfs:/home/init_script" } } OR S3 location of init script. Destination and either region or warehouse must be provided. For example, { "s3": { "destination" : "s3://init_script_bucket/prefix", "region" : "us-west-2" } }
KeyValue
A key-value pair that specifies configuration parameters.
Field Name | Type | Description
---|---|---
key | STRING | The configuration property name.
value | STRING | The configuration property value.
NotebookLibrary
A specification for a notebook containing pipeline code.
Field Name | Type | Description
---|---|---
path | STRING | The absolute path to the notebook. This field is required.
PipelinesAutoScale
Attributes defining an autoscaling cluster.
Field Name | Type | Description
---|---|---
min_workers | INT32 | The minimum number of workers to which the cluster can scale down when underutilized. It is also the initial number of workers the cluster will have after creation.
max_workers | INT32 | The maximum number of workers to which the cluster can scale up when overloaded. max_workers must be strictly greater than min_workers.
mode | STRING | The autoscaling mode for the cluster: ENHANCED to use enhanced autoscaling, or LEGACY to use the cluster autoscaling functionality.
PipelineLibrary
A specification for pipeline dependencies.
Field Name | Type | Description
---|---|---
notebook | NotebookLibrary | The path to a notebook defining Delta Live Tables datasets. The path must be in the Databricks workspace, for example: { "notebook" : { "path" : "/my-pipeline-notebook-path" } }
PipelinesNewCluster
A pipeline cluster specification.
The Delta Live Tables system sets the following attributes. These attributes cannot be configured by users:
spark_version
Field Name | Type | Description
---|---|---
label | STRING | A label for the cluster specification, either default to configure the default cluster, or maintenance to configure the maintenance cluster. This field is optional. The default value is default.
spark_conf | KeyValue | An object containing a set of optional, user-specified Spark configuration key-value pairs. You can also pass in a string of extra JVM options to the driver and the executors via spark.driver.extraJavaOptions and spark.executor.extraJavaOptions respectively. Example Spark confs: {"spark.speculation": true, "spark.streaming.ui.retainedBatches": 5} or {"spark.driver.extraJavaOptions": "-verbose:gc -XX:+PrintGCDetails"}
aws_attributes | AwsAttributes | Attributes related to clusters running on Amazon Web Services. If not specified at cluster creation, a set of default values will be used.
node_type_id | STRING | This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads. A list of available node types can be retrieved by using the GET 2.0/clusters/list-node-types call.
driver_node_type_id | STRING | The node type of the Spark driver. This field is optional; if unset, the driver node type will be set as the same value as node_type_id defined above.
ssh_public_keys | An array of STRING | SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to login with the user name ubuntu on port 2200. Up to 10 keys can be specified.
custom_tags | KeyValue | An object containing a set of tags for cluster resources. Databricks tags all cluster resources with these tags in addition to default_tags. Note: Tags are not supported on legacy node types such as compute-optimized and memory-optimized, and Databricks allows at most 45 custom tags.
cluster_log_conf | ClusterLogConf | The configuration for delivering Spark logs to a long-term storage destination. Only one destination can be specified for one cluster. If this configuration is provided, the logs will be delivered to the destination every 5 mins. The destination of driver logs is <destination>/<cluster-ID>/driver, while the destination of executor logs is <destination>/<cluster-ID>/executor.
spark_env_vars | KeyValue | An object containing a set of optional, user-specified environment variable key-value pairs. Key-value pairs of the form (X,Y) are exported as is (that is, export X='Y') while launching the driver and workers. In order to specify an additional set of SPARK_DAEMON_JAVA_OPTS, we recommend appending them to $SPARK_DAEMON_JAVA_OPTS as shown in the following example. This ensures that all default Databricks managed environment variables are included as well. Example Spark environment variables: {"SPARK_WORKER_MEMORY": "28000m", "SPARK_LOCAL_DIRS": "/local_disk0"} or {"SPARK_DAEMON_JAVA_OPTS": "$SPARK_DAEMON_JAVA_OPTS -Dspark.shuffle.service.enabled=true"}
init_scripts | An array of InitScriptInfo | The configuration for storing init scripts. Any number of destinations can be specified. The scripts are executed sequentially in the order provided. If cluster_log_conf is specified, init script logs are sent to <destination>/<cluster-ID>/init_scripts.
instance_pool_id | STRING | The optional ID of the instance pool to which the cluster belongs. See Create a pool.
driver_instance_pool_id | STRING | The optional ID of the instance pool to use for the driver node. You must also specify instance_pool_id.
policy_id | STRING | A cluster policy ID.
num_workers OR autoscale | INT32 OR PipelinesAutoScale | If num_workers, number of worker nodes that this cluster should have. A cluster has one Spark driver and num_workers executors for a total of num_workers + 1 Spark nodes. When reading the properties of a cluster, this field reflects the desired number of workers rather than the actual number of workers. For instance, if a cluster is resized from 5 to 10 workers, this field is updated to reflect the target size of 10 workers, whereas the workers listed in executors gradually increase from 5 to 10 as the new nodes are provisioned. If autoscale, parameters needed to automatically scale clusters up and down based on load. This field is optional.
apply_policy_default_values | BOOLEAN | Whether to use policy default values for missing cluster attributes.
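For illustration, a clusters entry in the pipeline settings can combine several of these fields. The following is a minimal sketch; the node type and tag values are placeholders, not recommendations:
{
  "clusters": [
    {
      "label": "default",
      "node_type_id": "i3.xlarge",
      "aws_attributes": {
        "first_on_demand": 1,
        "availability": "SPOT_WITH_FALLBACK",
        "zone_id": "auto"
      },
      "custom_tags": {
        "team": "data-engineering"
      },
      "autoscale": {
        "min_workers": 1,
        "max_workers": 5,
        "mode": "ENHANCED"
      }
    }
  ]
}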
PipelineSettings
The settings for a pipeline deployment.
Field Name | Type | Description
---|---|---
id | STRING | The unique identifier for this pipeline. The identifier is created by the Delta Live Tables system, and must not be provided when creating a pipeline.
name | STRING | A user-friendly name for this pipeline. This field is optional. By default, the pipeline name must be unique. To use a duplicate name, set allow_duplicate_names to true in the pipeline configuration.
storage | STRING | A path to a DBFS directory for storing checkpoints and tables created by the pipeline. This field is optional. The system uses a default location if this field is empty.
configuration | A map of STRING:STRING | A list of key-value pairs to add to the Spark configuration of the cluster that will run the pipeline. This field is optional. Elements must be formatted as key:value pairs.
clusters | An array of PipelinesNewCluster | An array of specifications for the clusters to run the pipeline. This field is optional. If this is not specified, the system will select a default cluster configuration for the pipeline.
libraries | An array of PipelineLibrary | The notebooks containing the pipeline code and any dependencies required to run the pipeline.
target | STRING | A database name for persisting pipeline output data. See Publish data from Delta Live Tables pipelines to the Hive metastore for more information.
continuous | BOOLEAN | Whether this is a continuous pipeline. This field is optional. The default value is false.
development | BOOLEAN | Whether to run the pipeline in development mode. This field is optional. The default value is false.
photon | BOOLEAN | Whether Photon acceleration is enabled for this pipeline. This field is optional. The default value is false.
channel | STRING | The Delta Live Tables release channel specifying the runtime version to use for this pipeline. Supported values are preview, to test the pipeline with upcoming changes to the runtime, and current, to use the current runtime version. This field is optional. The default value is current.
edition | STRING | The Delta Live Tables product edition to run the pipeline, for example, core, pro, or advanced. This field is optional. The default value is advanced.
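Pulling the settings fields together, a pipeline-settings.json file might look like the following sketch; the name, paths, and target are placeholders, and the values mirror the examples earlier in this article:
{
  "name": "Wikipedia pipeline (SQL)",
  "storage": "/Users/username/data",
  "configuration": {
    "pipelines.numStreamRetryAttempts": "5"
  },
  "clusters": [
    {
      "label": "default",
      "autoscale": {
        "min_workers": 1,
        "max_workers": 5,
        "mode": "ENHANCED"
      }
    }
  ],
  "libraries": [
    {
      "notebook": {
        "path": "/Users/username/DLT Notebooks/Delta Live Tables quickstart (SQL)"
      }
    }
  ],
  "target": "wikipedia_quickstart_data",
  "continuous": false,
  "development": true,
  "photon": true,
  "channel": "CURRENT",
  "edition": "advanced"
}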
PipelineStateInfo
The state of a pipeline, the status of the most recent updates, and information about associated resources.
Field Name | Type | Description
---|---|---
state | STRING | The state of the pipeline, for example, IDLE or RUNNING.
pipeline_id | STRING | The unique identifier of the pipeline.
cluster_id | STRING | The unique identifier of the cluster running the pipeline.
name | STRING | The user-friendly name of the pipeline.
latest_updates | An array of UpdateStateInfo | Status of the most recent updates for the pipeline, ordered with the newest update first.
creator_user_name | STRING | The username of the pipeline creator.
run_as_user_name | STRING | The username that the pipeline runs as. This is a read only value derived from the pipeline owner.
S3StorageInfo
S3 storage information.
Field Name | Type | Description
---|---|---
destination | STRING | S3 destination. For example: s3://my-bucket/some-prefix. You must configure the cluster with an instance profile, and the instance profile must have write access to the destination. You cannot use AWS keys.
region | STRING | S3 region. For example: us-west-2.
endpoint | STRING | S3 warehouse. For example: https://s3-us-west-2.amazonaws.com. Either region or warehouse must be set.
enable_encryption | BOOL | (Optional) Enable server side encryption, false by default.
encryption_type | STRING | (Optional) The encryption type, it could be sse-s3 or sse-kms. It is used only when encryption is enabled and the default type is sse-s3.
kms_key | STRING | (Optional) KMS key used if encryption is enabled and encryption type is set to sse-kms.
canned_acl | STRING | (Optional) Set canned access control list. For example: bucket-owner-full-control.
UpdateStateInfo
The current state of a pipeline update.
Field Name | Type | Description
---|---|---
update_id | STRING | The unique identifier for this update.
state | STRING | The state of the update, for example, COMPLETED or FAILED.
creation_time | STRING | Timestamp when this update was created.
WorkspaceStorageInfo
Workspace storage information.
Field Name | Type | Description
---|---|---
destination | STRING | File destination. Example: /Users/someone@domain.com/init_script.sh