Skip to main content

Use ALTER statements with pipeline datasets

Preview

This feature is in Public Preview.

Lakeflow Spark Declarative Pipelines (SDP) defines pipelines in source code that is specific to SDP. You can edit pipeline source in either SQL or Python, for example, in the Lakeflow Pipelines Editor.

Databricks also provides a SQL environment called Databricks SQL. You can create materialized views and streaming tables with Databricks SQL using pipeline functionality outside of SDP (see Use pipelines in Databricks SQL). Typically, Databricks SQL is not used with Lakeflow Spark Declarative Pipelines.

However you can use ALTER SQL statements in Databricks SQL to modify the properties of a dataset created with either SDP or Databricks SQL. Use these SQL statements from any Databricks SQL environment, whether you are modifying SDP datasets, or Databricks SQL pipeline datasets.

note

You can't modify the schedule or trigger of a dataset defined in SDP with an ALTER statement.

Limitation: Pipeline updates and changes made with ALTER

There are cases where ALTER statements conflict with the definition of the pipeline-created datasets. The SQL that defines a table or view in a pipeline is re-run on each update. This can undo changes you make with an ALTER statement.

For example, if you have a SQL statement that defines a materialized view, like the following:

SQL
CREATE OR REPLACE MATERIALIZED VIEW masked_view (
id int,
name string,
region string,
ssn string MASK catalog.schema.ssn_mask_fn
)
WITH ROW FILTER catalog.schema.us_filter_fn ON (region)
AS SELECT id, name, region, ssn
FROM employees;

Then you try to remove the mask from the ssn column using an ALTER statement, like this:

SQL
ALTER MATERIALIZED VIEW masked_view ALTER COLUMN ssn DROP MASK;

The mask is removed, but the next time the materialized view is updated the SQL definition adds it back.

To safely remove the mask, you must edit the SQL definition to remove the mask and then run the ALTER command to DROP the mask.

note

To edit the definition of a pipeline defined in SDP, edit your pipeline source using the pipeline editor. To edit the definition of a pipeline defined in Databricks SQL, run the modified SQL statement in any Databricks SQL environment.

Additional resources