Expectations
This page contains Python reference documentation for pipeline expectations.
Expectation decorators declare data quality constraints on materialized views, streaming tables, or temporary views created in a pipeline.
The dp module includes six decorators that control expectation behavior. The following table describes the dimensions on which these decorators differ:
| Behavior | Options |
|---|---|
| Action on violation | Warn and retain the invalid record (expect), drop the invalid record (expect_or_drop), or fail the update (expect_or_fail). |
| Number of expectations | A single expectation or multiple expectations. |
You can add multiple expectation decorators to a dataset, giving you flexibility in how strictly each data quality constraint is enforced.
When you use the expect_all decorators, each expectation has its own description and reports its own granular metrics.
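For example, the following sketch (the source table and column names are hypothetical, and it assumes the pipeline-provided spark session) combines a warn-style and a drop-style expectation on one materialized view:

```python
from pyspark import pipelines as dp

@dp.materialized_view()
@dp.expect("valid_amount", "amount >= 0")                     # warn: violating records are kept and reported
@dp.expect_or_drop("valid_order_id", "order_id IS NOT NULL")  # drop: violating records are removed
def orders_clean():
    # Hypothetical upstream table; replace with your own query.
    return spark.read.table("orders_raw")
```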
Syntax
Expectation decorators come after a @dp.table(), @dp.materialized_view(), or @dp.temporary_view() decorator and before a dataset definition function, as in the following example:
from pyspark import pipelines as dp
@dp.table()
@dp.expect(description, constraint)
@dp.expect_or_drop(description, constraint)
@dp.expect_or_fail(description, constraint)
@dp.expect_all({description: constraint, ...})
@dp.expect_all_or_drop({description: constraint, ...})
@dp.expect_all_or_fail({description: constraint, ...})
def <function-name>():
return (<query>)
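As a concrete, hedged instance of this pattern (table and column names are hypothetical), the following sketch fails the pipeline update if any record violates the constraint:

```python
from pyspark import pipelines as dp

@dp.table()
@dp.expect_or_fail("valid_event_time", "event_ts IS NOT NULL")  # fail: the update stops on a violating record
def events():
    # Hypothetical source table; replace with your own query.
    return spark.read.table("raw_events")
```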
Parameters
| Parameter | Type | Description |
|---|---|---|
| description | str | Required. A description that identifies the constraint. Constraint descriptions must be unique for each dataset. |
| constraint | str | Required. The constraint clause is a SQL conditional statement that must evaluate to true or false. |
The expect_all decorators require descriptions and constraints to be passed as a Python dict, with each description as a key and its constraint as the value.
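For example, a minimal sketch of expect_all_or_drop with two constraints (table and column names are hypothetical) could look like this:

```python
from pyspark import pipelines as dp

@dp.materialized_view()
@dp.expect_all_or_drop({
    "valid_customer_id": "customer_id IS NOT NULL",  # each entry reports its own metric
    "valid_quantity": "quantity > 0",
})
def sales_clean():
    # Hypothetical source table; replace with your own query.
    return spark.read.table("sales_raw")
```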