Skip to main content

StreamingQuery

A handle to a query that is executing continuously in the background as new data arrives. All methods are thread-safe.

Syntax

Python
# Returned by DataStreamWriter.start() or DataStreamWriter.toTable()
q = df.writeStream.format("console").start()

Properties

Property

Description

id

Returns the unique ID of this query that persists across restarts from checkpoint data.

runId

Returns the unique ID of this query that does not persist across restarts.

name

Returns the user-specified name of the query, or None if not specified.

isActive

Returns whether this streaming query is currently active.

status

Returns the current status of the query as a dict.

recentProgress

Returns an array of the most recent StreamingQueryProgress updates for this query.

lastProgress

Returns the most recent StreamingQueryProgress update, or None if there have been no updates.

Methods

Method

Description

awaitTermination(timeout)

Waits for the termination of this query, either by stop() or by an exception.

processAllAvailable()

Blocks until all available data in the source has been processed and committed to the sink. Intended for testing.

stop()

Stops this streaming query.

explain(extended)

Prints the (logical and physical) plans to the console for debugging.

exception()

Returns the StreamingQueryException if the query terminated with an exception, or None.

Examples

Python
sdf = spark.readStream.format("rate").load()
sq = sdf.writeStream.format('memory').queryName('this_query').start()
sq.isActive
# True
sq.name
# 'this_query'
sq.awaitTermination(5)
# False
sq.stop()
sq.isActive
# False