start (DataStreamWriter)

Streams the contents of the DataFrame to a data source and returns a StreamingQuery object.

Syntax

start(path=None, format=None, outputMode=None, partitionBy=None, queryName=None, **options)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| path | str, optional | Path in a Hadoop-supported file system. |
| format | str, optional | The format used to save. |
| outputMode | str, optional | How data is written to the sink: append, complete, or update. |
| partitionBy | str or list, optional | Names of partitioning columns. |
| queryName | str, optional | Unique name for the query. |
| **options | - | All other string options. Provide checkpointLocation for most streams; not required for a memory stream. |

Returns

StreamingQuery

Examples

Create a streaming DataFrame from the built-in rate source:

Python
df = spark.readStream.format("rate").load()

Basic example:

Python
q = df.writeStream.format('memory').queryName('this_query').start()
q.isActive
# True
q.name
# 'this_query'
q.stop()
q.isActive
# False

With a trigger and additional parameters:

Python
q = df.writeStream.trigger(processingTime='5 seconds').start(
    queryName='that_query', outputMode="append", format='memory')
q.name
# 'that_query'
q.isActive
# True
q.stop()