DataStreamWriter
Interface used to write a streaming DataFrame to external storage systems (for example, file systems and key-value stores). Use df.writeStream to access this.
Syntax
# Access through DataFrame
df.writeStream
Methods
| Method | Description |
|---|---|
| outputMode | Specifies how data of a streaming DataFrame is written to the sink. Options are append, complete, and update. |
| format | Specifies the output data source format. |
| option | Adds an output option for the underlying data source. |
| options | Adds multiple output options for the underlying data source. |
| partitionBy | Partitions the output by the given columns on the file system. |
| clusterBy | Clusters the output by the given columns. |
| queryName | Specifies the name of the streaming query. |
| trigger | Sets the trigger for the streaming query execution. |
| foreach | Sets the output of the streaming query to be processed by the given function or object. |
| foreachBatch | Sets the output of each microbatch to be processed by the given function. |
| start | Starts the execution of the streaming query and returns a StreamingQuery object. |
| table | Alias for toTable. |
| toTable | Starts the execution of the streaming query, continually outputting results to the given table. |
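The configuration methods return the writer itself, so they are usually chained before a terminal call such as start. The following is a minimal sketch of that pattern, not a recommended configuration: the helper name start_console_query, the query name, the checkpoint path, and the 5-second processing interval are all hypothetical values chosen for illustration.

```python
def start_console_query(df, name="demo_query",
                        checkpoint_dir="/tmp/checkpoints/demo_query"):
    """Configure and start a streaming write to the console sink.

    `df` is expected to be a streaming DataFrame. The default `name` and
    `checkpoint_dir` are illustrative assumptions, not required values.
    """
    return (
        df.writeStream
        .format("console")                             # output data source format
        .outputMode("append")                          # emit only newly added rows
        .option("checkpointLocation", checkpoint_dir)  # where progress is tracked
        .queryName(name)                               # name of the streaming query
        .trigger(processingTime="5 seconds")           # run a micro-batch every 5 seconds
        .start()                                       # returns a StreamingQuery
    )
```

Calling start_console_query(df) on a streaming DataFrame returns the query handle, which can later be stopped with its stop() method.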
Examples
Load a rate stream, apply a transformation, write to the console, and stop after 3 seconds.
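import time
# Read from the built-in rate source, which generates rows continuously.
df = spark.readStream.format("rate").load()
# Keep a single derived column.
df = df.selectExpr("value % 3 as v")
# start() launches the query and returns a StreamingQuery handle.
q = df.writeStream.format("console").start()
# Let the query run for about 3 seconds, then stop it.
time.sleep(3)
q.stop()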
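Process each microbatch with a function by using foreachBatch. This is a minimal sketch: the helper names summarize_batch, log_batch, and start_batch_logger are hypothetical, and the function passed to foreachBatch is assumed to receive the microbatch as a regular DataFrame together with its batch ID.

```python
def summarize_batch(batch_df, batch_id):
    """Build a one-line summary of a microbatch (hypothetical helper)."""
    return f"batch {batch_id}: {batch_df.count()} rows"


def log_batch(batch_df, batch_id):
    """Function handed to foreachBatch; called once per microbatch."""
    print(summarize_batch(batch_df, batch_id))


def start_batch_logger(df):
    """Start a streaming query that logs every microbatch of `df`."""
    return df.writeStream.foreachBatch(log_batch).start()
```

With the rate stream from the previous example, q = start_batch_logger(df) starts the query, and q.stop() ends it.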