toTable (DataStreamWriter)

Starts the execution of the streaming query, continually outputting results to the given table as new data arrives. Returns a StreamingQuery object.

Syntax

toTable(tableName, format=None, outputMode=None, partitionBy=None, queryName=None, **options)

Parameters

tableName : str
    Name of the table.

format : str, optional
    The format used to save.

outputMode : str, optional
    How data is written to the sink: append, complete, or update.

partitionBy : str or list, optional
    Names of partitioning columns. Ignored for v2 tables that already exist.

queryName : str, optional
    Unique name for the query.

**options
    All other string options. Provide a checkpointLocation for most streams.

Returns

StreamingQuery

Notes

For v1 tables, partitionBy columns are always respected. For v2 tables, partitionBy is only respected if the table does not yet exist.

Examples

Save a data stream to a table:

Python
import tempfile
import time

_ = spark.sql("DROP TABLE IF EXISTS my_table2")
with tempfile.TemporaryDirectory(prefix="toTable") as d:
    # Start the stream; a checkpointLocation is required to track progress.
    q = (
        spark.readStream.format("rate")
        .option("rowsPerSecond", 10)
        .load()
        .writeStream.toTable(
            "my_table2",
            queryName="that_query",
            outputMode="append",
            format="parquet",
            checkpointLocation=d,
        )
    )
    time.sleep(3)
    q.stop()
spark.read.table("my_table2").show()
_ = spark.sql("DROP TABLE my_table2")