Example Notebooks

Amazon CloudTrail ETL

The following notebooks show how to transform Amazon CloudTrail logs from JSON into Parquet for efficient ad hoc querying. For details, see Real-time Streaming ETL with Structured Streaming.

ETL of Amazon CloudTrail Logs using Structured Streaming in Python

ETL of Amazon CloudTrail Logs using Structured Streaming in Scala

Stream-Stream Joins

These two notebooks show how to use stream-stream joins in Python and Scala.

Stream-Stream Joins in Python

Stream-Stream Joins in Scala