Examples

Amazon CloudTrail ETL

The following notebooks show how to transform your Amazon CloudTrail logs from JSON into Parquet for efficient ad hoc querying. See Real-time Streaming ETL with Structured Streaming for details.

ETL of Amazon CloudTrail Logs using Structured Streaming in Python

ETL of Amazon CloudTrail Logs using Structured Streaming in Scala
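
As a rough sketch of the kind of pipeline these notebooks build, the following Python reads CloudTrail JSON files as a stream, flattens the Records array, and appends the result as date-partitioned Parquet. It is not taken from the notebooks: the function names, the trimmed schema, the s3:// scheme, and the trigger and file-rate settings are all illustrative assumptions.

```python
def cloudtrail_glob(bucket: str, account_id: str) -> str:
    """Glob over CloudTrail's S3 layout: <region>/<yyyy>/<mm>/<dd>/<file>.json.gz.

    Hypothetical helper; assumes the trail was created without an extra key prefix.
    """
    return f"s3://{bucket}/AWSLogs/{account_id}/CloudTrail/*/*/*/*/*.json.gz"


def start_cloudtrail_etl(spark, input_glob, output_path, checkpoint_path):
    """Start a streaming query that rewrites CloudTrail JSON as Parquet.

    Hypothetical helper; `spark` is an existing SparkSession.
    """
    # Imported here so cloudtrail_glob stays usable without a Spark installation.
    from pyspark.sql import functions as F
    from pyspark.sql.types import ArrayType, StringType, StructField, StructType

    # Deliberately partial schema (assumption): real records carry many more fields.
    record = StructType([
        StructField("eventTime", StringType()),
        StructField("eventSource", StringType()),
        StructField("eventName", StringType()),
        StructField("awsRegion", StringType()),
        StructField("sourceIPAddress", StringType()),
    ])
    schema = StructType([StructField("Records", ArrayType(record))])

    # Streaming file sources need an explicit schema up front.
    raw = (
        spark.readStream
        .schema(schema)
        .option("multiLine", True)          # each file holds one JSON object
        .option("maxFilesPerTrigger", 100)  # assumed cap on files per micro-batch
        .json(input_glob)
    )

    # One output row per CloudTrail event, plus a date column to partition on.
    events = (
        raw
        .select(F.explode("Records").alias("record"))
        .select("record.*")
        .withColumn("timestamp", F.col("eventTime").cast("timestamp"))
        .withColumn("date", F.to_date("timestamp"))
    )

    # The checkpoint location lets the query resume where it left off.
    return (
        events.writeStream
        .format("parquet")
        .option("path", output_path)
        .option("checkpointLocation", checkpoint_path)
        .partitionBy("date")
        .trigger(processingTime="10 seconds")
        .start()
    )
```

With a SparkSession in hand, calling start_cloudtrail_etl(spark, cloudtrail_glob(bucket, account_id), output_path, checkpoint_path) returns a StreamingQuery. Partitioning by date is one reasonable choice for ad hoc queries that filter on time; the notebooks may make different choices.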

Election Tweets

This notebook shows you how to analyze tweets using Structured Streaming.

You can see the full video here.

Analyzing Election Tweets in Python

Stream-Stream Joins

These two notebooks show how to use stream-stream joins in Python and Scala.

Stream-Stream Joins in Python

Stream-Stream Joins in Scala
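
For orientation, here is a minimal Python sketch of an inner stream-stream join with watermarks and a time-range condition. It is not taken from the notebooks: the impression and click streams, their column names, the watermark delays, and both helper functions are assumptions made for illustration.

```python
def click_within(max_delay: str) -> str:
    """SQL join condition: same ad, click no later than max_delay after the impression.

    Hypothetical helper; the column names are assumed, not taken from the notebooks.
    """
    return (
        "clickAdId = impressionAdId AND "
        "clickTime >= impressionTime AND "
        f"clickTime <= impressionTime + interval {max_delay}"
    )


def join_clicks_to_impressions(impressions, clicks, max_delay="1 hour"):
    """Inner-join two streaming DataFrames on ad id within a bounded time window.

    Assumes `impressions` has columns (impressionAdId, impressionTime) and
    `clicks` has columns (clickAdId, clickTime), both times being timestamps.
    """
    # Imported here so click_within stays usable without a Spark installation.
    from pyspark.sql import functions as F

    # Watermarks bound how late events may arrive. Together with the time-range
    # condition, they let Spark discard old join state instead of keeping it forever.
    impressions_wm = impressions.withWatermark("impressionTime", "2 hours")
    clicks_wm = clicks.withWatermark("clickTime", "3 hours")

    return impressions_wm.join(clicks_wm, F.expr(click_within(max_delay)), "inner")
```

The result is itself a streaming DataFrame, so it still needs a writeStream sink and a checkpoint location before anything runs. For an inner join the watermarks and time bound are optional but keep the buffered state from growing without limit.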