If you’ve worked through each section of this guide, you are well on your way to building your own Apache Spark applications on Databricks.
Your next step should be Spark: The Definitive Guide. Written by the creator of the open-source cluster-computing framework, this comprehensive guide teaches you how to use, deploy, and maintain Apache Spark. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. The notebooks from the guide are available on GitHub and the datasets are available in the DBFS folder
The Databricks documentation and the Databricks website offer many more resources, which we recommend exploring at your leisure:
- Databricks offers many training options, both self-paced and instructor-led.
- Apache Spark is constantly evolving. For the newest developments, see the Databricks Engineering blog.
- Social media: to stay up to date on the latest improvements and tips from the team that created Apache Spark, follow Databricks on Twitter and Facebook.
- The Databricks forum is a good place to ask questions about Apache Spark and the Databricks product. Anyone can sign up and participate in discussions there.
- Subscribe to the Databricks newsletter.
For in-depth documentation on various Apache Spark APIs, see: