Third-Party Machine Learning Integrations
We recommend Spark MLlib as the first library to try because it integrates directly with other Spark components such as Spark SQL, Spark Streaming, and DataFrames. Although Databricks comes with Spark MLlib pre-installed, data scientists may also want to use third-party machine learning libraries and frameworks in their data pipelines.
This section provides instructions for installing, configuring, and running some of these third-party ML tools in Databricks.
Databricks provides these examples on a best-effort basis. Because they rely on external libraries, they may change in ways that Databricks cannot predict. If you need additional support for a third-party tool, refer to the documentation, mailing lists, or other support options provided directly by the library vendor or maintainer.
H2O Sparkling Water
This notebook is too large to display inline. Please click this link to view and import this notebook!