Learn how to use the horovod.spark package to perform distributed training of machine learning models.
Databricks supports the horovod.spark package, which provides an estimator API that you can use in ML pipelines with Keras and PyTorch. For details, see Horovod on Spark, which includes a section on Horovod on Databricks.
Databricks installs the horovod package with dependencies. If you upgrade or downgrade these dependencies, there might be compatibility issues.
When you use horovod.spark with custom callbacks in Keras, you must save models in the TensorFlow SavedModel format.
With TensorFlow 2.x, use the .tf suffix in the file name.
With TensorFlow 1.x, set the option
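For the TensorFlow 2.x case, the .tf suffix rule can be sketched as follows. This is a minimal illustration, not a Databricks-provided utility: with_tf_suffix and make_checkpoint_callback are hypothetical helper names, and the sketch assumes a TensorFlow 2.x release whose bundled Keras writes the SavedModel format when the checkpoint path ends in .tf (newer Keras releases may reject that suffix).

```python
import os


def with_tf_suffix(path: str) -> str:
    """Return `path` with its extension replaced by .tf (hypothetical helper)."""
    root, _ext = os.path.splitext(path)
    return root + ".tf"


def make_checkpoint_callback(path: str):
    """Build a Keras checkpoint callback that writes to a .tf path.

    Assumption: under TensorFlow 2.x, a path ending in .tf makes Keras
    save in the SavedModel format rather than HDF5.
    """
    # Imported lazily so with_tf_suffix works without TensorFlow installed.
    import tensorflow as tf

    return tf.keras.callbacks.ModelCheckpoint(filepath=with_tf_suffix(path))
```

A callback built this way could then be passed in the list of custom callbacks used for training.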
Here is a basic example to run a distributed training function using
import horovod.tensorflow as hvd
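The import above is typically placed inside the training function, so that each Spark task loads Horovod itself. A fuller sketch might look like the following. It assumes a cluster where horovod and PySpark are installed (for example, Databricks Runtime ML). The function names train and launch, the returned rank and size, and num_proc=2 are illustrative choices, not part of the original example.

```python
def train():
    # Import inside the function: it is serialized and run on each Spark task.
    import horovod.tensorflow as hvd

    # Set up Horovod's communication between the worker processes.
    hvd.init()

    # A real training loop would go here; this sketch only reports
    # which worker it is and how many workers are taking part.
    return hvd.rank(), hvd.size()


def launch(num_proc: int = 2):
    # Imported lazily so this module can be loaded without Horovod installed.
    import horovod.spark

    # Runs train() in num_proc parallel Spark tasks and collects
    # one return value per rank.
    return horovod.spark.run(train, num_proc=num_proc)
```

Calling launch() from a notebook attached to such a cluster would start the distributed run; wrapping the call in a function keeps the module importable on machines without Horovod.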