Launching Distributed Model Training

The notebook below is the fourth of six notebooks demonstrating how to perform distributed training with TensorFlowOnSpark on the MNIST dataset. Download the full set of notebooks or see the TensorFlowOnSpark guide for more information. The notebook uses the helpers defined in the previous notebooks to download data from S3 and build the model graph, then calls TensorFlowOnSpark APIs to launch distributed model training on the Spark workers.
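As a rough illustration of the launch step, the sketch below shows one way such a call could look. It is not the notebook's actual code: the option names in `build_args`, the `launch_training` wrapper, and the choice of `InputMode.TENSORFLOW` (each worker reads its own data, as when downloading from S3) are assumptions made for this example. `map_fun` stands for the per-worker training function, which TensorFlowOnSpark invokes as `map_fun(args, ctx)`.

```python
import argparse


def build_args(argv=None):
    """Parse the training options used in this sketch (names are hypothetical)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--cluster_size", type=int, default=4,
                        help="total number of Spark executors in the TF cluster")
    parser.add_argument("--num_ps", type=int, default=1,
                        help="number of executors acting as parameter servers")
    parser.add_argument("--tensorboard", action="store_true",
                        help="start TensorBoard alongside training")
    return parser.parse_args(argv)


def launch_training(sc, map_fun, args):
    """Start a TensorFlow cluster on the Spark executors and wait for it to finish.

    sc      -- an active SparkContext
    map_fun -- per-worker training function with signature map_fun(args, ctx)
    args    -- namespace from build_args(); passed through to map_fun
    """
    # Imported lazily so this module loads even where the library is absent.
    from tensorflowonspark import TFCluster

    cluster = TFCluster.run(
        sc,
        map_fun,
        args,
        args.cluster_size,
        args.num_ps,
        tensorboard=args.tensorboard,
        # Assumption: workers load their own input rather than being fed an RDD.
        input_mode=TFCluster.InputMode.TENSORFLOW,
    )
    # In TENSORFLOW input mode, shutdown() waits for the workers to complete.
    cluster.shutdown()
```

On a live cluster this might be invoked as `launch_training(sc, map_fun, build_args([]))`, with `cluster_size` matched to the number of executors available.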

Next steps

  • See Model Evaluation to learn how to run model evaluation concurrently with model training.
  • See TensorBoard to learn how to visualize model training and validation metrics (for example, loss and accuracy) in real time with TensorBoard.