Prepare Storage for Data Loading and Model Checkpointing

Data loading and model checkpointing are crucial to deep learning workloads, especially distributed ones.

In Databricks Runtime 6.0 and above, Databricks provides a high-performance FUSE mount.

In Databricks Runtime 5.4 and 5.5, Databricks provides dbfs:/ml, a special folder that offers high-performance I/O for deep learning workloads and maps to file:/dbfs/ml on driver and worker nodes. Databricks recommends using Databricks Runtime 5.4 or above and saving data under /dbfs/ml. This FUSE mount also lifts the limitation of the local file I/O API in Databricks Runtime, which otherwise supports only files smaller than 2 GB.
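Because /dbfs/ml is exposed as a local path on driver and worker nodes, checkpoints can be written with ordinary file APIs. The following is a minimal sketch of that idea, not an official Databricks API: the directory name my_experiment/checkpoints, the helper names save_checkpoint and load_latest_checkpoint, and the use of pickle are all assumptions made for illustration. In practice you would typically use the save and load functions of your deep learning framework with the same path.

```python
import os
import pickle

# Hypothetical location chosen for illustration; any directory under
# /dbfs/ml is reached through the same FUSE mount.
CHECKPOINT_DIR = "/dbfs/ml/my_experiment/checkpoints"


def save_checkpoint(state, step, checkpoint_dir=CHECKPOINT_DIR):
    """Pickle `state` to <checkpoint_dir>/ckpt-<step>.pkl and return the path."""
    os.makedirs(checkpoint_dir, exist_ok=True)
    path = os.path.join(checkpoint_dir, f"ckpt-{step}.pkl")
    with open(path, "wb") as f:
        pickle.dump(state, f)
    return path


def load_latest_checkpoint(checkpoint_dir=CHECKPOINT_DIR):
    """Return (step, state) for the highest step found, or None if there is none."""
    if not os.path.isdir(checkpoint_dir):
        return None
    steps = []
    for name in os.listdir(checkpoint_dir):
        if name.startswith("ckpt-") and name.endswith(".pkl"):
            # Extract the step number between the prefix and the extension.
            steps.append(int(name[len("ckpt-"):-len(".pkl")]))
    if not steps:
        return None
    latest = max(steps)
    with open(os.path.join(checkpoint_dir, f"ckpt-{latest}.pkl"), "rb") as f:
        return latest, pickle.load(f)
```

A training loop could call save_checkpoint(state, step) every few hundred steps and call load_latest_checkpoint() at startup to resume after a restart. Passing a different checkpoint_dir lets the same helpers run against any local directory for testing.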

If you use a Databricks Runtime version lower than 5.4 or you want to use your own storage, Databricks recommends that you use the Goofys client, a high-performance, POSIX-ish Amazon S3 file system. Databricks hosts a customized Goofys binary that contains optimizations specific to deep learning workloads. To mount an S3 bucket as a file system with Goofys, you can use an init script. The following notebook explains how to generate an init script and configure a cluster to run the script.

goofys init script notebook