To save cluster resources, you can terminate a cluster. A terminated cluster cannot run notebooks or jobs, but its configuration is stored so that it can be reused at a later time. You can manually terminate a cluster or configure the cluster to automatically terminate after a specified period of inactivity. Databricks records information whenever a cluster is terminated.
Databricks retains the configuration information for up to 70 interactive clusters terminated in the last 30 days and up to 30 job clusters recently terminated by the job scheduler. To keep an interactive cluster configuration even after it has been terminated for more than 30 days, an administrator can pin a cluster to the cluster list.
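The retention limits above can be pictured with a small sketch. This is only an illustration of the stated numbers, not how Databricks implements retention: the `TerminatedCluster` type and `retained_configs` function are hypothetical, and the sketch assumes that pinned clusters are kept regardless of age and do not count against the limit of 70.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class TerminatedCluster:
    """Hypothetical record of a terminated cluster (not a Databricks type)."""
    name: str
    kind: str  # "interactive" or "job"
    terminated_at: datetime
    pinned: bool = False


def retained_configs(clusters: List[TerminatedCluster], now: datetime) -> List[TerminatedCluster]:
    """Return the clusters whose configuration would still be kept.

    Assumption: pinned interactive clusters are always kept and are not
    counted toward the limit of 70.
    """
    newest_first = sorted(clusters, key=lambda c: c.terminated_at, reverse=True)
    pinned = [c for c in newest_first if c.kind == "interactive" and c.pinned]
    # Up to 70 interactive clusters terminated in the last 30 days.
    interactive = [
        c for c in newest_first
        if c.kind == "interactive"
        and not c.pinned
        and now - c.terminated_at <= timedelta(days=30)
    ][:70]
    # Up to 30 of the most recently terminated job clusters.
    jobs = [c for c in newest_first if c.kind == "job"][:30]
    return pinned + interactive + jobs
```

For example, an unpinned interactive cluster terminated 45 days ago is dropped by this sketch, while the same cluster with `pinned=True` is kept.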
You can manually terminate a cluster from the cluster detail page.
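Manual termination can also be scripted. The sketch below is not part of the original text: it assumes the Clusters REST API 2.0 `clusters/delete` endpoint (which terminates a cluster rather than permanently deleting it) and personal access token authentication. The host, token, and cluster ID are placeholders, and the helper name `build_terminate_request` is hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request


def build_terminate_request(host: str, token: str, cluster_id: str) -> urllib.request.Request:
    """Build (but do not send) a request that asks Databricks to terminate a cluster."""
    url = f"{host.rstrip('/')}/api/2.0/clusters/delete"
    body = json.dumps({"cluster_id": cluster_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; replace with your workspace URL, token, and cluster ID.
request = build_terminate_request(
    "https://example.cloud.databricks.com",
    "<personal-access-token>",
    "1234-567890-abcde123",
)
# To actually send it: urllib.request.urlopen(request)
```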
You can also configure a cluster to terminate automatically. During cluster creation, you can specify an inactivity period, in minutes, after which the cluster terminates. If the time elapsed since the last command ran on the cluster exceeds the specified inactivity period, Databricks automatically terminates the cluster.
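The inactivity rule amounts to a simple time comparison. The following is a minimal sketch of that comparison only, not the Databricks implementation; the function name `should_auto_terminate` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def should_auto_terminate(now: datetime, last_command_time: datetime, inactivity_minutes: int) -> bool:
    """Return True when the time since the last command exceeds the inactivity period."""
    idle_time = now - last_command_time
    return idle_time > timedelta(minutes=inactivity_minutes)


now = datetime(2024, 1, 1, 12, 0)
# Last command ran 150 minutes ago with a 120-minute period: terminate.
print(should_auto_terminate(now, datetime(2024, 1, 1, 9, 30), 120))  # True
# Last command ran 60 minutes ago with a 120-minute period: keep running.
print(should_auto_terminate(now, datetime(2024, 1, 1, 11, 0), 120))  # False
```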
A cluster is considered inactive when all commands on the cluster, including Spark jobs, Structured Streaming, and JDBC calls, have finished executing. Bash commands that you run after SSH-ing into the cluster do not count as activity.
- Clusters do not report activity resulting from the use of DStreams. This means that an autoterminating cluster may be terminated while it is running DStreams. Turn off auto termination for clusters running DStreams or consider using Structured Streaming.
- The auto termination feature monitors only Spark jobs, not user-defined local processes. Therefore, if all Spark jobs have completed, a cluster may be terminated even if local processes are running.
You configure automatic termination in the Auto Termination field in the Autopilot Options box on the cluster creation page.
The default value of the auto terminate setting depends on whether you choose to create a standard or high concurrency cluster:
- Standard clusters are configured to automatically terminate after 120 minutes.
- High concurrency clusters are configured to not terminate automatically.
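These defaults can be summarized as a small lookup. The sketch below is purely illustrative: the mode keys, the constant, and the function `default_auto_termination` are hypothetical names, and `None` is used here to mean "no automatic termination".

```python
from typing import Optional

# Default inactivity period, in minutes, by cluster mode.
# None means the cluster does not terminate automatically.
DEFAULT_AUTO_TERMINATION_MINUTES = {
    "standard": 120,
    "high_concurrency": None,
}


def default_auto_termination(cluster_mode: str) -> Optional[int]:
    """Return the default inactivity period for a cluster mode, or None if there is none."""
    if cluster_mode not in DEFAULT_AUTO_TERMINATION_MINUTES:
        raise ValueError(f"Unknown cluster mode: {cluster_mode!r}")
    return DEFAULT_AUTO_TERMINATION_MINUTES[cluster_mode]


print(default_auto_termination("standard"))          # 120
print(default_auto_termination("high_concurrency"))  # None
```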
You can opt out of auto termination by clearing the Auto Termination checkbox or by specifying an inactivity period of
Auto termination is best supported in the latest Spark versions. Older Spark versions have known limitations that can result in inaccurate reporting of cluster activity. For example, clusters running JDBC, R, or streaming commands may report a stale activity time, which can lead to premature cluster termination. Databricks strongly recommends that you upgrade to the most recent Spark version to benefit from bug fixes and improvements to auto termination.