CACHE TABLE
Applies to: Databricks Runtime
Caches the contents of a table or the output of a query with the given storage level in the Apache Spark cache. If a query is cached, a temporary view is created for that query. Caching reduces scanning of the original files in future queries.
Syntax
CACHE [ LAZY ] TABLE table_name
[ OPTIONS ( 'storageLevel' [ = ] value ) ] [ [ AS ] query ]
See Disk cache vs. Spark cache for the differences between disk caching and the Apache Spark cache.
Parameters
-
LAZY
Only cache the table when it is first used, instead of immediately.
-
table_name
Identifies the Delta table or view to cache. The name must not include a temporal specification or options specification. If the table cannot be found, Databricks raises a TABLE_OR_VIEW_NOT_FOUND error.
-
OPTIONS ( 'storageLevel' [ = ] value )
An OPTIONS clause with a storageLevel key and value pair. A warning is issued when a key other than storageLevel is used. The valid options for storageLevel are:
- NONE
- DISK_ONLY
- DISK_ONLY_2
- MEMORY_ONLY
- MEMORY_ONLY_2
- MEMORY_ONLY_SER
- MEMORY_ONLY_SER_2
- MEMORY_AND_DISK
- MEMORY_AND_DISK_2
- MEMORY_AND_DISK_SER
- MEMORY_AND_DISK_SER_2
- OFF_HEAP
An exception is thrown when an invalid value is set for storageLevel. If storageLevel is not explicitly set using the OPTIONS clause, the default storageLevel is MEMORY_AND_DISK.
-
query
A query that produces the rows to be cached. It can be in one of the following formats:
- A SELECT statement
- A TABLE statement
- A FROM statement
Examples
> CACHE TABLE testCache OPTIONS ('storageLevel' 'DISK_ONLY') SELECT * FROM testData;
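The following sketch illustrates the remaining syntax variants: LAZY, the optional = in OPTIONS, the default storage level, and the TABLE and FROM query forms. The names testCacheMem, testCacheCopy, and testCacheFrom and the columns id and label are hypothetical and chosen for illustration only. The sketch also assumes testData can be stood in for by a small temporary view.

```sql
-- Hypothetical sample data so the statements below have something to cache.
CREATE OR REPLACE TEMPORARY VIEW testData AS
  SELECT * FROM VALUES (1, 'a'), (2, 'b') AS t(id, label);

-- LAZY: testData is cached when it is first used, not when this statement runs.
CACHE LAZY TABLE testData;

-- SELECT statement with an explicit storage level, using the optional '='.
CACHE TABLE testCacheMem OPTIONS ('storageLevel' = 'MEMORY_ONLY')
  AS SELECT * FROM testData;

-- TABLE statement with no OPTIONS clause: storageLevel defaults to MEMORY_AND_DISK.
CACHE TABLE testCacheCopy AS TABLE testData;

-- FROM statement.
CACHE TABLE testCacheFrom AS FROM testData SELECT id;
```

Because the last three statements each cache a query, each one creates a temporary view (testCacheMem, testCacheCopy, testCacheFrom) that later queries can read from.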