Compute access mode limitations for Unity Catalog

Databricks recommends using Unity Catalog and shared access mode for most workloads. This article outlines the limitations of each access mode when used with Unity Catalog. For details on access modes, see Access modes.

Databricks recommends using compute policies to simplify configuration options for most users. See Create and manage compute policies.

Note

No-isolation shared is a legacy access mode that does not support Unity Catalog.

Important

Init scripts and libraries have different support across access modes and Databricks Runtime versions. See Compute compatibility with libraries and init scripts.

Single user access mode limitations on Unity Catalog

Single user access mode on Unity Catalog has the following limitations. These are in addition to the general limitations for all Unity Catalog access modes. See General limitations for Unity Catalog.

Fine-grained access control limitations for Unity Catalog single user access mode

  • Dynamic views are not supported.

  • To read from a view, you must have SELECT on all referenced tables and views.

  • You cannot access a table that has a row filter or column mask.

  • You cannot use a single user cluster to query tables created by a Unity Catalog-enabled Delta Live Tables pipeline, including streaming tables and materialized views created in Databricks SQL. To query tables created by a Delta Live Tables pipeline, you must use a shared cluster running Databricks Runtime 13.1 or above.
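The rule above combines an access mode requirement with a runtime version requirement. As a minimal sketch (not a Databricks API; the function name is hypothetical), a pre-flight check for it might look like:

```python
# Hypothetical helper encoding the rule above: tables created by a Unity
# Catalog-enabled Delta Live Tables pipeline can only be queried from a
# shared cluster on Databricks Runtime 13.1 or above.

def can_query_dlt_tables(access_mode: str, runtime_version: str) -> bool:
    """Return True if this compute can query DLT-created UC tables."""
    major, minor = (int(p) for p in runtime_version.split(".")[:2])
    return access_mode == "shared" and (major, minor) >= (13, 1)

print(can_query_dlt_tables("shared", "13.1"))       # True
print(can_query_dlt_tables("single_user", "14.0"))  # False
```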

Streaming limitations for Unity Catalog single user access mode

  • Asynchronous checkpointing is not supported in Databricks Runtime 11.3 LTS and below.

Shared access mode limitations on Unity Catalog

Shared access mode on Unity Catalog has the following limitations. These are in addition to the general limitations for all Unity Catalog access modes. See General limitations for Unity Catalog.

  • Databricks Runtime ML and Spark Machine Learning Library (MLlib) are not supported.

  • Spark-submit jobs are not supported.

  • When used with credential passthrough, Unity Catalog features are disabled.

  • Custom containers are not supported.

Language support for Unity Catalog shared access mode

  • R is not supported.

  • Scala is supported on Databricks Runtime 13.3 and above.

Spark API limitations for Unity Catalog shared access mode

  • RDD APIs are not supported.

  • DBUtils and other clients that read data directly from cloud storage are not supported.

  • Spark Context (sc), spark.sparkContext, and sqlContext are not supported for Scala in any Databricks Runtime and are not supported for Python in Databricks Runtime 14.0 and above.

    • Databricks recommends using the spark variable to interact with the SparkSession instance.

    • The following sc functions are also not supported: emptyRDD, range, init_batched_serializer, parallelize, pickleFile, textFile, wholeTextFiles, binaryFiles, binaryRecords, sequenceFile, newAPIHadoopFile, newAPIHadoopRDD, hadoopFile, hadoopRDD, union, runJob, setSystemProperty, uiWebUrl, stop, setJobGroup, setLocalProperty, getConf.
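Given the list above, one way to migrate is to scan notebook source for `sc.` calls before moving a workload to shared access mode. The following is an illustrative lint-style sketch only, not a Databricks tool, and the function name is an assumption:

```python
import re

# The SparkContext methods listed above as unsupported in shared access mode.
UNSUPPORTED_SC_CALLS = {
    "emptyRDD", "range", "init_batched_serializer", "parallelize",
    "pickleFile", "textFile", "wholeTextFiles", "binaryFiles",
    "binaryRecords", "sequenceFile", "newAPIHadoopFile", "newAPIHadoopRDD",
    "hadoopFile", "hadoopRDD", "union", "runJob", "setSystemProperty",
    "uiWebUrl", "stop", "setJobGroup", "setLocalProperty", "getConf",
}

def find_unsupported_sc_calls(source: str) -> list[str]:
    """Return the unsupported sc.<fn> calls that appear in source code."""
    hits = re.findall(r"\bsc\.(\w+)", source)
    return [fn for fn in hits if fn in UNSUPPORTED_SC_CALLS]

code = "rdd = sc.parallelize([1, 2, 3])\ndf = spark.range(3)"
print(find_unsupported_sc_calls(code))  # ['parallelize']
```

Note that the `spark.range(3)` line is not flagged: as recommended above, the `spark` SparkSession variable remains the supported entry point.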

UDF limitations for Unity Catalog shared access mode

Preview

Support for Scala UDFs on Unity Catalog-enabled clusters with shared access mode is in Public Preview.

User-defined functions (UDFs) have the following limitations with shared access mode:

  • Hive UDFs are not supported.

  • applyInPandas and mapInPandas are not supported.

  • Access to Unity Catalog volumes using Python or Scala UDFs is not supported.

  • In Databricks Runtime 14.2 and above, Scala scalar UDFs are supported. Other Scala UDFs and UDAFs are not supported.

  • In Databricks Runtime 13.2 and above, Python scalar UDFs and Pandas UDFs are supported. Other Python UDFs, including UDAFs, UDTFs, and Pandas on Spark are not supported.
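The two runtime thresholds above can be summarized as a lookup. This is a sketch only, not a Databricks API; the function and kind names are assumptions for illustration:

```python
# Encodes the shared-access-mode UDF support rules stated above:
# Python scalar and Pandas UDFs need DBR 13.2+, Scala scalar UDFs need
# DBR 14.2+, and everything else (Hive UDFs, UDAFs, UDTFs, applyInPandas,
# mapInPandas, Pandas on Spark) is unsupported.

def _ver(v: str) -> tuple[int, int]:
    major, minor = v.split(".")[:2]
    return int(major), int(minor)

def udf_supported(kind: str, runtime_version: str) -> bool:
    """kind: 'python_scalar', 'pandas', 'scala_scalar', or anything else."""
    v = _ver(runtime_version)
    if kind in ("python_scalar", "pandas"):
        return v >= (13, 2)
    if kind == "scala_scalar":
        return v >= (14, 2)
    return False

print(udf_supported("pandas", "13.2"))        # True
print(udf_supported("scala_scalar", "14.0"))  # False
print(udf_supported("udaf", "14.3"))          # False
```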

See User-defined functions (UDFs) in Unity Catalog.

Streaming limitations for Unity Catalog shared access mode

Note

Some of the Kafka options listed here have limited support even in otherwise supported configurations on Databricks. See Stream processing with Apache Kafka and Databricks.

  • For Scala, foreach and foreachBatch are not supported.

  • For Scala, from_avro requires Databricks Runtime 14.2 or above.

  • applyInPandasWithState is not supported.

  • Working with socket sources is not supported.

  • The sourceArchiveDir must be in the same external location as the source when you use option("cleanSource", "archive") with a data source managed by Unity Catalog.

  • For Kafka sources and sinks, the following options are unsupported:

    • kafka.sasl.client.callback.handler.class

    • kafka.sasl.login.callback.handler.class

    • kafka.sasl.login.class

    • kafka.partition.assignment.strategy

  • The following Kafka options are supported in Databricks Runtime 13.0 but unsupported in Databricks Runtime 12.2 LTS. You can only specify external locations managed by Unity Catalog for these options:

    • kafka.ssl.truststore.location

    • kafka.ssl.keystore.location

  • You cannot use instance profiles to configure access to external sources such as Kafka or Kinesis for streaming workloads in shared access mode.
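The unsupported Kafka options above can be caught before a stream starts. The following is an illustrative pre-flight check, not part of Spark or Databricks, and the option values shown are hypothetical:

```python
# Reject reader/writer options that the shared-access-mode limitations
# above list as unsupported for Kafka sources and sinks.

UNSUPPORTED_KAFKA_OPTIONS = {
    "kafka.sasl.client.callback.handler.class",
    "kafka.sasl.login.callback.handler.class",
    "kafka.sasl.login.class",
    "kafka.partition.assignment.strategy",
}

def check_kafka_options(options: dict[str, str]) -> list[str]:
    """Return the option keys that are unsupported in shared access mode."""
    return sorted(k for k in options if k in UNSUPPORTED_KAFKA_OPTIONS)

opts = {
    "kafka.bootstrap.servers": "broker:9092",       # hypothetical value
    "kafka.sasl.login.class": "com.example.Login",  # hypothetical value
}
print(check_kafka_options(opts))  # ['kafka.sasl.login.class']
```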

Network and file system access limitations for Unity Catalog shared access mode

  • Commands on cluster nodes run as a low-privilege user that is forbidden from accessing sensitive parts of the file system.

  • In Databricks Runtime 11.3 LTS and below, you can only create network connections to ports 80 and 443.

  • You cannot connect to the instance metadata service (IMDS), other EC2 instances, or any other services running in the Databricks VPC. This prevents access to any service that uses the IMDS, such as boto3 and the AWS CLI.

General limitations for Unity Catalog

The following limitations apply to all Unity Catalog-enabled access modes.

UDFs

Graviton instances do not support UDFs on Unity Catalog-enabled clusters. Additional limitations exist for shared access mode. See UDF limitations for Unity Catalog shared access mode.

Streaming limitations for Unity Catalog

  • Apache Spark continuous processing mode is not supported. See Continuous Processing in the Spark Structured Streaming Programming Guide.

  • StreamingQueryListener cannot use credentials or interact with objects managed by Unity Catalog.

See also Streaming limitations for Unity Catalog single user access mode and Streaming limitations for Unity Catalog shared access mode.

For more on streaming with Unity Catalog, see Using Unity Catalog with Structured Streaming.

Volumes limitations

You must use Unity Catalog-enabled compute to interact with Unity Catalog volumes. Volumes do not support all workloads; in particular, Scala UDFs cannot access volumes.

Support for accessing data in volumes using the /Volumes/catalog/schema/volume path is limited to the following:

  • Spark SQL

  • Spark DataFrames

  • dbutils.fs commands

Note

Volumes do not support dbutils.fs commands distributed to executors.
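The `/Volumes/catalog/schema/volume` path shape above has a fixed three-level prefix. As a sketch (not a Databricks API; the function name and example path are assumptions), splitting such a path into its parts might look like:

```python
# Split a Unity Catalog volume path of the form
# /Volumes/<catalog>/<schema>/<volume>/<relative path> into its parts.

def parse_volume_path(path: str) -> dict[str, str]:
    parts = path.strip("/").split("/")
    if len(parts) < 4 or parts[0] != "Volumes":
        raise ValueError(f"not a Unity Catalog volume path: {path}")
    return {
        "catalog": parts[1],
        "schema": parts[2],
        "volume": parts[3],
        "relative_path": "/".join(parts[4:]),
    }

info = parse_volume_path("/Volumes/main/default/landing/raw/data.csv")
print(info["catalog"], info["relative_path"])  # main raw/data.csv
```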

For more volumes limitations, see Limitations.