Limitations with Databricks Connect for Python
Note
This article covers Databricks Connect for Databricks Runtime 13.3 LTS and above.
This article lists limitations with Databricks Connect for Python. Databricks Connect enables you to connect popular IDEs, notebook servers, and custom applications to Databricks clusters. See What is Databricks Connect? For the Scala version of this article, see Limitations with Databricks Connect for Scala.
Not available on Databricks Connect for Databricks Runtime 13.3 LTS and below:
Streaming
foreachBatch
Creating DataFrames larger than 128 MB
Long queries over 3600 seconds
Not available:
Dataset API
Dataset typed APIs (such as reduce() and flatMap())
Databricks Utilities: credentials, library, notebook workflow, widgets
SparkContext
RDDs
MLflow model inference: pyfunc.spark_udf() API
Mosaic geospatial
CREATE TABLE <table-name> AS SELECT (instead, use spark.sql("SELECT ...").write.saveAsTable("table"))
applyInPandas() and cogroup() with shared clusters
Changing the log4j log level through SparkContext
Distributed ML training
Synchronizing the local development environment with the remote cluster
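The CREATE TABLE <table-name> AS SELECT workaround listed above can be wrapped in a small helper. The following is a minimal sketch, not an official API: the function name save_query_as_table and its parameters are hypothetical, and it assumes spark is a Spark session such as the one returned by DatabricksSession.builder.getOrCreate() in Databricks Connect.

```python
def save_query_as_table(spark, select_sql, table_name, mode="errorifexists"):
    """Run a SELECT statement and persist its result as a table.

    Hypothetical helper for illustration. `spark` is assumed to be a Spark
    session (for example, one created through Databricks Connect).
    The default mode "errorifexists" fails when the table already exists,
    which is the closest match to CREATE TABLE ... AS SELECT.
    """
    # Build a DataFrame from the SELECT part only (no CREATE TABLE clause).
    result = spark.sql(select_sql)
    # Write the DataFrame out under the given table name.
    result.write.mode(mode).saveAsTable(table_name)
    return result
```

With a session in hand, a call might look like save_query_as_table(spark, "SELECT ...", "table"). Passing mode="overwrite" instead replaces an existing table, so choose the mode deliberately.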