Databricks Runtime 17.0 (Beta)

Beta

Databricks Runtime 17.0 is in Beta. The contents of the supported environments might change during the Beta. Changes can include the list of packages or versions of installed packages.

The following release notes provide information about Databricks Runtime 17.0 (Beta), powered by Apache Spark 4.0.0.

Databricks released this beta version in May 2025.

Tip

To see release notes for Databricks Runtime versions that have reached end-of-support (EoS), see End-of-support Databricks Runtime release notes. The EoS Databricks Runtime versions have been retired and might not be updated.

DBR 17.0 (Beta) new and updated features

SQL procedure support

SQL scripts can now be encapsulated in a procedure stored as a reusable asset in Unity Catalog. You can create a procedure using the CREATE PROCEDURE command, and then call it using the CALL command.

Set a default collation for SQL Functions

The new DEFAULT COLLATION clause in the CREATE FUNCTION command defines the default collation used for STRING parameters, the return type, and STRING literals in the function body.

Recursive common table expressions (rCTE) support

Databricks now supports navigation of hierarchical data using recursive common table expressions (rCTEs). Use a self-referencing CTE with UNION ALL to follow the recursive relationship.
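The pattern described above can be tried locally. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in, because SQLite also accepts WITH RECURSIVE with UNION ALL. The employees table and its columns are hypothetical, and Databricks syntax or limits may differ in detail.

```python
import sqlite3

# In-memory SQLite database standing in for a real table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, manager_id INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ada", None), (2, "Ben", 1), (3, "Cy", 2), (4, "Di", 1)],
)

# Anchor member: the root row (no manager).
# Recursive member: joins employees back to the CTE itself via UNION ALL.
query = """
WITH RECURSIVE org(id, name, depth) AS (
    SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
    UNION ALL
    SELECT e.id, e.name, org.depth + 1
    FROM employees AS e
    JOIN org ON e.manager_id = org.id
)
SELECT id, name, depth FROM org ORDER BY depth, id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 'Ada', 0), (2, 'Ben', 1), (4, 'Di', 1), (3, 'Cy', 2)]
```

Each row carries the depth at which it was reached, which is a common way to expose the hierarchy level when walking a parent-child relationship.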

ANSI SQL enabled by default

The default SQL dialect is now ANSI SQL. ANSI SQL is a well-established standard and will help protect users from unexpected or incorrect results. Read the Databricks ANSI enablement guide for more information.

PySpark and Spark Connect now support the DataFrame df.mergeInto API

PySpark and Spark Connect now support the df.mergeInto API, which was previously only available for Scala.
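As a rough illustration, an upsert with this API might look like the sketch below. It assumes the PySpark builder exposes whenMatched, updateAll, whenNotMatched, insertAll, and merge in the same way as the Scala API. The helper name upsert_into and its parameters are hypothetical, and the match condition is supplied by the caller.

```python
def upsert_into(source_df, target_table, condition):
    """Upsert the rows of source_df into target_table.

    source_df:    a pyspark.sql.DataFrame holding the incoming rows
    target_table: name of the target table (supplied by the caller)
    condition:    a Column expressing the match condition
    """
    (
        source_df.mergeInto(target_table, condition)
        .whenMatched().updateAll()      # overwrite target rows that match
        .whenNotMatched().insertAll()   # insert source rows with no match
        .merge()                        # run the merge
    )
```

The helper takes the DataFrame and condition as arguments so that it carries no dependency on a particular session or table layout.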

Support ALL CATALOGS in SHOW SCHEMAS

The SHOW SCHEMAS command now accepts the following syntax:

SHOW SCHEMAS [ { FROM | IN } { catalog_name | ALL CATALOGS } ] [ [ LIKE ] pattern ]

When ALL CATALOGS is specified in a SHOW query, the execution iterates through all active catalogs that support namespaces using the catalog manager (DSv2). For each catalog, it includes the top-level namespaces.
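To make the accepted forms concrete, the following sketch builds SHOW SCHEMAS statements from the grammar above. The helper show_schemas_sql is hypothetical. It assumes that the catalog name is a simple identifier that needs no quoting and that the pattern is written as a single-quoted string literal.

```python
def show_schemas_sql(catalog=None, all_catalogs=False, pattern=None):
    """Build a SHOW SCHEMAS statement following the documented grammar."""
    if catalog is not None and all_catalogs:
        raise ValueError("pass either catalog or all_catalogs, not both")
    if pattern is not None and "'" in pattern:
        # Quote escaping is deliberately out of scope for this sketch.
        raise ValueError("patterns containing single quotes are not handled here")

    parts = ["SHOW SCHEMAS"]
    if all_catalogs:
        parts.append("IN ALL CATALOGS")
    elif catalog is not None:
        parts.append(f"IN {catalog}")
    if pattern is not None:
        parts.append(f"LIKE '{pattern}'")
    return " ".join(parts)


print(show_schemas_sql())
print(show_schemas_sql(catalog="main"))
print(show_schemas_sql(all_catalogs=True, pattern="test*"))
```

On a cluster, the resulting string would typically be passed to spark.sql(...).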

The output attributes and schema of the command have been modified to add a catalog column indicating the catalog of the corresponding namespace. The new column is added to the end of the output attributes, as shown below:

Previous output

| Namespace        |
|------------------|
| test-namespace-1 |
| test-namespace-2 |

New output

| Namespace        | Catalog        |
|------------------|----------------|
| test-namespace-1 | test-catalog-1 |
| test-namespace-2 | test-catalog-2 |

Liquid clustering now compacts deletion vectors more efficiently

Delta tables with Liquid clustering now apply physical changes from deletion vectors more efficiently when OPTIMIZE is running. For more details, see Apply changes to Parquet data files.

Allow non-deterministic expressions in UPDATE/INSERT column values for MERGE operations

Databricks now allows the use of non-deterministic expressions in updated and inserted column values of MERGE operations. However, non-deterministic expressions in the conditions of MERGE statements are not supported.

For example, you can now generate dynamic or random values for columns:

MERGE INTO target USING source
ON target.key = source.key
WHEN MATCHED THEN UPDATE SET target.value = source.value + rand()

This can be helpful for data privacy to obfuscate actual data while preserving the data properties (such as mean values or other computed columns).

Ignore and rescue empty structs for AutoLoader ingestion (especially Avro)

Auto Loader now rescues Avro data types with an empty schema, because Delta tables do not support ingestion of empty struct-type data.

Change Delta MERGE Python and Scala APIs to return DataFrame instead of Unit

The Scala and Python MERGE APIs (such as DeltaMergeBuilder) now also return a DataFrame like the SQL API does, with the same results.
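A minimal sketch of capturing that return value from the Python API follows. The helper merge_and_report is hypothetical, and the aliases t and s and the key column are placeholders. On runtimes without this change, execute() returns None instead of a DataFrame.

```python
def merge_and_report(delta_table, source_df, condition="t.key = s.key"):
    """Run an upsert and return whatever execute() returns.

    delta_table: a delta.tables.DeltaTable for the target
    source_df:   a DataFrame holding the incoming rows
    condition:   the match condition, written against the aliases t and s
    """
    result = (
        delta_table.alias("t")
        .merge(source_df.alias("s"), condition)
        .whenMatchedUpdateAll()        # update every column of matching rows
        .whenNotMatchedInsertAll()     # insert rows that have no match
        .execute()
    )
    return result  # a DataFrame with the merge results on this runtime
```

A caller could then inspect the outcome directly, for example with merge_and_report(DeltaTable.forName(spark, "target"), source_df).show().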

Behavioral changes

Behavioral change for the Auto Loader incremental directory listing option

The deprecated Auto Loader option cloudFiles.useIncrementalListing now defaults to false. As a result, Auto Loader performs a full directory listing each time it runs. Previously, the default value was auto, which instructed Auto Loader to make a best-effort attempt to detect whether incremental listing could be used with a directory.

Databricks recommends against using this option. Instead, use file notification mode with file events. If you want to continue to use the incremental listing feature, set cloudFiles.useIncrementalListing to auto in your code. When you set this value to auto, Auto Loader makes a best-effort attempt to do a full listing once every seven incremental listings, which matches the behavior of this option before this change.

To learn more about Auto Loader directory listing, see Auto Loader streams with directory listing mode.
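For streams that must keep the earlier behavior, the option can be set explicitly. The sketch below only assembles the option dictionary. The helper auto_loader_options is hypothetical, and json is an arbitrary example format.

```python
def auto_loader_options(file_format, keep_incremental_listing=False):
    """Return Auto Loader reader options as a plain dictionary."""
    options = {"cloudFiles.format": file_format}
    if keep_incremental_listing:
        # Opt back in to the pre-17.0 default described above.
        options["cloudFiles.useIncrementalListing"] = "auto"
    return options


opts = auto_loader_options("json", keep_incremental_listing=True)
print(opts)

# On a cluster, these would typically be passed to the stream reader:
#   spark.readStream.format("cloudFiles").options(**opts).load(<source path>)
```

Leaving keep_incremental_listing at False omits the option, so the stream follows the new default of a full directory listing on each run.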

Removed the "True cache misses" section in Spark UI

This change removes support for the "Cache true misses size" metric (for both compressed and uncompressed caches). The "Cache writes misses" metric measures the same information.

Use numLocalScanTasks as a proxy for this metric when you want to see how the cache performs when files are assigned to the right executor.

Removed the "Cache Metadata Manager Peak Disk Usage" metric in the Spark UI

This change removes support for the cacheLocalityMgrDiskUsageInBytes and cacheLocalityMgrTimeMs metrics from the Databricks Runtime and the Spark UI.

Removed the "Rescheduled cache miss bytes" section in the Spark UI

This change removes the cache rescheduled misses size and cache rescheduled misses size (uncompressed) metrics from Databricks Runtime. These metrics measured how the cache performs when files are assigned to non-preferred executors. Use numNonLocalScanTasks as a proxy for this metric.

CREATE VIEW column-level clauses now throw errors when the clause would only apply to materialized views

CREATE VIEW commands that specify a column-level clause that is only valid for MATERIALIZED VIEWs now throw an error. The affected clauses for CREATE VIEW commands are:

  • NOT NULL
  • A specified datatype, such as FLOAT or STRING
  • DEFAULT
  • COLUMN MASK

Library upgrades

Apache Spark

Databricks Runtime 17.0 (Beta) includes Apache Spark 4.0.0. Many of its features were already available in Databricks Runtime 14.x, 15.x, and 16.x, and now ship out of the box with Databricks Runtime 17.0.

Core and Spark SQL highlights

Spark Core

Spark SQL

Features

Functions

Query optimization

  • [SPARK-46946] Supporting broadcast of multiple filtering keys in DynamicPruning
  • [SPARK-48445] Don’t inline UDFs with expansive children
  • [SPARK-41413] Avoid shuffle in Storage-Partitioned Join when partition keys mismatch, but expressions are compatible
  • [SPARK-46941] Prevent insertion of window group limit node with SizeBasedWindowFunction
  • [SPARK-46707] Add throwable field to expressions to improve predicate pushdown
  • [SPARK-47511] Canonicalize WITH expressions by reassigning IDs
  • [SPARK-46502] Support timestamp types in UnwrapCastInBinaryComparison
  • [SPARK-46069] Support unwrap timestamp type to date type
  • [SPARK-46219] Unwrap cast in join predicates
  • [SPARK-45606] Release restrictions on multi-layer runtime filter
  • [SPARK-45909] Remove NumericType cast if it can safely up-cast in IsNotNull

Query execution

  • [SPARK-45592][SPARK-45282] Correctness issue in AQE with InMemoryTableScanExec
  • [SPARK-50258] Fix output column order changed issue after AQE
  • [SPARK-46693] Inject LocalLimitExec when matching OffsetAndLimit or LimitAndOffset
  • [SPARK-48873] Use UnsafeRow in JSON parser
  • [SPARK-41471] Reduce Spark shuffle when only one side of a join is KeyGroupedPartitioning
  • [SPARK-45452] Improve InMemoryFileIndex to use FileSystem.listFiles API
  • [SPARK-48649] Add ignoreInvalidPartitionPaths configs for skipping invalid partition paths
  • [SPARK-45882] BroadcastHashJoinExec propagate partitioning should respect CoalescedHashPartitioning

Spark Connectors

DS v2 framework support changes

Hive Catalog support changes

XML support changes

CSV support changes

  • [SPARK-46862] Disable CSV column pruning in multi-line mode
  • [SPARK-46890] Fix CSV parsing bug with default values and column pruning
  • [SPARK-50616] Add File Extension Option to CSV DataSource Writer
  • [SPARK-49125] Allow duplicated column names in CSV writing
  • [SPARK-49016] Restore behavior for queries from raw CSV files
  • [SPARK-48807] Binary support for CSV datasource
  • [SPARK-48602] Make csv generator support different output style via spark.sql.binaryOutputStyle

ORC support changes

Avro support changes

JDBC changes

Other notable changes

  • [SPARK-45905] Least common type between decimal types should retain integral digits first
  • [SPARK-45786] Fix inaccurate Decimal multiplication and division results
  • [SPARK-50705] Make QueryPlan lock‑free
  • [SPARK-46743] Fix corner-case with COUNT + constant folding subquery
  • [SPARK-47509] Block subquery expressions in lambda/higher-order functions for correctness
  • [SPARK-48498] Always do char padding in predicates
  • [SPARK-45915] Treat decimal(x, 0) the same as IntegralType in PromoteStrings
  • [SPARK-46220] Restrict charsets in decode()
  • [SPARK-45816] Return NULL when overflowing during casting from timestamp to integers
  • [SPARK-45586] Reduce compiler latency for plans with large expression trees
  • [SPARK-45507] Correctness fix for nested correlated scalar subqueries with COUNT aggregates
  • [SPARK-44550] Enable correctness fixes for null IN (empty list) under ANSI
  • [SPARK-47911] Introduces a universal BinaryFormatter to make binary output consistent

PySpark

Below are the changes and improvements made to the PySpark libraries shipping in Databricks Runtime 17.0 (Beta).

Highlights

DataFrame APIs features

  • [SPARK-51079] Support large variable types in pandas UDF, createDataFrame and toPandas with Arrow
  • [SPARK-50718] Support addArtifact(s) for PySpark
  • [SPARK-50778] Add metadataColumn to PySpark DataFrame
  • [SPARK-50719] Support interruptOperation for PySpark
  • [SPARK-50790] Implement parse_json in PySpark
  • [SPARK-49306] Create SQL function aliases for zeroifnull and nullifzero
  • [SPARK-50132] Add DataFrame API for Lateral Joins
  • [SPARK-43295] Support string type columns for DataFrameGroupBy.sum
  • [SPARK-45575] Support time travel options for df.read API
  • [SPARK-45755] Improve Dataset.isEmpty() by applying global limit 1
    • Improves performance of isEmpty() by pushing down a global limit of 1.
  • [SPARK-48761] Introduce clusterBy DataFrameWriter API for Scala
  • [SPARK-45929] Support groupingSets operation in DataFrame API
    • Extends groupingSets(...) to DataFrame/DS-level APIs.
  • [SPARK-40178] Support coalesce hints with ease for PySpark and R

Pandas API on Spark features

Other notable PySpark changes

Spark Streaming

Below are the changes and improvements made to Spark Streaming in Databricks Runtime 17.0 (Beta).

Highlights

Other notable streaming changes

  • [SPARK-44865] Make StreamingRelationV2 support metadata column
  • [SPARK-45080] Explicitly call out support for columnar in DSv2 streaming data sources
  • [SPARK-45178] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources
  • [SPARK-45415] Allow selective disabling of "fallocate" in RocksDB statestore
  • [SPARK-45503] Add Conf to Set RocksDB Compression
  • [SPARK-45511] State Data Source - Reader
  • [SPARK-45558] Introduce a metadata file for streaming stateful operator
  • [SPARK-45794] Introduce state metadata source to query the streaming state metadata information
  • [SPARK-45815] Provide an interface for other Streaming sources to add _metadata columns
  • [SPARK-45845] Add number of evicted state rows to streaming UI
  • [SPARK-46641] Add maxBytesPerTrigger threshold
  • [SPARK-46816] Add base support for new arbitrary state management operator (multiple state variables/column families)
  • [SPARK-46865] Add Batch Support for TransformWithState Operator
  • [SPARK-46906] Add a check for stateful operator change for streaming
  • [SPARK-46961] Use ProcessorContext to store and retrieve handle
  • [SPARK-46962] Add interface for Python streaming data source & worker
  • [SPARK-47107] Partition reader for Python streaming data sources
  • [SPARK-47273] Python data stream writer interface
  • [SPARK-47553] Add Java support for transformWithState operator APIs
  • [SPARK-47653] Add support for negative numeric types and range scan key encoder
  • [SPARK-47733] Add custom metrics for transformWithState operator part of query progress
  • [SPARK-47960] Allow chaining other stateful operators after transformWithState
  • [SPARK-48447] Check StateStoreProvider class before constructor
  • [SPARK-48569] Handle edge cases in query.name for streaming queries
  • [SPARK-48589] Add snapshotStartBatchId / snapshotPartitionId options to state data source
  • [SPARK-48726] Create StateSchemaV3 file for TransformWithStateExec
  • [SPARK-48742] Virtual Column Family for RocksDB (arbitrary stateful API v2)
  • [SPARK-48755] transformWithState pyspark base implementation and ValueState support
  • [SPARK-48772] State Data Source Change Feed Reader Mode
  • [SPARK-48836] Integrate SQL schema with state schema/metadata for TWS operator
  • [SPARK-48849] Create OperatorStateMetadataV2 for TransformWithStateExec operator
  • [SPARK-48901][SPARK-48916] Introduce clusterBy DataStreamWriter API in Scala/PySpark
  • [SPARK-48931] Reduce Cloud Store List API cost for state-store maintenance
  • [SPARK-49021] Add support for reading transformWithState value state variables with state data source reader
  • [SPARK-49048] Add support for reading operator metadata at given batch id
  • [SPARK-49191] Read transformWithState map state with state data source
  • [SPARK-49259] Size-based partition creation during Kafka read
  • [SPARK-49411] Communicate State Store Checkpoint ID
  • [SPARK-49463] ListState support in TransformWithStateInPandas
  • [SPARK-49467] Add state data source reader for list state
  • [SPARK-49513] Add timer support in transformWithStateInPandas
  • [SPARK-49630] Add flatten option for collection types in state data source reader
  • [SPARK-49656] Support state variables with value state collection types
  • [SPARK-49676] Chaining of operators in transformWithStateInPandas
  • [SPARK-49699] Disable PruneFilters for streaming workloads
  • [SPARK-49744] TTL support for ListState in TransformWithStateInPandas
  • [SPARK-49745] Read registered timers in transformWithState
  • [SPARK-49802] Add support for read change feed for map/list types
  • [SPARK-49846] Add numUpdatedStateRows/numRemovedStateRows metrics
  • [SPARK-49883] State Store Checkpoint Structure V2 Integration with RocksDB and RocksDBFileManager
  • [SPARK-50017] Support Avro encoding for TransformWithState operator
  • [SPARK-50035] Explicit handleExpiredTimer function in the stateful processor
  • [SPARK-50128] Add handle APIs using implicit encoders
  • [SPARK-50152] Support handleInitialState with state data source reader
  • [SPARK-50194] Integration of New Timer API and Initial State API
  • [SPARK-50378] Add custom metric for time spent populating initial state
  • [SPARK-50428] Support TransformWithStateInPandas in batch queries
  • [SPARK-50573] Adding State Schema ID to State Rows for schema evolution
  • [SPARK-50714] Enable schema evolution for TransformWithState with Avro encoding

Spark ML

Spark UX

Other notable Spark UX changes

Spark Connect

Below are the changes and improvements made to Spark Connect in Databricks Runtime 17.0 (Beta).

Highlights

  • [SPARK-49248] Scala Client Parity with existing Dataset/DataFrame API
  • [SPARK-48918] Create a unified SQL Scala interface shared by regular SQL and Connect
  • [SPARK-50812] Support pyspark.ml on Connect
  • [SPARK-47908] Parent classes for Spark Connect and Spark Classic

Other Spark Connect changes and improvements

  • [SPARK-41065] Implement DataFrame.freqItems and DataFrame.stat.freqItems
  • [SPARK-41066] Implement DataFrame.sampleBy and DataFrame.stat.sampleBy
  • [SPARK-41067] Implement DataFrame.stat.cov
  • [SPARK-41068] Implement DataFrame.stat.corr
  • [SPARK-41069] Implement DataFrame.approxQuantile and DataFrame.stat.approxQuantile
  • [SPARK-41292][SPARK-41640][SPARK-41641] Implement Window functions
  • [SPARK-41333][SPARK-41737] Implement GroupedData.{min, max, avg, sum}
  • [SPARK-41364] Implement broadcast function
  • [SPARK-41383][SPARK-41692][SPARK-41693] Implement rollup, cube, and pivot
  • [SPARK-41434] Initial LambdaFunction implementation
  • [SPARK-41440] Implement DataFrame.randomSplit
  • [SPARK-41464] Implement DataFrame.to
  • [SPARK-41473] Implement format_number function
  • [SPARK-41503] Implement Partition Transformation Functions
  • [SPARK-41529] Implement SparkSession.stop
  • [SPARK-41534] Setup initial client module for Spark Connect
  • [SPARK-41629] Support for Protocol Extensions in Relation and Expression
  • [SPARK-41663] Implement the rest of Lambda functions
  • [SPARK-41673] Implement Column.astype
  • [SPARK-41690] Agnostic Encoders
  • [SPARK-41707] Implement Catalog API in Spark Connect
  • [SPARK-41710] Implement Column.between
  • [SPARK-41722] Implement 3 missing time window functions
  • [SPARK-41723] Implement sequence function
  • [SPARK-41724] Implement call_udf function
  • [SPARK-41728] Implement unwrap_udt function
  • [SPARK-41731] Implement the column accessor (getItem, getField, getitem, etc.)
  • [SPARK-41738] Mix ClientId in SparkSession cache
  • [SPARK-41740] Implement Column.name
  • [SPARK-41767] Implement Column.{withField, dropFields}
  • [SPARK-41785] Implement GroupedData.mean
  • [SPARK-41803] Add missing function log(arg1, arg2)
  • [SPARK-41810] Infer names from a list of dictionaries in SparkSession.createDataFrame
  • [SPARK-41811] Implement SQLStringFormatter with WithRelations
  • [SPARK-42664] Support bloomFilter function for DataFrameStatFunctions
  • [SPARK-43662] Support merge_asof in Spark Connect
  • [SPARK-43704] Support MultiIndex for to_series() in Spark Connect
  • [SPARK-44625] SparkConnectExecutionManager to track all executions
  • [SPARK-44731] Make TimestampNTZ work with literals in Python Spark Connect
  • [SPARK-44736] Add Dataset.explode to Spark Connect Scala Client
  • [SPARK-44740] Support specifying session_id in SPARK_REMOTE connection string
  • [SPARK-44747] Add missing SparkSession.Builder methods
  • [SPARK-44750] Apply configuration to SparkSession during creation
  • [SPARK-44761] Support DataStreamWriter.foreachBatch(VoidFunction2)
  • [SPARK-44788] Add from_xml and schema_of_xml to pyspark, Spark Connect, and SQL functions
  • [SPARK-44807] Add Dataset.metadataColumn to Scala Client
  • [SPARK-44877] Support python protobuf functions for Spark Connect
  • [SPARK-45000] Implement DataFrame.foreach
  • [SPARK-45001] Implement DataFrame.foreachPartition
  • [SPARK-45088] Make getitem work with duplicated columns
  • [SPARK-45090] DataFrame.{cube, rollup} support column ordinals
  • [SPARK-45091] Function floor/round/bround now accept Column type scale
  • [SPARK-45121] Support Series.empty for Spark Connect
  • [SPARK-45136] Enhance ClosureCleaner with Ammonite support
  • [SPARK-45137] Support map/array parameters in parameterized sql()
  • [SPARK-45143] Make PySpark compatible with PyArrow 13.0.0
  • [SPARK-45190][SPARK-48897] Make from_xml support StructType schema
  • [SPARK-45235] Support map and array parameters by sql()
  • [SPARK-45485] User agent improvements: Use SPARK_CONNECT_USER_AGENT env variable and include environment specific attributes
  • [SPARK-45506] Add ivy URI support to SparkConnect addArtifact
  • [SPARK-45509] Fix df column reference behavior for Spark Connect
  • [SPARK-45619] Apply the observed metrics to Observation object
  • [SPARK-45680] Release session
  • [SPARK-45733] Support multiple retry policies
  • [SPARK-45770] Introduce plan DataFrameDropColumns for Dataframe.drop
  • [SPARK-45851] Support multiple policies in scala client
  • [SPARK-46039] Upgrade grpcio* to 1.59.3 for Python 3.12
  • [SPARK-46048] Support DataFrame.groupingSets in Python Spark Connect
  • [SPARK-46085] Dataset.groupingSets in Scala Spark Connect client
  • [SPARK-46202] Expose new ArtifactManager APIs to support custom target directories
  • [SPARK-46229] Add applyInArrow to groupBy and cogroup in Spark Connect
  • [SPARK-46255] Support complex type -> string conversion
  • [SPARK-46620] Introduce a basic fallback mechanism for frame methods
  • [SPARK-46812] Make mapInPandas/mapInArrow support ResourceProfile
  • [SPARK-46919] Upgrade grpcio* and grpc-java to 1.62.x
  • [SPARK-47014] Implement methods dumpPerfProfile and dumpMemoryProfiles of SparkSession
  • [SPARK-47069] Introduce spark.profile.show/.dump for SparkSession-based profiling
  • [SPARK-47081] Support Query Execution Progress
  • [SPARK-47137] Add getAll to spark.conf for feature parity with Scala
  • [SPARK-47233] Client & Server logic for client-side streaming query listener
  • [SPARK-47276] Introduce spark.profile.clear for SparkSession-based profiling
  • [SPARK-47367] Support Python data sources with Spark Connect
  • [SPARK-47543] Infer dict as MapType from Pandas DataFrame (via new config)
  • [SPARK-47545] Dataset.observe for Scala Connect
  • [SPARK-47694] Make max message size configurable on the client side
  • [SPARK-47712] Allow connect plugins to create and process Datasets
  • [SPARK-47812] Support Serialization of SparkSession for ForEachBatch worker
  • [SPARK-47818] Introduce plan cache in SparkConnectPlanner to improve performance of Analyze requests
  • [SPARK-47828] Fix DataFrameWriterV2.overwrite failure due to invalid plan
  • [SPARK-47845] Support Column type in split function for Scala and Python
  • [SPARK-47909] Parent DataFrame class for Spark Connect and Spark Classic
  • [SPARK-48008] Support UDAFs in Spark Connect
  • [SPARK-48048] Added client side listener support for Scala
  • [SPARK-48058][SPARK-43727] UserDefinedFunction.returnType parse the DDL string
  • [SPARK-48112] Expose session in SparkConnectPlanner to plugins
  • [SPARK-48113] Allow Plugins to integrate with Spark Connect
  • [SPARK-48258] Checkpoint and localCheckpoint in Spark Connect
  • [SPARK-48278] Refine the string representation of Cast
  • [SPARK-48310] Cached properties must return copies
  • [SPARK-48336] Implement ps.sql in Spark Connect
  • [SPARK-48370] Checkpoint and localCheckpoint in Scala Spark Connect client
  • [SPARK-48510] Support UDAF toColumn API in Spark Connect
  • [SPARK-48555] Support using Columns as parameters for several functions (array_remove, array_position, etc.)
  • [SPARK-48569] Handle edge cases in query.name for streaming queries
  • [SPARK-48638] Add ExecutionInfo support for DataFrame
  • [SPARK-48639] Add Origin to RelationCommon
  • [SPARK-48648] Make SparkConnectClient.tags properly thread-local
  • [SPARK-48794] DataFrame.mergeInto support for Spark Connect (Scala & Python)
  • [SPARK-48831] Make default column name of cast compatible with Spark Classic
  • [SPARK-48960] Make spark-shell work with Spark Connect (--remote support)
  • [SPARK-49025] Make Column implementation agnostic
  • [SPARK-49027] Share Column API between Classic and Connect
  • [SPARK-49028] Create a shared SparkSession
  • [SPARK-49029] Create shared Dataset interface
  • [SPARK-49087] Distinguish UnresolvedFunction calling internal functions
  • [SPARK-49185] Reimplement kde plot with Spark SQL
  • [SPARK-49201] Reimplement hist plot with Spark SQL
  • [SPARK-49249][SPARK-49122] Add addArtifact API to the Spark SQL Core
  • [SPARK-49273] Origin support for Spark Connect Scala client
  • [SPARK-49282] Create a shared SparkSessionBuilder interface
  • [SPARK-49284] Create a shared Catalog interface
  • [SPARK-49413] Create a shared RuntimeConfig interface
  • [SPARK-49416] Add shared DataStreamReader interface
  • [SPARK-49417] Add shared StreamingQueryManager interface
  • [SPARK-49419] Create shared DataFrameStatFunctions
  • [SPARK-49429] Add shared DataStreamWriter interface
  • [SPARK-49526] Support Windows-style paths in ArtifactManager
  • [SPARK-49530] Support kde/density plots
  • [SPARK-49531] Support line plot with plotly backend
  • [SPARK-49595] Fix DataFrame.unpivot and DataFrame.melt in Spark Connect Scala Client
  • [SPARK-49626] Support horizontal/vertical bar plots
  • [SPARK-49907] Support spark.ml on Connect
  • [SPARK-49948] Add “precision” parameter to pandas on Spark box plot
  • [SPARK-50050] Make lit accept str/bool numpy ndarray
  • [SPARK-50054] Support histogram plots
  • [SPARK-50063] Add support for Variant in the Spark Connect Scala client
  • [SPARK-50075] DataFrame APIs for table-valued functions
  • [SPARK-50134][SPARK-50130] Support DataFrame API for SCALAR and EXISTS subqueries in Spark Connect
  • [SPARK-50134][SPARK-50132] Support DataFrame API for Lateral Join in Spark Connect
  • [SPARK-50227] Upgrade buf plugins to v28.3
  • [SPARK-50298] Implement verifySchema parameter of createDataFrame
  • [SPARK-50306] Support Python 3.13 in Spark Connect
  • [SPARK-50373] Prohibit Variant from set operations
  • [SPARK-50544] Implement StructType.toDDL
  • [SPARK-50710] Add support for optional client reconnection to sessions after release
  • [SPARK-50828] Deprecate pyspark.ml.connect
  • [SPARK-46465] Add Column.isNaN in PySpark
    • Adds the Column.isNaN function to PySpark Connect, matching Scala API parity.
  • [SPARK-41440] Implement DataFrame.randomSplit
    • Implements DataFrame.randomSplit for Spark Connect in Python.
  • [SPARK-41434] Initial LambdaFunction implementation
    • Adds basic support for LambdaFunction and an initial exists function in Spark Connect.
  • [SPARK-41464] Implement DataFrame.to
    • Implements DataFrame.to for Spark Connect in Python.
  • [SPARK-41364] Implement broadcast function
    • Implements the broadcast function in Spark Connect Python client.
  • [SPARK-41663] Implement the rest of Lambda functions
    • Completes Lambda function support in Spark Connect Python client (such as filter, map, etc.).
  • [SPARK-41673] Implement Column.astype
    • Adds Column.astype to Spark Connect Python for type casting.
  • [SPARK-41292][SPARK-41640][SPARK-41641] Implement Window functions
    • Adds support for window functions (Window.partitionBy, Window.orderBy, etc.) to Spark Connect.
  • [SPARK-41534] Setup initial client module for Spark Connect
    • Sets up the initial Scala/JVM client module for Spark Connect.
  • [SPARK-41503] Implement Partition Transformation Functions
    • Implements partition transformation functions for Spark Connect in Python.
  • [SPARK-41710] Implement Column.between
    • Adds Column.between method to Spark Connect in Python.
  • [SPARK-41707] Implement Catalog API in Spark Connect
    • Implements the catalog API for Spark Connect (such as listTables, listFunctions, etc.).
  • [SPARK-41690] Agnostic Encoders
    • Introduces “agnostic encoders” for mapping external types to Spark data types.
  • [SPARK-41722] Implement 3 missing time window functions
    • Implements window, window_time, and session_window in Spark Connect Python.
  • [SPARK-41723] Implement sequence function
    • Adds the sequence function for Spark Connect in Python.
  • [SPARK-41473] Implement format_number function
    • Implements format_number function in Spark Connect Python.
  • [SPARK-41724] Implement call_udf function
    • Allows users to call a UDF by name: call_udf("my_udf", col1, col2, ...).
  • [SPARK-41529] Implement SparkSession.stop
    • Implements SparkSession.stop to shut down a Spark Connect session server side.
  • [SPARK-41728] Implement unwrap_udt function
    • Adds the unwrap_udt function to Spark Connect in Python.
  • [SPARK-41731] Implement the column accessor (getItem, getField, getitem, etc.)
    • Allows indexing into arrays and structs in Spark Connect columns.
  • [SPARK-41740] Implement Column.name
    • Adds .name method for columns in Spark Connect Python.
  • [SPARK-41738] Mix ClientId in SparkSession cache
    • Fixes concurrency by mixing client ID into the SparkSession cache on the server.
  • [SPARK-41067] Implement DataFrame.stat.cov
    • Implements covariance calculation (df.stat.cov) for Spark Connect in Python.
  • [SPARK-41767] Implement Column.{withField, dropFields}
    • Adds support for adding/dropping struct fields in Spark Connect columns.
  • [SPARK-41292] Support Window in pyspark.sql.window namespace
    • Integrates Spark Connect’s window functionality into pyspark.sql.window.
  • [SPARK-41068] Implement DataFrame.stat.corr
    • Implements correlation calculation (df.stat.corr) for Spark Connect in Python.
  • [SPARK-41629] Support for Protocol Extensions in Relation and Expression
    • Adds plugin-based extension mechanism for custom Relation/Expression in Spark Connect.
  • [SPARK-41785] Implement GroupedData.mean
    • Adds the mean function to grouped data in Spark Connect.
  • [SPARK-41069] Implement DataFrame.approxQuantile and DataFrame.stat.approxQuantile
    • Adds approxQuantile for Spark Connect DataFrame/stat in Python.
  • [SPARK-41065] Implement DataFrame.freqItems and DataFrame.stat.freqItems
    • Adds freqItems to Spark Connect DataFrame in Python.
  • [SPARK-41066] Implement DataFrame.sampleBy and DataFrame.stat.sampleBy
    • Adds sampleBy to Spark Connect DataFrame in Python.
  • [SPARK-41810] Infer names from a list of dictionaries in SparkSession.createDataFrame
    • Improves column name inference when creating DataFrames from lists of dictionaries in Spark Connect.
  • [SPARK-41803] Add missing function log(arg1, arg2)
    • Implements two-argument log(base, expr) in Spark Connect Python.
  • [SPARK-41383][SPARK-41692][SPARK-41693] Implement rollup, cube, and pivot
    • Adds DataFrame.rollup, DataFrame.cube, and pivot to Spark Connect.
  • [SPARK-41333][SPARK-41737] Implement GroupedData.{min, max, avg, sum}
    • Implements the standard aggregate functions on grouped data for Spark Connect.
  • [SPARK-45680] Release session
    • Introduces ReleaseSession RPC to cancel all running jobs and remove the session server side.
  • [SPARK-45851] Support multiple policies in scala client
    • Adds multiple retry policies to the Scala Spark Connect client.
  • [SPARK-45990][SPARK-45987] Upgrade protobuf to 4.25.1 for Python 3.11 support
    • Updates protobuf library to fix issues under Python 3.11.
  • [SPARK-46202] Expose new ArtifactManager APIs to support custom target directories
    • Allows adding artifacts with a custom directory structure to remote Spark Connect sessions.
  • [SPARK-46284] Add session_user function to Python
    • Exposes the session_user function in PySpark for Connect, matching Scala parity.
  • [SPARK-46039] Upgrade grpcio* to 1.59.3 for Python 3.12
    • Updates gRPC libraries to support Python 3.12 and new grpc-inprocess.
  • [SPARK-46048] Support DataFrame.groupingSets in Python Spark Connect
    • Allows calling df.groupingSets(...) in Python Spark Connect for multi-dimensional grouping.
  • [SPARK-46085] Dataset.groupingSets in Scala Spark Connect client
    • Adds groupingSets(...) to Spark Connect in Scala.
  • [SPARK-46229] Add applyInArrow to groupBy and cogroup in Spark Connect
    • Implements applyInArrow in Spark Connect for grouped/cogrouped DataFrame operations.
  • [SPARK-46255] Support complex type -> string conversion
    • Allows string conversion of complex (list/struct) types in Spark Connect Python.
  • [SPARK-45770] Introduce plan DataFrameDropColumns for Dataframe.drop
  • [SPARK-45733] Support multiple retry policies
  • [SPARK-45485] User agent improvements: Use SPARK_CONNECT_USER_AGENT env variable and include environment specific attributes
  • [SPARK-44753] XML: pyspark SQL XML reader/writer
  • [SPARK-45619] Apply the observed metrics to Observation object
  • [SPARK-45088] Make getitem work with duplicated columns
  • [SPARK-45091] Function floor/round/bround now accept Column type scale
  • [SPARK-45143] Make PySpark compatible with PyArrow 13.0.0
  • [SPARK-44788] Add from_xml and schema_of_xml to pyspark, Spark Connect, and SQL functions
  • [SPARK-45137] Support map/array parameters in parameterized sql()
  • [SPARK-45235] Support map and array parameters by sql()
  • [SPARK-43662] Support merge_asof in Spark Connect
  • [SPARK-45121] Support Series.empty for Spark Connect
  • [SPARK-45090] DataFrame.{cube, rollup} support column ordinals
  • [SPARK-45136] Enhance ClosureCleaner with Ammonite support
  • [SPARK-45506] Add ivy URI support to SparkConnect addArtifact
  • [SPARK-43704] Support MultiIndex for to_series() in Spark Connect
  • [SPARK-44807] Add Dataset.metadataColumn to Scala Client
  • [SPARK-44877] Support python protobuf functions for Spark Connect
  • [SPARK-44750] Apply configuration to SparkSession during creation
  • [SPARK-45000] Implement DataFrame.foreach
  • [SPARK-45001] Implement DataFrame.foreachPartition
  • [SPARK-44740] Support specifying session_id in SPARK_REMOTE connection string
  • [SPARK-44747] Add missing SparkSession.Builder methods
  • [SPARK-44731] Make TimestampNTZ work with literals in Python Spark Connect
  • [SPARK-44761] Support DataStreamWriter.foreachBatch(VoidFunction2)
  • [SPARK-44625] SparkConnectExecutionManager to track all executions
  • [SPARK-44736] Add Dataset.explode to Spark Connect Scala Client
  • [SPARK-42664] Support bloomFilter function for DataFrameStatFunctions
  • [SPARK-48831] Align default cast column name with Spark Classic (Connect)
  • [SPARK-48272] Add timestamp_diff function (Spark Connect; duplicate of an entry above)
  • [SPARK-48369] Add timestamp_add function (Spark Connect; duplicate of an entry above)
  • [SPARK-48336] Implement ps.sql in Spark Connect (duplicate)
  • [SPARK-48370] Checkpoint in Scala Spark Connect client (duplicate of an entry above)
  • [SPARK-47545] Dataset.observe for the Scala Spark Connect client (duplicate)
  • [SPARK-45509] Fix df column reference behavior for Spark Connect
    • Aligns column resolution in Spark Connect with classic Spark and provides better error messages.
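
The rollup, cube, and groupingSets entries above differ only in which combinations of grouping columns are aggregated. The following is a minimal sketch in plain Python, not the Spark Connect API: the helper names rollup_sets and cube_sets and the column names are hypothetical, chosen only to enumerate the grouping sets that each operation implies.

```python
from itertools import combinations


def rollup_sets(cols):
    """Grouping sets implied by a rollup: each prefix of cols, longest first."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]


def cube_sets(cols):
    """Grouping sets implied by a cube: every subset of cols, largest first."""
    return [
        subset
        for size in range(len(cols), -1, -1)
        for subset in combinations(cols, size)
    ]


# Hypothetical column names, used only for illustration.
print(rollup_sets(["country", "city"]))
print(cube_sets(["country", "city"]))
```

A rollup over n columns yields n + 1 grouping sets and a cube yields 2^n. groupingSets is the general form, where the caller lists the sets explicitly instead of deriving them.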

System environment

  • Operating System: Ubuntu 24.04.2 LTS
  • Java: Zulu17.54+21-CA
  • Scala: 2.13.16
  • Python: 3.12.3
  • R: 4.4.2
  • Delta Lake: 3.3.1

Installed Python libraries

| Library | Version | Library | Version | Library | Version |
| --- | --- | --- | --- | --- | --- |
| annotated-types | 0.7.0 | anyio | 4.6.2 | argon2-cffi | 21.3.0 |
| argon2-cffi-bindings | 21.2.0 | arrow | 1.3.0 | asttokens | 2.0.5 |
| astunparse | 1.6.3 | async-lru | 2.0.4 | attrs | 24.3.0 |
| autocommand | 2.2.2 | azure-common | 1.1.28 | azure-core | 1.34.0 |
| azure-identity | 1.20.0 | azure-mgmt-core | 1.5.0 | azure-mgmt-web | 8.0.0 |
| azure-storage-blob | 12.23.0 | azure-storage-file-datalake | 12.17.0 | babel | 2.16.0 |
| backports.tarfile | 1.2.0 | beautifulsoup4 | 4.12.3 | black | 24.10.0 |
| bleach | 6.2.0 | blinker | 1.7.0 | boto3 | 1.36.2 |
| botocore | 1.36.3 | cachetools | 5.5.1 | certifi | 2025.1.31 |
| cffi | 1.17.1 | chardet | 4.0.0 | charset-normalizer | 3.3.2 |
| click | 8.1.7 | cloudpickle | 3.0.0 | comm | 0.2.1 |
| contourpy | 1.3.1 | cryptography | 43.0.3 | cycler | 0.11.0 |
| Cython | 3.0.12 | databricks-sdk | 0.49.0 | dbus-python | 1.3.2 |
| debugpy | 1.8.11 | decorator | 5.1.1 | defusedxml | 0.7.1 |
| Deprecated | 1.2.13 | distlib | 0.3.9 | docstring-to-markdown | 0.11 |
| executing | 0.8.3 | facets-overview | 1.1.1 | fastapi | 0.115.12 |
| fastjsonschema | 2.21.1 | filelock | 3.18.0 | fonttools | 4.55.3 |
| fqdn | 1.5.1 | fsspec | 2023.5.0 | gitdb | 4.0.11 |
| GitPython | 3.1.43 | google-api-core | 2.20.0 | google-auth | 2.40.0 |
| google-cloud-core | 2.4.3 | google-cloud-storage | 3.1.0 | google-crc32c | 1.7.1 |
| google-resumable-media | 2.7.2 | googleapis-common-protos | 1.65.0 | grpcio | 1.67.0 |
| grpcio-status | 1.67.0 | h11 | 0.14.0 | httpcore | 1.0.2 |
| httplib2 | 0.20.4 | httpx | 0.27.0 | idna | 3.7 |
| importlib-metadata | 6.6.0 | importlib_resources | 6.4.0 | inflect | 7.3.1 |
| iniconfig | 1.1.1 | ipyflow-core | 0.0.209 | ipykernel | 6.29.5 |
| ipython | 8.30.0 | ipython-genutils | 0.2.0 | ipywidgets | 7.8.1 |
| isodate | 0.6.1 | isoduration | 20.11.0 | jaraco.context | 5.3.0 |
| jaraco.functools | 4.0.1 | jaraco.text | 3.12.1 | jedi | 0.19.2 |
| Jinja2 | 3.1.5 | jmespath | 1.0.1 | joblib | 1.4.2 |
| json5 | 0.9.25 | jsonpointer | 3.0.0 | jsonschema | 4.23.0 |
| jsonschema-specifications | 2023.7.1 | jupyter-events | 0.10.0 | jupyter-lsp | 2.2.0 |
| jupyter_client | 8.6.3 | jupyter_core | 5.7.2 | jupyter_server | 2.14.1 |
| jupyter_server_terminals | 0.4.4 | jupyterlab | 4.3.4 | jupyterlab-pygments | 0.1.2 |
| jupyterlab-widgets | 1.0.0 | jupyterlab_server | 2.27.3 | kiwisolver | 1.4.8 |
| launchpadlib | 1.11.0 | lazr.restfulclient | 0.14.6 | lazr.uri | 1.0.6 |
| markdown-it-py | 2.2.0 | MarkupSafe | 3.0.2 | matplotlib | 3.10.0 |
| matplotlib-inline | 0.1.7 | mccabe | 0.7.0 | mdurl | 0.1.0 |
| mistune | 2.0.4 | mlflow-skinny | 2.22.0 | mmh3 | 5.1.0 |
| more-itertools | 10.3.0 | msal | 1.32.3 | msal-extensions | 1.3.1 |
| mypy-extensions | 1.0.0 | nbclient | 0.8.0 | nbconvert | 7.16.4 |
| nbformat | 5.10.4 | nest-asyncio | 1.6.0 | nodeenv | 1.9.1 |
| notebook | 7.3.2 | notebook_shim | 0.2.3 | numpy | 2.1.3 |
| oauthlib | 3.2.2 | opentelemetry-api | 1.32.1 | opentelemetry-sdk | 1.32.1 |
| opentelemetry-semantic-conventions | 0.53b1 | overrides | 7.4.0 | packaging | 24.1 |
| pandas | 2.2.3 | pandocfilters | 1.5.0 | parso | 0.8.4 |
| pathspec | 0.10.3 | patsy | 1.0.1 | pexpect | 4.8.0 |
| pillow | 11.1.0 | pip | 24.2 | platformdirs | 3.10.0 |
| plotly | 5.24.1 | pluggy | 1.5.0 | prometheus_client | 0.21.0 |
| prompt-toolkit | 3.0.43 | proto-plus | 1.26.1 | protobuf | 5.29.4 |
| psutil | 5.9.0 | psycopg2 | 2.9.3 | ptyprocess | 0.7.0 |
| pure-eval | 0.2.2 | pyarrow | 19.0.1 | pyasn1 | 0.4.8 |
| pyasn1-modules | 0.2.8 | pyccolo | 0.0.71 | pycparser | 2.21 |
| pydantic | 2.10.6 | pydantic_core | 2.27.2 | pyflakes | 3.2.0 |
| Pygments | 2.15.1 | PyGObject | 3.48.2 | pyiceberg | 0.9.0 |
| PyJWT | 2.10.1 | pyodbc | 5.2.0 | pyparsing | 3.2.0 |
| pyright | 1.1.394 | pytest | 8.3.5 | python-dateutil | 2.9.0.post0 |
| python-json-logger | 3.2.1 | python-lsp-jsonrpc | 1.1.2 | python-lsp-server | 1.12.0 |
| pytoolconfig | 1.2.6 | pytz | 2024.1 | PyYAML | 6.0.2 |
| pyzmq | 26.2.0 | referencing | 0.30.2 | requests | 2.32.3 |
| rfc3339-validator | 0.1.4 | rfc3986-validator | 0.1.1 | rich | 13.9.4 |
| rope | 1.12.0 | rpds-py | 0.22.3 | rsa | 4.9.1 |
| s3transfer | 0.11.3 | scikit-learn | 1.6.1 | scipy | 1.15.1 |
| seaborn | 0.13.2 | Send2Trash | 1.8.2 | setuptools | 74.0.0 |
| six | 1.16.0 | smmap | 5.0.0 | sniffio | 1.3.0 |
| sortedcontainers | 2.4.0 | soupsieve | 2.5 | sqlparse | 0.5.3 |
| ssh-import-id | 5.11 | stack-data | 0.2.0 | starlette | 0.46.2 |
| statsmodels | 0.14.4 | strictyaml | 1.7.3 | tenacity | 9.0.0 |
| terminado | 0.17.1 | threadpoolctl | 3.5.0 | tinycss2 | 1.4.0 |
| tokenize_rt | 6.1.0 | tomli | 2.0.1 | tornado | 6.4.2 |
| traitlets | 5.14.3 | typeguard | 4.3.0 | types-python-dateutil | 2.9.0.20241206 |
| typing_extensions | 4.12.2 | tzdata | 2024.1 | ujson | 5.10.0 |
| unattended-upgrades | 0.1 | uri-template | 1.3.0 | urllib3 | 2.3.0 |
| uvicorn | 0.34.2 | virtualenv | 20.29.3 | wadllib | 1.3.6 |
| wcwidth | 0.2.5 | webcolors | 24.11.1 | webencodings | 0.5.1 |
| websocket-client | 1.8.0 | whatthepatch | 1.0.2 | wheel | 0.45.1 |
| widgetsnbextension | 3.6.6 | wrapt | 1.17.0 | yapf | 0.40.2 |
| zipp | 3.21.0 | | | | |
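
Because package versions can change during the Beta, it can be useful to compare what an environment actually has installed against the versions listed above. The following is a minimal sketch using only the Python standard library. The `expected` mapping is an illustrative subset copied from the list above, and `check_versions` is a hypothetical helper name, not a Databricks utility.

```python
from importlib.metadata import PackageNotFoundError, version

# Illustrative subset of the versions listed above.
expected = {"numpy": "2.1.3", "pandas": "2.2.3", "pyarrow": "19.0.1"}


def check_versions(expected):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, want in expected.items():
        try:
            have = version(name)
        except PackageNotFoundError:
            have = None  # package is not installed in this environment
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


for name, (want, have) in check_versions(expected).items():
    print(f"{name}: expected {want}, found {have}")
```

The loop prints nothing when every listed package matches. Outside a Databricks Runtime 17.0 environment it will typically report differences or missing packages.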

Installed R libraries

R libraries are installed from the Posit Package Manager CRAN snapshot on 2025-03-20.

| Library | Version | Library | Version | Library | Version |
| --- | --- | --- | --- | --- | --- |
| arrow | 19.0.1 | askpass | 1.2.1 | assertthat | 0.2.1 |
| backports | 1.5.0 | base | 4.4.2 | base64enc | 0.1-3 |
| bigD | 0.3.0 | bit | 4.6.0 | bit64 | 4.6.0-1 |
| bitops | 1.0-9 | blob | 1.2.4 | boot | 1.3-30 |
| brew | 1.0-10 | brio | 1.1.5 | broom | 1.0.7 |
| bslib | 0.9.0 | cachem | 1.1.0 | callr | 3.7.6 |
| caret | 7.0-1 | cellranger | 1.1.0 | chron | 2.3-62 |
| class | 7.3-22 | cli | 3.6.4 | clipr | 0.8.0 |
| clock | 0.7.2 | cluster | 2.1.6 | codetools | 0.2-20 |
| colorspace | 2.1-1 | commonmark | 1.9.5 | compiler | 4.4.2 |
| config | 0.3.2 | conflicted | 1.2.0 | cpp11 | 0.5.2 |
| crayon | 1.5.3 | credentials | 2.0.2 | curl | 6.2.1 |
| data.table | 1.17.0 | datasets | 4.4.2 | DBI | 1.2.3 |
| dbplyr | 2.5.0 | desc | 1.4.3 | devtools | 2.4.5 |
| diagram | 1.6.5 | diffobj | 0.3.5 | digest | 0.6.37 |
| downlit | 0.4.4 | dplyr | 1.1.4 | dtplyr | 1.3.1 |
| e1071 | 1.7-16 | ellipsis | 0.3.2 | evaluate | 1.0.3 |
| fansi | 1.0.6 | farver | 2.1.2 | fastmap | 1.2.0 |
| fontawesome | 0.5.3 | forcats | 1.0.0 | foreach | 1.5.2 |
| foreign | 0.8-86 | forge | 0.2.0 | fs | 1.6.5 |
| future | 1.34.0 | future.apply | 1.11.3 | gargle | 1.5.2 |
| generics | 0.1.3 | gert | 2.1.4 | ggplot2 | 3.5.1 |
| gh | 1.4.1 | git2r | 0.35.0 | gitcreds | 0.1.2 |
| glmnet | 4.1-8 | globals | 0.16.3 | glue | 1.8.0 |
| googledrive | 2.1.1 | googlesheets4 | 1.1.1 | gower | 1.0.2 |
| graphics | 4.4.2 | grDevices | 4.4.2 | grid | 4.4.2 |
| gridExtra | 2.3 | gsubfn | 0.7 | gt | 0.11.1 |
| gtable | 0.3.6 | hardhat | 1.4.1 | haven | 2.5.4 |
| highr | 0.11 | hms | 1.1.3 | htmltools | 0.5.8.1 |
| htmlwidgets | 1.6.4 | httpuv | 1.6.15 | httr | 1.4.7 |
| httr2 | 1.1.1 | ids | 1.0.1 | ini | 0.3.1 |
| ipred | 0.9-15 | isoband | 0.2.7 | iterators | 1.0.14 |
| jquerylib | 0.1.4 | jsonlite | 1.9.1 | juicyjuice | 0.1.0 |
| KernSmooth | 2.23-22 | knitr | 1.50 | labeling | 0.4.3 |
| later | 1.4.1 | lattice | 0.22-5 | lava | 1.8.1 |
| lifecycle | 1.0.4 | listenv | 0.9.1 | lubridate | 1.9.4 |
| magrittr | 2.0.3 | markdown | 1.13 | MASS | 7.3-60.0.1 |
| Matrix | 1.6-5 | memoise | 2.0.1 | methods | 4.4.2 |
| mgcv | 1.9-1 | mime | 0.13 | miniUI | 0.1.1.1 |
| mlflow | 2.20.4 | ModelMetrics | 1.2.2.2 | modelr | 0.1.11 |
| munsell | 0.5.1 | nlme | 3.1-164 | nnet | 7.3-19 |
| numDeriv | 2016.8-1.1 | openssl | 2.3.2 | parallel | 4.4.2 |
| parallelly | 1.42.0 | pillar | 1.10.1 | pkgbuild | 1.4.6 |
| pkgconfig | 2.0.3 | pkgdown | 2.1.1 | pkgload | 1.4.0 |
| plogr | 0.2.0 | plyr | 1.8.9 | praise | 1.0.0 |
| prettyunits | 1.2.0 | pROC | 1.18.5 | processx | 3.8.6 |
| prodlim | 2024.06.25 | profvis | 0.4.0 | progress | 1.2.3 |
| progressr | 0.15.1 | promises | 1.3.2 | proto | 1.0.0 |
| proxy | 0.4-27 | ps | 1.9.0 | purrr | 1.0.4 |
| R6 | 2.6.1 | ragg | 1.3.3 | randomForest | 4.7-1.2 |
| rappdirs | 0.3.3 | rcmdcheck | 1.4.0 | RColorBrewer | 1.1-3 |
| Rcpp | 1.0.14 | RcppEigen | 0.3.4.0.2 | reactable | 0.4.4 |
| reactR | 0.6.1 | readr | 2.1.5 | readxl | 1.4.5 |
| recipes | 1.2.0 | rematch | 2.0.0 | rematch2 | 2.1.2 |
| remotes | 2.5.0 | reprex | 2.1.1 | reshape2 | 1.4.4 |
| rlang | 1.1.5 | rmarkdown | 2.29 | RODBC | 1.3-26 |
| roxygen2 | 7.3.2 | rpart | 4.1.23 | rprojroot | 2.0.4 |
| Rserve | 1.8-15 | RSQLite | 2.3.9 | rstudioapi | 0.17.1 |
| rversions | 2.1.2 | rvest | 1.0.4 | sass | 0.4.9 |
| scales | 1.3.0 | selectr | 0.4-2 | sessioninfo | 1.2.3 |
| shape | 1.4.6.1 | shiny | 1.10.0 | sourcetools | 0.1.7-1 |
| sparklyr | 1.9.0 | SparkR | 4.0.0 | sparsevctrs | 0.3.1 |
| spatial | 7.3-17 | splines | 4.4.2 | sqldf | 0.4-11 |
| SQUAREM | 2021.1 | stats | 4.4.2 | stats4 | 4.4.2 |
| stringi | 1.8.4 | stringr | 1.5.1 | survival | 3.5-8 |
| swagger | 5.17.14.1 | sys | 3.4.3 | systemfonts | 1.2.1 |
| tcltk | 4.4.2 | testthat | 3.2.3 | textshaping | 1.0.0 |
| tibble | 3.2.1 | tidyr | 1.3.1 | tidyselect | 1.2.1 |
| tidyverse | 2.0.0 | timechange | 0.3.0 | timeDate | 4041.110 |
| tinytex | 0.56 | tools | 4.4.2 | tzdb | 0.5.0 |
| urlchecker | 1.0.1 | usethis | 3.1.0 | utf8 | 1.2.4 |
| utils | 4.4.2 | uuid | 1.2-1 | V8 | 6.0.2 |
| vctrs | 0.6.5 | viridisLite | 0.4.2 | vroom | 1.6.5 |
| waldo | 0.6.1 | whisker | 0.4.1 | withr | 3.0.2 |
| xfun | 0.51 | xml2 | 1.3.8 | xopen | 1.0.1 |
| xtable | 1.8-4 | yaml | 2.3.10 | zeallot | 0.1.0 |
| zip | 2.3.2 | | | | |

Installed Java and Scala libraries (Scala 2.13 cluster version)

| Group ID | Artifact ID | Version |
| --- | --- | --- |
| antlr | antlr | 2.7.7 |
| com.amazonaws | amazon-kinesis-client | 1.12.0 |
| com.amazonaws | aws-java-sdk-autoscaling | 1.12.638 |
| com.amazonaws | aws-java-sdk-cloudformation | 1.12.638 |
| com.amazonaws | aws-java-sdk-cloudfront | 1.12.638 |
| com.amazonaws | aws-java-sdk-cloudhsm | 1.12.638 |
| com.amazonaws | aws-java-sdk-cloudsearch | 1.12.638 |
| com.amazonaws | aws-java-sdk-cloudtrail | 1.12.638 |
| com.amazonaws | aws-java-sdk-cloudwatch | 1.12.638 |
| com.amazonaws | aws-java-sdk-cloudwatchmetrics | 1.12.638 |
| com.amazonaws | aws-java-sdk-codedeploy | 1.12.638 |
| com.amazonaws | aws-java-sdk-cognitoidentity | 1.12.638 |
| com.amazonaws | aws-java-sdk-cognitosync | 1.12.638 |
| com.amazonaws | aws-java-sdk-config | 1.12.638 |
| com.amazonaws | aws-java-sdk-core | 1.12.638 |
| com.amazonaws | aws-java-sdk-datapipeline | 1.12.638 |
| com.amazonaws | aws-java-sdk-directconnect | 1.12.638 |
| com.amazonaws | aws-java-sdk-directory | 1.12.638 |
| com.amazonaws | aws-java-sdk-dynamodb | 1.12.638 |
| com.amazonaws | aws-java-sdk-ec2 | 1.12.638 |
| com.amazonaws | aws-java-sdk-ecs | 1.12.638 |
| com.amazonaws | aws-java-sdk-efs | 1.12.638 |
| com.amazonaws | aws-java-sdk-elasticache | 1.12.638 |
| com.amazonaws | aws-java-sdk-elasticbeanstalk | 1.12.638 |
| com.amazonaws | aws-java-sdk-elasticloadbalancing | 1.12.638 |
| com.amazonaws | aws-java-sdk-elastictranscoder | 1.12.638 |
| com.amazonaws | aws-java-sdk-emr | 1.12.638 |
| com.amazonaws | aws-java-sdk-glacier | 1.12.638 |
| com.amazonaws | aws-java-sdk-glue | 1.12.638 |
| com.amazonaws | aws-java-sdk-iam | 1.12.638 |
| com.amazonaws | aws-java-sdk-importexport | 1.12.638 |
| com.amazonaws | aws-java-sdk-kinesis | 1.12.638 |
| com.amazonaws | aws-java-sdk-kms | 1.12.638 |
| com.amazonaws | aws-java-sdk-lambda | 1.12.638 |
| com.amazonaws | aws-java-sdk-logs | 1.12.638 |
| com.amazonaws | aws-java-sdk-machinelearning | 1.12.638 |
| com.amazonaws | aws-java-sdk-opsworks | 1.12.638 |
| com.amazonaws | aws-java-sdk-rds | 1.12.638 |
| com.amazonaws | aws-java-sdk-redshift | 1.12.638 |
| com.amazonaws | aws-java-sdk-route53 | 1.12.638 |
| com.amazonaws | aws-java-sdk-s3 | 1.12.638 |
| com.amazonaws | aws-java-sdk-ses | 1.12.638 |
| com.amazonaws | aws-java-sdk-simpledb | 1.12.638 |
| com.amazonaws | aws-java-sdk-simpleworkflow | 1.12.638 |
| com.amazonaws | aws-java-sdk-sns | 1.12.638 |
| com.amazonaws | aws-java-sdk-sqs | 1.12.638 |
| com.amazonaws | aws-java-sdk-ssm | 1.12.638 |
| com.amazonaws | aws-java-sdk-storagegateway | 1.12.638 |
| com.amazonaws | aws-java-sdk-sts | 1.12.638 |
| com.amazonaws | aws-java-sdk-support | 1.12.638 |
| com.amazonaws | aws-java-sdk-swf-libraries | 1.11.22 |
| com.amazonaws | aws-java-sdk-workspaces | 1.12.638 |
| com.amazonaws | jmespath-java | 1.12.638 |
| com.clearspring.analytics | stream | 2.9.8 |
| com.databricks | Rserve | 1.8-3 |
| com.databricks | databricks-sdk-java | 0.27.0 |
| com.databricks | jets3t | 0.7.1-0 |
| com.databricks.scalapb | scalapb-runtime_2.12 | 0.4.15-10 |
| com.esotericsoftware | kryo-shaded | 4.0.3 |
| com.esotericsoftware | minlog | 1.3.0 |
| com.fasterxml | classmate | 1.5.1 |
| com.fasterxml.jackson.core | jackson-annotations | 2.18.2 |
| com.fasterxml.jackson.core | jackson-core | 2.18.2 |
| com.fasterxml.jackson.core | jackson-databind | 2.18.2 |
| com.fasterxml.jackson.dataformat | jackson-dataformat-cbor | 2.18.2 |
| com.fasterxml.jackson.dataformat | jackson-dataformat-yaml | 2.15.2 |
| com.fasterxml.jackson.datatype | jackson-datatype-joda | 2.18.2 |
| com.fasterxml.jackson.datatype | jackson-datatype-jsr310 | 2.18.2 |
| com.fasterxml.jackson.module | jackson-module-paranamer | 2.18.2 |
| com.fasterxml.jackson.module | jackson-module-scala_2.12 | 2.18.2 |
| com.github.ben-manes.caffeine | caffeine | 2.9.3 |
| com.github.blemale | scaffeine_2.12 | 5.2.1 |
| com.github.fommil | jniloader | 1.1 |
| com.github.fommil.netlib | native_ref-java | 1.1 |
| com.github.fommil.netlib | native_ref-java | 1.1-natives |
| com.github.fommil.netlib | native_system-java | 1.1 |
| com.github.fommil.netlib | native_system-java | 1.1-natives |
| com.github.fommil.netlib | netlib-native_ref-linux-x86_64 | 1.1-natives |
| com.github.fommil.netlib | netlib-native_system-linux-x86_64 | 1.1-natives |
| com.github.luben | zstd-jni | 1.5.6-10 |
| com.github.virtuald | curvesapi | 1.08 |
| com.github.wendykierp | JTransforms | 3.1 |
| com.google.api.grpc | proto-google-common-protos | 2.5.1 |
| com.google.code.findbugs | jsr305 | 3.0.0 |
| com.google.code.gson | gson | 2.11.0 |
| com.google.crypto.tink | tink | 1.16.0 |
| com.google.errorprone | error_prone_annotations | 2.36.0 |
| com.google.flatbuffers | flatbuffers-java | 24.3.25 |
| com.google.guava | failureaccess | 1.0.2 |
| com.google.guava | guava | 33.4.0-jre |
| com.google.guava | listenablefuture | 9999.0-empty-to-avoid-conflict-with-guava |
| com.google.j2objc | j2objc-annotations | 3.0.0 |
| com.google.protobuf | protobuf-java | 3.25.5 |
| com.google.protobuf | protobuf-java-util | 3.25.5 |
| com.helger | profiler | 1.1.1 |
| com.ibm.icu | icu4j | 75.1 |
| com.jcraft | jsch | 0.1.55 |
| com.lihaoyi | sourcecode_2.12 | 0.1.9 |
| com.microsoft.azure | azure-data-lake-store-sdk | 2.3.10 |
| com.microsoft.sqlserver | mssql-jdbc | 12.8.0.jre11 |
| com.microsoft.sqlserver | mssql-jdbc | 12.8.0.jre8 |
| com.ning | compress-lzf | 1.1.2 |
| com.sun.mail | javax.mail | 1.5.2 |
| com.sun.xml.bind | jaxb-core | 2.2.11 |
| com.sun.xml.bind | jaxb-impl | 2.2.11 |
| com.tdunning | json | 1.8 |
| com.thoughtworks.paranamer | paranamer | 2.8 |
| com.trueaccord.lenses | lenses_2.12 | 0.4.12 |
| com.twitter | chill-java | 0.10.0 |
| com.twitter | chill_2.12 | 0.10.0 |
| com.twitter | util-app_2.12 | 7.1.0 |
| com.twitter | util-core_2.12 | 7.1.0 |
| com.twitter | util-function_2.12 | 7.1.0 |
| com.twitter | util-jvm_2.12 | 7.1.0 |
| com.twitter | util-lint_2.12 | 7.1.0 |
| com.twitter | util-registry_2.12 | 7.1.0 |
| com.twitter | util-stats_2.12 | 7.1.0 |
| com.typesafe | config | 1.4.3 |
| com.typesafe.scala-logging | scala-logging_2.12 | 3.7.2 |
| com.uber | h3 | 3.7.3 |
| com.univocity | univocity-parsers | 2.9.1 |
| com.zaxxer | HikariCP | 4.0.3 |
| com.zaxxer | SparseBitSet | 1.3 |
| commons-cli | commons-cli | 1.9.0 |
| commons-codec | commons-codec | 1.17.2 |
| commons-collections | commons-collections | 3.2.2 |
| commons-dbcp | commons-dbcp | 1.4 |
| commons-fileupload | commons-fileupload | 1.5 |
| commons-httpclient | commons-httpclient | 3.1 |
| commons-io | commons-io | 2.18.0 |
| commons-lang | commons-lang | 2.6 |
| commons-logging | commons-logging | 1.1.3 |
| commons-pool | commons-pool | 1.5.4 |
| dev.ludovic.netlib | arpack | 3.0.3 |
| dev.ludovic.netlib | blas | 3.0.3 |
| dev.ludovic.netlib | lapack | 3.0.3 |
| info.ganglia.gmetric4j | gmetric4j | 1.0.10 |
| io.airlift | aircompressor | 2.0.2 |
| io.delta | delta-sharing-client_2.12 | 1.3.0 |
| io.dropwizard.metrics | metrics-annotation | 4.2.30 |
| io.dropwizard.metrics | metrics-core | 4.2.30 |
| io.dropwizard.metrics | metrics-graphite | 4.2.30 |
| io.dropwizard.metrics | metrics-healthchecks | 4.2.30 |
| io.dropwizard.metrics | metrics-jetty9 | 4.2.30 |
| io.dropwizard.metrics | metrics-jmx | 4.2.30 |
| io.dropwizard.metrics | metrics-json | 4.2.30 |
| io.dropwizard.metrics | metrics-jvm | 4.2.30 |
| io.dropwizard.metrics | metrics-servlets | 4.2.30 |
| io.netty | netty-all | 4.1.118.Final |
| io.netty | netty-buffer | 4.1.118.Final |
| io.netty | netty-codec | 4.1.118.Final |
| io.netty | netty-codec-http | 4.1.118.Final |
| io.netty | netty-codec-http2 | 4.1.118.Final |
| io.netty | netty-codec-socks | 4.1.118.Final |
| io.netty | netty-common | 4.1.118.Final |
| io.netty | netty-handler | 4.1.118.Final |
| io.netty | netty-handler-proxy | 4.1.118.Final |
| io.netty | netty-resolver | 4.1.118.Final |
| io.netty | netty-tcnative-boringssl-static | 2.0.70.Final-db-r0-linux-aarch_64 |
| io.netty | netty-tcnative-boringssl-static | 2.0.70.Final-db-r0-linux-x86_64 |
| io.netty | netty-tcnative-boringssl-static | 2.0.70.Final-db-r0-osx-aarch_64 |
| io.netty | netty-tcnative-boringssl-static | 2.0.70.Final-db-r0-osx-x86_64 |
| io.netty | netty-tcnative-boringssl-static | 2.0.70.Final-db-r0-windows-x86_64 |
| io.netty | netty-tcnative-classes | 2.0.70.Final |
| io.netty | netty-transport | 4.1.118.Final |
| io.netty | netty-transport-classes-epoll | 4.1.118.Final |
| io.netty | netty-transport-classes-kqueue | 4.1.118.Final |
| io.netty | netty-transport-native-epoll | 4.1.118.Final |
| io.netty | netty-transport-native-epoll | 4.1.118.Final-linux-aarch_64 |
| io.netty | netty-transport-native-epoll | 4.1.118.Final-linux-riscv64 |
| io.netty | netty-transport-native-epoll | 4.1.118.Final-linux-x86_64 |
| io.netty | netty-transport-native-kqueue | 4.1.118.Final-osx-aarch_64 |
| io.netty | netty-transport-native-kqueue | 4.1.118.Final-osx-x86_64 |
| io.netty | netty-transport-native-unix-common | 4.1.118.Final |
| io.prometheus | simpleclient | 0.16.1-databricks |
| io.prometheus | simpleclient_common | 0.16.1-databricks |
| io.prometheus | simpleclient_dropwizard | 0.16.1-databricks |
| io.prometheus | simpleclient_pushgateway | 0.16.1-databricks |
| io.prometheus | simpleclient_servlet | 0.16.1-databricks |
| io.prometheus | simpleclient_servlet_common | 0.16.1-databricks |
| io.prometheus | simpleclient_tracer_common | 0.16.1-databricks |
| io.prometheus | simpleclient_tracer_otel | 0.16.1-databricks |
| io.prometheus | simpleclient_tracer_otel_agent | 0.16.1-databricks |
| io.prometheus.jmx | collector | 0.18.0 |
| jakarta.annotation | jakarta.annotation-api | 1.3.5 |
| jakarta.servlet | jakarta.servlet-api | 4.0.3 |
| jakarta.validation | jakarta.validation-api | 2.0.2 |
| jakarta.ws.rs | jakarta.ws.rs-api | 2.1.6 |
| javax.activation | activation | 1.1.1 |
| javax.annotation | javax.annotation-api | 1.3.2 |
| javax.el | javax.el-api | 2.2.4 |
| javax.jdo | jdo-api | 3.0.1 |
| javax.transaction | jta | 1.1 |
| javax.transaction | transaction-api | 1.1 |
| javax.xml.bind | jaxb-api | 2.2.11 |
| javolution | javolution | 5.5.1 |
| jline | jline | 2.14.6 |
| joda-time | joda-time | 2.13.0 |
| net.java.dev.jna | jna | 5.8.0 |
| net.razorvine | pickle | 1.5 |
| net.sf.jpam | jpam | 1.1 |
| net.sf.opencsv | opencsv | 2.3 |
| net.sf.supercsv | super-csv | 2.2.0 |
| net.snowflake | snowflake-ingest-sdk | 0.9.6 |
| net.sourceforge.f2j | arpack_combined_all | 0.1 |
| org.acplt.remotetea | remotetea-oncrpc | 1.1.2 |
| org.antlr | ST4 | 4.0.4 |
| org.antlr | antlr-runtime | 3.5.2 |
| org.antlr | antlr4-runtime | 4.13.1 |
| org.antlr | stringtemplate | 3.2.1 |
| org.apache.ant | ant | 1.10.11 |
| org.apache.ant | ant-jsch | 1.10.11 |
| org.apache.ant | ant-launcher | 1.10.11 |
| org.apache.arrow | arrow-format | 18.2.0 |
| org.apache.arrow | arrow-memory-core | 18.2.0 |
| org.apache.arrow | arrow-memory-netty | 18.2.0 |
| org.apache.arrow | arrow-memory-netty-buffer-patch | 18.2.0 |
| org.apache.arrow | arrow-vector | 18.2.0 |
| org.apache.avro | avro | 1.12.0 |
| org.apache.avro | avro-ipc | 1.12.0 |
| org.apache.avro | avro-mapred | 1.12.0 |
| org.apache.commons | commons-collections4 | 4.4 |
| org.apache.commons | commons-compress | 1.27.1 |
| org.apache.commons | commons-crypto | 1.1.0 |
| org.apache.commons | commons-lang3 | 3.17.0 |
| org.apache.commons | commons-math3 | 3.6.1 |
| org.apache.commons | commons-text | 1.13.0 |
| org.apache.curator | curator-client | 5.7.1 |
| org.apache.curator | curator-framework | 5.7.1 |
| org.apache.curator | curator-recipes | 5.7.1 |
| org.apache.datasketches | datasketches-java | 6.1.1 |
| org.apache.datasketches | datasketches-memory | 3.0.2 |
| org.apache.derby | derby | 10.14.2.0 |
| org.apache.hadoop | hadoop-client-runtime | 3.4.1 |
| org.apache.hive | hive-beeline | 2.3.10 |
| org.apache.hive | hive-cli | 2.3.10 |
| org.apache.hive | hive-jdbc | 2.3.10 |
| org.apache.hive | hive-llap-client | 2.3.10 |
| org.apache.hive | hive-llap-common | 2.3.10 |
| org.apache.hive | hive-serde | 2.3.10 |
| org.apache.hive | hive-shims | 2.3.10 |
| org.apache.hive | hive-storage-api | 2.8.1 |
| org.apache.hive.shims | hive-shims-0.23 | 2.3.10 |
| org.apache.hive.shims | hive-shims-common | 2.3.10 |
| org.apache.hive.shims | hive-shims-scheduler | 2.3.10 |
| org.apache.httpcomponents | httpclient | 4.5.14 |
| org.apache.httpcomponents | httpcore | 4.4.16 |
| org.apache.ivy | ivy | 2.5.3 |
| org.apache.logging.log4j | log4j-1.2-api | 2.24.3 |
| org.apache.logging.log4j | log4j-api | 2.24.3 |
| org.apache.logging.log4j | log4j-core | 2.24.3 |
| org.apache.logging.log4j | log4j-layout-template-json | 2.24.3 |
| org.apache.logging.log4j | log4j-slf4j2-impl | 2.24.3 |
| org.apache.orc | orc-core | 2.1.1-shaded-protobuf |
| org.apache.orc | orc-format | 1.1.0-shaded-protobuf |
| org.apache.orc | orc-mapreduce | 2.1.1-shaded-protobuf |
| org.apache.orc | orc-shims | 2.1.1 |
| org.apache.poi | poi | 5.4.1 |
| org.apache.poi | poi-ooxml | 5.4.1 |
| org.apache.poi | poi-ooxml-full | 5.4.1 |
| org.apache.poi | poi-ooxml-lite | 5.4.1 |
| org.apache.thrift | libfb303 | 0.9.3 |
| org.apache.thrift | libthrift | 0.16.0 |
| org.apache.ws.xmlschema | xmlschema-core | 2.3.1 |
| org.apache.xbean | xbean-asm9-shaded | 4.26 |
| org.apache.xmlbeans | xmlbeans | 5.3.0 |
| org.apache.yetus | audience-annotations | 0.13.0 |
| org.apache.zookeeper | zookeeper | 3.9.3 |
| org.apache.zookeeper | zookeeper-jute | 3.9.3 |
| org.checkerframework | checker-qual | 3.43.0 |
| org.codehaus.janino | commons-compiler | 3.0.16 |
| org.codehaus.janino | janino | 3.0.16 |
| org.datanucleus | datanucleus-api-jdo | 4.2.4 |
| org.datanucleus | datanucleus-core | 4.1.17 |
| org.datanucleus | datanucleus-rdbms | 4.1.19 |
| org.datanucleus | javax.jdo | 3.2.0-m3 |
| org.eclipse.jetty | jetty-client | 9.4.53.v20231009 |
| org.eclipse.jetty | jetty-continuation | 9.4.53.v20231009 |
| org.eclipse.jetty | jetty-http | 9.4.53.v20231009 |
| org.eclipse.jetty | jetty-io | 9.4.53.v20231009 |
| org.eclipse.jetty | jetty-jndi | 9.4.53.v20231009 |
| org.eclipse.jetty | jetty-plus | 9.4.53.v20231009 |
| org.eclipse.jetty | jetty-proxy | 9.4.53.v20231009 |
| org.eclipse.jetty | jetty-security | 9.4.53.v20231009 |
| org.eclipse.jetty | jetty-server | 9.4.53.v20231009 |
| org.eclipse.jetty | jetty-servlet | 9.4.53.v20231009 |
| org.eclipse.jetty | jetty-servlets | 9.4.53.v20231009 |
| org.eclipse.jetty | jetty-util | 9.4.53.v20231009 |
| org.eclipse.jetty | jetty-util-ajax | 9.4.53.v20231009 |
| org.eclipse.jetty | jetty-webapp | 9.4.53.v20231009 |
| org.eclipse.jetty | jetty-xml | 9.4.53.v20231009 |
| org.eclipse.jetty.websocket | websocket-api | 9.4.53.v20231009 |
| org.eclipse.jetty.websocket | websocket-client | 9.4.53.v20231009 |
| org.eclipse.jetty.websocket | websocket-common | 9.4.53.v20231009 |
| org.eclipse.jetty.websocket | websocket-server | 9.4.53.v20231009 |
| org.eclipse.jetty.websocket | websocket-servlet | 9.4.53.v20231009 |
| org.fusesource.leveldbjni | leveldbjni-all | 1.8 |
| org.glassfish.hk2 | hk2-api | 2.6.1 |
| org.glassfish.hk2 | hk2-locator | 2.6.1 |
| org.glassfish.hk2 | hk2-utils | 2.6.1 |
| org.glassfish.hk2 | osgi-resource-locator | 1.0.3 |
| org.glassfish.hk2.external | aopalliance-repackaged | 2.6.1 |
| org.glassfish.hk2.external | jakarta.inject | 2.6.1 |
| org.glassfish.jersey.containers | jersey-container-servlet | 2.41 |
| org.glassfish.jersey.containers | jersey-container-servlet-core | 2.41 |
| org.glassfish.jersey.core | jersey-client | 2.41 |
| org.glassfish.jersey.core | jersey-common | 2.41 |
| org.glassfish.jersey.core | jersey-server | 2.41 |
| org.glassfish.jersey.inject | jersey-hk2 | 2.41 |
| org.hibernate.validator | hibernate-validator | 6.2.5.Final |
| org.ini4j | ini4j | 0.5.4 |
| org.javassist | javassist | 3.29.2-GA |
| org.jboss.logging | jboss-logging | 3.4.1.Final |
| org.jdbi | jdbi | 2.63.1 |
| org.jetbrains | annotations | 17.0.0 |
| org.joda | joda-convert | 1.7 |
| org.jodd | jodd-core | 3.5.2 |
| org.json4s | json4s-ast_2.12 | 4.0.7 |
| org.json4s | json4s-core_2.12 | 4.0.7 |
| org.json4s | json4s-jackson-core_2.12 | 4.0.7 |
| org.json4s | json4s-jackson_2.12 | 4.0.7 |
| org.json4s | json4s-scalap_2.12 | 4.0.7 |
| org.lz4 | lz4-java | 1.8.0-databricks-1 |
| org.mlflow | mlflow-spark_2.12 | 2.9.1 |
| org.objenesis | objenesis | 3.3 |
| org.postgresql | postgresql | 42.6.1 |
| org.roaringbitmap | RoaringBitmap | 1.2.1 |
| org.rocksdb | rocksdbjni | 9.8.4 |
| org.rosuda.REngine | REngine | 2.1.0 |
| org.scala-lang | scala-compiler_2.12 | 2.12.15 |
| org.scala-lang | scala-library_2.12 | 2.12.15 |
| org.scala-lang | scala-reflect_2.12 | 2.12.15 |
| org.scala-lang.modules | scala-collection-compat_2.12 | 2.11.0 |
| org.scala-lang.modules | scala-java8-compat_2.12 | 0.9.1 |
| org.scala-lang.modules | scala-parser-combinators_2.12 | 2.4.0 |
| org.scala-lang.modules | scala-xml_2.12 | 2.3.0 |
| org.scala-sbt | test-interface | 1.0 |
| org.scalacheck | scalacheck_2.12 | 1.18.0 |
| org.scalactic | scalactic_2.12 | 3.2.19 |
| org.scalanlp | breeze-macros_2.12 | 2.1.0 |
| org.scalanlp | breeze_2.12 | 2.1.0 |
| org.scalatest | scalatest-compatible | 3.2.19 |
| org.scalatest | scalatest-core_2.12 | 3.2.19 |
| org.scalatest | scalatest-diagrams_2.12 | 3.2.19 |
| org.scalatest | scalatest-featurespec_2.12 | 3.2.19 |
| org.scalatest | scalatest-flatspec_2.12 | 3.2.19 |
| org.scalatest | scalatest-freespec_2.12 | 3.2.19 |
| org.scalatest | scalatest-funspec_2.12 | 3.2.19 |
| org.scalatest | scalatest-funsuite_2.12 | 3.2.19 |
| org.scalatest | scalatest-matchers-core_2.12 | 3.2.19 |
| org.scalatest | scalatest-mustmatchers_2.12 | 3.2.19 |
| org.scalatest | scalatest-propspec_2.12 | 3.2.19 |
| org.scalatest | scalatest-refspec_2.12 | 3.2.19 |
| org.scalatest | scalatest-shouldmatchers_2.12 | 3.2.19 |
| org.scalatest | scalatest-wordspec_2.12 | 3.2.19 |
| org.scalatest | scalatest_2.12 | 3.2.19 |
| org.slf4j | jcl-over-slf4j | 2.0.16 |
| org.slf4j | jul-to-slf4j | 2.0.16 |
| org.slf4j | slf4j-api | 2.0.16 |
| org.slf4j | slf4j-simple | 1.7.25 |
| org.threeten | threeten-extra | 1.8.0 |
| org.tukaani | xz | 1.10 |
| org.typelevel | algebra_2.12 | 2.0.1 |
| org.typelevel | cats-kernel_2.12 | 2.1.1 |
| org.typelevel | spire-macros_2.12 | 0.17.0 |
| org.typelevel | spire-platform_2.12 | 0.17.0 |
| org.typelevel | spire-util_2.12 | 0.17.0 |
| org.typelevel | spire_2.12 | 0.17.0 |
| org.wildfly.openssl | wildfly-openssl | 1.1.3.Final |
| org.xerial | sqlite-jdbc | 3.42.0.0 |
| org.xerial.snappy | snappy-java | 1.1.10.3 |
| org.yaml | snakeyaml | 2.0 |
| oro | oro | 2.0.8 |
| pl.edu.icm | JLargeArrays | 1.5 |
| software.amazon.cryptools | AmazonCorrettoCryptoProvider | 2.4.1-linux-x86_64 |
| stax | stax-api | 1.0.1 |
