Databricks Runtime 14.0 (EoS)
Note
Support for this Databricks Runtime version has ended. For the end-of-support date, see End-of-support history. For all supported Databricks Runtime versions, see Databricks Runtime release notes versions and compatibility.
The following release notes provide information about Databricks Runtime 14.0, powered by Apache Spark 3.5.0.
Databricks released this version in September 2023.
New features and improvements
Row tracking is GA
Row tracking for Delta Lake is now generally available. See Use row tracking for Delta tables.
Predictive I/O for updates is GA
Predictive I/O for updates is now generally available. See What is predictive I/O?.
Deletion vectors are GA
Deletion vectors are now generally available. See What are deletion vectors?.
Spark 3.5.0 is GA
Apache Spark 3.5.0 is now generally available. See Spark Release 3.5.0.
Public preview for user-defined table functions for Python
User-defined table functions (UDTFs) allow you to register functions that return tables instead of scalar values. See Python user-defined table functions (UDTFs).
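As a minimal sketch of what such a function can look like, the handler below yields one output row per integer in a range. The class name `SquareNumbers`, the column names, and the helper `register_square_numbers` are illustrative and do not come from these release notes. The sketch assumes the `pyspark.sql.functions.udtf` API from Apache Spark 3.5.

```python
from typing import Iterator, Tuple


class SquareNumbers:
    """Handler for a Python UDTF: eval() yields one tuple per output row."""

    def eval(self, start: int, end: int) -> Iterator[Tuple[int, int]]:
        # Each yielded tuple becomes one row of the returned table.
        for num in range(start, end + 1):
            yield (num, num * num)


def register_square_numbers(spark):
    """Wrap the handler as a UDTF and register it for use from SQL.

    Assumes `spark` is a SparkSession on Apache Spark 3.5 or above. pyspark is
    imported lazily so the handler class stays importable without Spark.
    """
    from pyspark.sql.functions import udtf

    square_udtf = udtf(SquareNumbers, returnType="num: int, squared: int")
    spark.udtf.register("square_numbers", square_udtf)  # hypothetical SQL name
    return square_udtf
```

Once registered, a query such as `SELECT * FROM square_numbers(1, 3)` would invoke `eval` and return one row per yielded tuple.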
Public preview for row-level concurrency
Row-level concurrency reduces conflicts between concurrent write operations by detecting changes at the row-level and automatically resolving competing changes in concurrent writes that update or delete different rows in the same data file. See Write conflicts with row-level concurrency.
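The difference between file-level and row-level conflict detection can be illustrated with a toy model. This is only a sketch of the idea, not Delta Lake's actual conflict-resolution logic. The `RowRef` type and both function names are hypothetical.

```python
from typing import Set, Tuple

# A touched row is identified by (data file name, row position in that file).
RowRef = Tuple[str, int]


def conflicts_at_file_level(txn_a: Set[RowRef], txn_b: Set[RowRef]) -> bool:
    """Two writes conflict if they touch any common data file."""
    files_a = {file_name for file_name, _ in txn_a}
    files_b = {file_name for file_name, _ in txn_b}
    return bool(files_a & files_b)


def conflicts_at_row_level(txn_a: Set[RowRef], txn_b: Set[RowRef]) -> bool:
    """Two writes conflict only if they touch the same row of the same file."""
    return bool(txn_a & txn_b)


# Two concurrent updates to different rows of the same file.
update_a = {("part-0000.parquet", 1)}
update_b = {("part-0000.parquet", 7)}
print(conflicts_at_file_level(update_a, update_b))
print(conflicts_at_row_level(update_a, update_b))
```

In this model the two updates collide under file-level detection but not under row-level detection, which is the case the feature description targets.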
Default current working directory has changed
The default current working directory (CWD) for code executed locally is now the directory containing the notebook or script being run. This includes code such as %sh and Python or R code that does not use Spark. See What is the default current working directory?.
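Because relative paths in local code resolve against the CWD, this change can alter which file a relative path refers to. A small sketch for checking what a path resolves to follows. The helper name `resolve_local_path` and the path `data/input.csv` are illustrative.

```python
import os


def resolve_local_path(relative_path: str) -> str:
    """Return the absolute path a relative path maps to under the current CWD."""
    return os.path.abspath(os.path.join(os.getcwd(), relative_path))


# Print the CWD and the location that a relative path would point to.
print(os.getcwd())
print(resolve_local_path("data/input.csv"))
```

Running this in a notebook before and after upgrading is one way to spot code that depended on the previous default directory.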
Known issue with sparklyr
The installed version of the sparklyr package (version 1.8.1) is not compatible with Databricks Runtime 14.0. To use sparklyr, install version 1.8.3 or above.
List available Spark versions API update
Enable Photon by setting runtime_engine = PHOTON, and enable aarch64 by choosing a Graviton instance type. Databricks sets the correct Databricks Runtime version. Previously, the Spark version API would return implementation-specific runtimes for each version. See GET /api/2.0/clusters/spark-versions in the REST API Reference.
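The following sketch shows one way to call this endpoint and to build a cluster specification that requests Photon. The helper names, the example host, and the sample response body are assumptions for illustration. The response is assumed to contain a `versions` list whose entries carry a `key` field; check the REST API Reference for the authoritative shape.

```python
import json
import urllib.request
from typing import Dict, List


def build_spark_versions_request(host: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the available Spark versions."""
    url = f"{host.rstrip('/')}/api/2.0/clusters/spark-versions"
    headers = {"Authorization": f"Bearer {token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def parse_version_keys(response_body: str) -> List[str]:
    """Extract the version keys from a JSON response body (assumed shape)."""
    payload = json.loads(response_body)
    return [entry["key"] for entry in payload.get("versions", [])]


def photon_cluster_spec(spark_version: str, node_type_id: str) -> Dict[str, str]:
    """Cluster fields that select Photon through runtime_engine."""
    return {
        "spark_version": spark_version,
        "node_type_id": node_type_id,  # choose a Graviton type to get aarch64
        "runtime_engine": "PHOTON",
    }


# Illustrative response body, not captured from a real workspace.
sample_body = '{"versions": [{"key": "14.0.x-scala2.12", "name": "14.0"}]}'
keys = parse_version_keys(sample_body)
spec = photon_cluster_spec(keys[0], "m6gd.xlarge")  # hypothetical node type
```

To send the request, pass the object returned by `build_spark_versions_request` to `urllib.request.urlopen` and feed the decoded body to `parse_version_keys`.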
Breaking changes
In Databricks Runtime 14.0 and above, clusters with shared access mode use Spark Connect for client-server communication. This includes the following changes.
For more on shared access mode limitations, see Compute access mode limitations for Unity Catalog.
Library upgrades
Upgraded Python libraries:
asttokens from 2.2.1 to 2.0.5
attrs from 21.4.0 to 22.1.0
botocore from 1.27.28 to 1.27.96
certifi from 2022.9.14 to 2022.12.7
cryptography from 37.0.1 to 39.0.1
debugpy from 1.6.0 to 1.6.7
docstring-to-markdown from 0.12 to 0.11
executing from 1.2.0 to 0.8.3
facets-overview from 1.0.3 to 1.1.1
googleapis-common-protos from 1.56.4 to 1.60.0
grpcio from 1.48.1 to 1.48.2
idna from 3.3 to 3.4
ipykernel from 6.17.1 to 6.25.0
ipython from 8.10.0 to 8.14.0
Jinja2 from 2.11.3 to 3.1.2
jsonschema from 4.16.0 to 4.17.3
jupyter_core from 4.11.2 to 5.2.0
kiwisolver from 1.4.2 to 1.4.4
MarkupSafe from 2.0.1 to 2.1.1
matplotlib from 3.5.2 to 3.7.0
nbconvert from 6.4.4 to 6.5.4
nbformat from 5.5.0 to 5.7.0
nest-asyncio from 1.5.5 to 1.5.6
notebook from 6.4.12 to 6.5.2
numpy from 1.21.5 to 1.23.5
packaging from 21.3 to 22.0
pandas from 1.4.4 to 1.5.3
pathspec from 0.9.0 to 0.10.3
patsy from 0.5.2 to 0.5.3
Pillow from 9.2.0 to 9.4.0
pip from 22.2.2 to 22.3.1
protobuf from 3.19.4 to 4.24.0
pytoolconfig from 1.2.2 to 1.2.5
pytz from 2022.1 to 2022.7
s3transfer from 0.6.0 to 0.6.1
seaborn from 0.11.2 to 0.12.2
setuptools from 63.4.1 to 65.6.3
soupsieve from 2.3.1 to 2.3.2.post1
stack-data from 0.6.2 to 0.2.0
statsmodels from 0.13.2 to 0.13.5
terminado from 0.13.1 to 0.17.1
traitlets from 5.1.1 to 5.7.1
typing_extensions from 4.3.0 to 4.4.0
urllib3 from 1.26.11 to 1.26.14
virtualenv from 20.16.3 to 20.16.7
wheel from 0.37.1 to 0.38.4
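To confirm which of these versions an environment actually provides, a short check with the standard library's `importlib.metadata` can help. This is a sketch: the `EXPECTED` mapping copies three entries from the list above, and the helper names are illustrative.

```python
from importlib import metadata
from typing import Callable, Dict, Optional

# A few target versions taken from the list above; extend as needed.
EXPECTED = {"numpy": "1.23.5", "pandas": "1.5.3", "protobuf": "4.24.0"}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    expected: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Optional[str]]:
    """Map each package whose installed version differs to the version found."""
    found = {name: lookup(name) for name in expected}
    return {name: ver for name, ver in found.items() if ver != expected[name]}


print(find_mismatches(EXPECTED))
```

An empty result means every listed package matches. A value of `None` marks a package that is not installed.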
Upgraded R libraries:
arrow from 10.0.1 to 12.0.1
base from 4.2.2 to 4.3.1
blob from 1.2.3 to 1.2.4
broom from 1.0.3 to 1.0.5
bslib from 0.4.2 to 0.5.0
cachem from 1.0.6 to 1.0.8
caret from 6.0-93 to 6.0-94
chron from 2.3-59 to 2.3-61
class from 7.3-21 to 7.3-22
cli from 3.6.0 to 3.6.1
clock from 0.6.1 to 0.7.0
commonmark from 1.8.1 to 1.9.0
compiler from 4.2.2 to 4.3.1
cpp11 from 0.4.3 to 0.4.4
curl from 5.0.0 to 5.0.1
data.table from 1.14.6 to 1.14.8
datasets from 4.2.2 to 4.3.1
dbplyr from 2.3.0 to 2.3.3
digest from 0.6.31 to 0.6.33
downlit from 0.4.2 to 0.4.3
dplyr from 1.1.0 to 1.1.2
dtplyr from 1.2.2 to 1.3.1
evaluate from 0.20 to 0.21
fastmap from 1.1.0 to 1.1.1
fontawesome from 0.5.0 to 0.5.1
fs from 1.6.1 to 1.6.2
future from 1.31.0 to 1.33.0
future.apply from 1.10.0 to 1.11.0
gargle from 1.3.0 to 1.5.1
ggplot2 from 3.4.0 to 3.4.2
gh from 1.3.1 to 1.4.0
glmnet from 4.1-6 to 4.1-7
googledrive from 2.0.0 to 2.1.1
googlesheets4 from 1.0.1 to 1.1.1
graphics from 4.2.2 to 4.3.1
grDevices from 4.2.2 to 4.3.1
grid from 4.2.2 to 4.3.1
gtable from 0.3.1 to 0.3.3
hardhat from 1.2.0 to 1.3.0
haven from 2.5.1 to 2.5.3
hms from 1.1.2 to 1.1.3
htmltools from 0.5.4 to 0.5.5
htmlwidgets from 1.6.1 to 1.6.2
httpuv from 1.6.8 to 1.6.11
httr from 1.4.4 to 1.4.6
ipred from 0.9-13 to 0.9-14
jsonlite from 1.8.4 to 1.8.7
KernSmooth from 2.23-20 to 2.23-21
knitr from 1.42 to 1.43
later from 1.3.0 to 1.3.1
lattice from 0.20-45 to 0.21-8
lava from 1.7.1 to 1.7.2.1
lubridate from 1.9.1 to 1.9.2
markdown from 1.5 to 1.7
MASS from 7.3-58.2 to 7.3-60
Matrix from 1.5-1 to 1.5-4.1
methods from 4.2.2 to 4.3.1
mgcv from 1.8-41 to 1.8-42
modelr from 0.1.10 to 0.1.11
nnet from 7.3-18 to 7.3-19
openssl from 2.0.5 to 2.0.6
parallel from 4.2.2 to 4.3.1
parallelly from 1.34.0 to 1.36.0
pillar from 1.8.1 to 1.9.0
pkgbuild from 1.4.0 to 1.4.2
pkgload from 1.3.2 to 1.3.2.1
pROC from 1.18.0 to 1.18.4
processx from 3.8.0 to 3.8.2
prodlim from 2019.11.13 to 2023.03.31
profvis from 0.3.7 to 0.3.8
ps from 1.7.2 to 1.7.5
Rcpp from 1.0.10 to 1.0.11
readr from 2.1.3 to 2.1.4
readxl from 1.4.2 to 1.4.3
recipes from 1.0.4 to 1.0.6
rlang from 1.0.6 to 1.1.1
rmarkdown from 2.20 to 2.23
Rserve from 1.8-12 to 1.8-11
RSQLite from 2.2.20 to 2.3.1
rstudioapi from 0.14 to 0.15.0
sass from 0.4.5 to 0.4.6
shiny from 1.7.4 to 1.7.4.1
sparklyr from 1.7.9 to 1.8.1
SparkR from 3.4.1 to 3.5.0
splines from 4.2.2 to 4.3.1
stats from 4.2.2 to 4.3.1
stats4 from 4.2.2 to 4.3.1
survival from 3.5-3 to 3.5-5
sys from 3.4.1 to 3.4.2
tcltk from 4.2.2 to 4.3.1
testthat from 3.1.6 to 3.1.10
tibble from 3.1.8 to 3.2.1
tidyverse from 1.3.2 to 2.0.0
tinytex from 0.44 to 0.45
tools from 4.2.2 to 4.3.1
tzdb from 0.3.0 to 0.4.0
usethis from 2.1.6 to 2.2.2
utils from 4.2.2 to 4.3.1
vctrs from 0.5.2 to 0.6.3
viridisLite from 0.4.1 to 0.4.2
vroom from 1.6.1 to 1.6.3
waldo from 0.4.0 to 0.5.1
xfun from 0.37 to 0.39
xml2 from 1.3.3 to 1.3.5
zip from 2.2.2 to 2.3.0
Upgraded Java libraries:
com.fasterxml.jackson.core.jackson-annotations from 2.14.2 to 2.15.2
com.fasterxml.jackson.core.jackson-core from 2.14.2 to 2.15.2
com.fasterxml.jackson.core.jackson-databind from 2.14.2 to 2.15.2
com.fasterxml.jackson.dataformat.jackson-dataformat-cbor from 2.14.2 to 2.15.2
com.fasterxml.jackson.datatype.jackson-datatype-joda from 2.14.2 to 2.15.2
com.fasterxml.jackson.datatype.jackson-datatype-jsr310 from 2.13.4 to 2.15.1
com.fasterxml.jackson.module.jackson-module-paranamer from 2.14.2 to 2.15.2
com.fasterxml.jackson.module.jackson-module-scala_2.12 from 2.14.2 to 2.15.2
com.github.luben.zstd-jni from 1.5.2-5 to 1.5.5-4
com.google.code.gson.gson from 2.8.9 to 2.10.1
com.google.crypto.tink.tink from 1.7.0 to 1.9.0
commons-codec.commons-codec from 1.15 to 1.16.0
commons-io.commons-io from 2.11.0 to 2.13.0
io.airlift.aircompressor from 0.21 to 0.24
io.dropwizard.metrics.metrics-core from 4.2.10 to 4.2.19
io.dropwizard.metrics.metrics-graphite from 4.2.10 to 4.2.19
io.dropwizard.metrics.metrics-healthchecks from 4.2.10 to 4.2.19
io.dropwizard.metrics.metrics-jetty9 from 4.2.10 to 4.2.19
io.dropwizard.metrics.metrics-jmx from 4.2.10 to 4.2.19
io.dropwizard.metrics.metrics-json from 4.2.10 to 4.2.19
io.dropwizard.metrics.metrics-jvm from 4.2.10 to 4.2.19
io.dropwizard.metrics.metrics-servlets from 4.2.10 to 4.2.19
io.netty.netty-all from 4.1.87.Final to 4.1.93.Final
io.netty.netty-buffer from 4.1.87.Final to 4.1.93.Final
io.netty.netty-codec from 4.1.87.Final to 4.1.93.Final
io.netty.netty-codec-http from 4.1.87.Final to 4.1.93.Final
io.netty.netty-codec-http2 from 4.1.87.Final to 4.1.93.Final
io.netty.netty-codec-socks from 4.1.87.Final to 4.1.93.Final
io.netty.netty-common from 4.1.87.Final to 4.1.93.Final
io.netty.netty-handler from 4.1.87.Final to 4.1.93.Final
io.netty.netty-handler-proxy from 4.1.87.Final to 4.1.93.Final
io.netty.netty-resolver from 4.1.87.Final to 4.1.93.Final
io.netty.netty-transport from 4.1.87.Final to 4.1.93.Final
io.netty.netty-transport-classes-epoll from 4.1.87.Final to 4.1.93.Final
io.netty.netty-transport-classes-kqueue from 4.1.87.Final to 4.1.93.Final
io.netty.netty-transport-native-epoll from 4.1.87.Final-linux-x86_64 to 4.1.93.Final-linux-x86_64
io.netty.netty-transport-native-kqueue from 4.1.87.Final-osx-x86_64 to 4.1.93.Final-osx-x86_64
io.netty.netty-transport-native-unix-common from 4.1.87.Final to 4.1.93.Final
org.apache.arrow.arrow-format from 11.0.0 to 12.0.1
org.apache.arrow.arrow-memory-core from 11.0.0 to 12.0.1
org.apache.arrow.arrow-memory-netty from 11.0.0 to 12.0.1
org.apache.arrow.arrow-vector from 11.0.0 to 12.0.1
org.apache.avro.avro from 1.11.1 to 1.11.2
org.apache.avro.avro-ipc from 1.11.1 to 1.11.2
org.apache.avro.avro-mapred from 1.11.1 to 1.11.2
org.apache.commons.commons-compress from 1.21 to 1.23.0
org.apache.hadoop.hadoop-client-runtime from 3.3.4 to 3.3.6
org.apache.logging.log4j.log4j-1.2-api from 2.19.0 to 2.20.0
org.apache.logging.log4j.log4j-api from 2.19.0 to 2.20.0
org.apache.logging.log4j.log4j-core from 2.19.0 to 2.20.0
org.apache.logging.log4j.log4j-slf4j2-impl from 2.19.0 to 2.20.0
org.apache.orc.orc-core from 1.8.4-shaded-protobuf to 1.9.0-shaded-protobuf
org.apache.orc.orc-mapreduce from 1.8.4-shaded-protobuf to 1.9.0-shaded-protobuf
org.apache.orc.orc-shims from 1.8.4 to 1.9.0
org.apache.xbean.xbean-asm9-shaded from 4.22 to 4.23
org.checkerframework.checker-qual from 3.19.0 to 3.31.0
org.glassfish.jersey.containers.jersey-container-servlet from 2.36 to 2.40
org.glassfish.jersey.containers.jersey-container-servlet-core from 2.36 to 2.40
org.glassfish.jersey.core.jersey-client from 2.36 to 2.40
org.glassfish.jersey.core.jersey-common from 2.36 to 2.40
org.glassfish.jersey.core.jersey-server from 2.36 to 2.40
org.glassfish.jersey.inject.jersey-hk2 from 2.36 to 2.40
org.javassist.javassist from 3.25.0-GA to 3.29.2-GA
org.mariadb.jdbc.mariadb-java-client from 2.7.4 to 2.7.9
org.postgresql.postgresql from 42.3.8 to 42.6.0
org.roaringbitmap.RoaringBitmap from 0.9.39 to 0.9.45
org.roaringbitmap.shims from 0.9.39 to 0.9.45
org.rocksdb.rocksdbjni from 7.8.3 to 8.3.2
org.scala-lang.modules.scala-collection-compat_2.12 from 2.4.3 to 2.9.0
org.slf4j.jcl-over-slf4j from 2.0.6 to 2.0.7
org.slf4j.jul-to-slf4j from 2.0.6 to 2.0.7
org.slf4j.slf4j-api from 2.0.6 to 2.0.7
org.xerial.snappy.snappy-java from 1.1.10.1 to 1.1.10.3
org.yaml.snakeyaml from 1.33 to 2.0
Apache Spark
Databricks Runtime 14.0 includes Apache Spark 3.5.0. This release includes all Spark fixes and improvements included in Databricks Runtime 13.3 LTS, as well as the following additional bug fixes and improvements made to Spark:
[SPARK-45109] [DBRRM-462][SC-142247][SQL][CONNECT] Fix aes_decrypt and ln functions in Connect
[SPARK-44980] [DBRRM-462][SC-141024][PYTHON][CONNECT] Fix inherited namedtuples to work in createDataFrame
[SPARK-44795] [DBRRM-462][SC-139720][CONNECT] CodeGenerator Cache should be classloader specific
[SPARK-44861] [DBRRM-498][SC-140716][CONNECT] jsonignore SparkListenerConnectOperationStarted.planRequest
[SPARK-44794] [DBRRM-462][SC-139767][CONNECT] Make Streaming Queries work with Connect’s artifact management
[SPARK-44791] [DBRRM-462][SC-139623][CONNECT] Make ArrowDeserializer work with REPL generated classes
[SPARK-44876] [DBRRM-480][SC-140431][PYTHON] Fix Arrow-optimized Python UDF on Spark Connect
[SPARK-44877] [DBRRM-482][SC-140437][CONNECT][PYTHON] Support python protobuf functions for Spark Connect
[SPARK-44882] [DBRRM-463][SC-140430][PYTHON][CONNECT] Remove function uuid/random/chr from PySpark
[SPARK-44740] [DBRRM-462][SC-140320][CONNECT][FOLLOW] Fix metadata values for Artifacts
[SPARK-44822] [DBRRM-464][PYTHON][SQL] Make Python UDTFs by default non-deterministic
[SPARK-44836] [DBRRM-468][SC-140228][PYTHON] Refactor Arrow Python UDTF
[SPARK-44738] [DBRRM-462][SC-139347][PYTHON][CONNECT] Add missing client metadata to calls
[SPARK-44722] [DBRRM-462][SC-139306][CONNECT] ExecutePlanResponseReattachableIterator._call_iter: AttributeError: ‘NoneType’ object has no attribute ‘message’
[SPARK-44625] [DBRRM-396][SC-139535][CONNECT] SparkConnectExecutionManager to track all executions
[SPARK-44663] [SC-139020][DBRRM-420][PYTHON] Disable arrow optimization by default for Python UDTFs
[SPARK-44709] [DBRRM-396][SC-139250][CONNECT] Run ExecuteGrpcResponseSender in reattachable execute in new thread to fix flow control
[SPARK-44656] [DBRRM-396][SC-138924][CONNECT] Make all iterators CloseableIterators
[SPARK-44671] [DBRRM-396][SC-138929][PYTHON][CONNECT] Retry ExecutePlan in case initial request didn’t reach server in Python client
[SPARK-44624] [DBRRM-396][SC-138919][CONNECT] Retry ExecutePlan in case initial request didn’t reach server
[SPARK-44574] [DBRRM-396][SC-138288][SQL][CONNECT] Errors that moved into sql/api should also use AnalysisException
[SPARK-44613] [DBRRM-396][SC-138473][CONNECT] Add Encoders object
[SPARK-44626] [DBRRM-396][SC-138828][SS][CONNECT] Followup on streaming query termination when client session is timed out for Spark Connect
[SPARK-44642] [DBRRM-396][SC-138882][CONNECT] ReleaseExecute in ExecutePlanResponseReattachableIterator after it gets error from server
[SPARK-41400] [DBRRM-396][SC-138287][CONNECT] Remove Connect Client Catalyst Dependency
[SPARK-44664] [DBRRM-396][PYTHON][CONNECT] Release the execute when closing the iterator in Python client
[SPARK-44631] [DBRRM-396][SC-138823][CONNECT][CORE][14.0.0] Remove session-based directory when the isolated session cache is evicted
[SPARK-42941] [DBRRM-396][SC-138389][SS][CONNECT] Python StreamingQueryListener
[SPARK-44636] [DBRRM-396][SC-138570][CONNECT] Leave no dangling iterators
[SPARK-44424] [DBRRM-396][CONNECT][PYTHON][14.0.0] Python client for reattaching to existing execute in Spark Connect
[SPARK-44637] [SC-138571] Synchronize accesses to ExecuteResponseObserver
[SPARK-44538] [SC-138178][CONNECT][SQL] Reinstate Row.jsonValue and friends
[SPARK-44421] [SC-138434][SPARK-44423][CONNECT] Reattachable execution in Spark Connect
[SPARK-44418] [SC-136807][PYTHON][CONNECT] Upgrade protobuf from 3.19.5 to 3.20.3
[SPARK-44587] [SC-138315][SQL][CONNECT] Increase protobuf marshaller recursion limit
[SPARK-44591] [SC-138292][CONNECT][SQL] Add jobTags to SparkListenerSQLExecutionStart
[SPARK-44610] [SC-138368][SQL] DeduplicateRelations should retain Alias metadata when creating a new instance
[SPARK-44542] [SC-138323][CORE] Eagerly load SparkExitCode class in exception handler
[SPARK-44264] [SC-138143][PYTHON] E2E Testing for Deepspeed
[SPARK-43997] [SC-138347][CONNECT] Add support for Java UDFs
[SPARK-44507] [SQL][CONNECT][14.x][14.0] Move AnalysisException to sql/api
[SPARK-44453] [SC-137013][PYTHON] Use difflib to display errors in assertDataFrameEqual
[SPARK-44394] [SC-138291][CONNECT][WEBUI][14.0] Add a Spark UI page for Spark Connect
[SPARK-44611] [SC-138415][CONNECT] Do not exclude scala-xml
[SPARK-44531] [SC-138044][CONNECT][SQL][14.x][14.0] Move encoder inference to sql/api
[SPARK-43744] [SC-138289][CONNECT][14.x][14.0] Fix class loading problem cau…
[SPARK-44590] [SC-138296][SQL][CONNECT] Remove the arrow batch record limit for SqlCommandResult
[SPARK-43968] [SC-138115][PYTHON] Improve error messages for Python UDTFs with wrong number of outputs
[SPARK-44432] [SC-138293][SS][CONNECT] Terminate streaming queries when a session times out in Spark Connect
[SPARK-44584] [SC-138295][CONNECT] Set client_type information for AddArtifactsRequest and ArtifactStatusesRequest in Scala Client
[SPARK-44552] [14.0][SC-138176][SQL] Remove private object ParseState definition from IntervalUtils
[SPARK-43660] [SC-136183][CONNECT][PS] Enable resample with Spark Connect
[SPARK-44287] [SC-136223][SQL] Use PartitionEvaluator API in RowToColumnarExec & ColumnarToRowExec SQL operators.
[SPARK-39634] [SC-137566][SQL] Allow file splitting in combination with row index generation
[SPARK-44533] [SC-138058][PYTHON] Add support for accumulator, broadcast, and Spark files in Python UDTF’s analyze
[SPARK-44479] [SC-138146][PYTHON] Fix ArrowStreamPandasUDFSerializer to accept no-column pandas DataFrame
[SPARK-44425] [SC-138177][CONNECT] Validate that user provided sessionId is an UUID
[SPARK-44535] [SC-138038][CONNECT][SQL] Move required Streaming API to sql/api
[SPARK-44264] [SC-136523][ML][PYTHON] Write a Deepspeed Distributed Learning Class DeepspeedTorchDistributor
[SPARK-42098] [SC-138164][SQL] Fix ResolveInlineTables can not handle with RuntimeReplaceable expression
[SPARK-44060] [SC-135693][SQL] Code-gen for build side outer shuffled hash join
[SPARK-44496] [SC-137682][SQL][CONNECT] Move Interfaces needed by SCSC to sql/api
[SPARK-44532] [SC-137893][CONNECT][SQL] Move ArrowUtils to sql/api
[SPARK-44413] [SC-137019][PYTHON] Clarify error for unsupported arg data type in assertDataFrameEqual
[SPARK-44530] [SC-138036][CORE][CONNECT] Move SparkBuildInfo to common/util
[SPARK-36612] [SC-133071][SQL] Support left outer join build left or right outer join build right in shuffled hash join
[SPARK-44519] [SC-137728][CONNECT] SparkConnectServerUtils generated incorrect parameters for jars
[SPARK-44449] [SC-137818][CONNECT] Upcasting for direct Arrow Deserialization
[SPARK-44131] [SC-136346][SQL] Add call_function and deprecate call_udf for Scala API
[SPARK-44541] [SQL] Remove useless function hasRangeExprAgainstEventTimeCol from UnsupportedOperationChecker
[SPARK-44523] [SC-137859][SQL] Filter’s maxRows/maxRowsPerPartition is 0 if condition is FalseLiteral
[SPARK-44540] [SC-137873][UI] Remove unused stylesheet and javascript files of jsonFormatter
[SPARK-44466] [SC-137856][SQL] Exclude configs starting with SPARK_DRIVER_PREFIX and SPARK_EXECUTOR_PREFIX from modifiedConfigs
[SPARK-44477] [SC-137508][SQL] Treat TYPE_CHECK_FAILURE_WITH_HINT as an error subclass
[SPARK-44509] [SC-137855][PYTHON][CONNECT] Add job cancellation API set in Spark Connect Python client
[SPARK-44059] [SC-137023] Add analyzer support of named arguments for built-in functions
[SPARK-38476] [SC-136448][CORE] Use error class in org.apache.spark.storage
[SPARK-44486] [SC-137817][PYTHON][CONNECT] Implement PyArrow self_destruct feature for toPandas
[SPARK-44361] [SC-137200][SQL] Use PartitionEvaluator API in MapInBatchExec
[SPARK-44510] [SC-137652][UI] Update dataTables to 1.13.5 and remove some unreached png files
[SPARK-44503] [SC-137808][SQL] Add SQL grammar for PARTITION BY and ORDER BY clause after TABLE arguments for TVF calls
[SPARK-38477] [SC-136319][CORE] Use error class in org.apache.spark.shuffle
[SPARK-44299] [SC-136088][SQL] Assign names to the error class _LEGACY_ERROR_TEMP_227[4-6,8]
[SPARK-44422] [SC-137567][CONNECT] Spark Connect fine grained interrupt
[SPARK-44380] [SC-137415][SQL][PYTHON] Support for Python UDTF to analyze in Python
[SPARK-43923] [SC-137020][CONNECT] Post listenerBus events durin…
[SPARK-44303] [SC-136108][SQL] Assign names to the error class _LEGACY_ERROR_TEMP_[2320-2324]
[SPARK-44294] [SC-135885][UI] Fix HeapHistogram column shows unexpectedly w/ select-all-box
[SPARK-44409] [SC-136975][SQL] Handle char/varchar in Dataset.to to keep consistent with others
[SPARK-44334] [SC-136576][SQL][UI] Status in the REST API response for a failed DDL/DML with no jobs should be FAILED rather than COMPLETED
[SPARK-42309] [SC-136703][SQL] Introduce INCOMPATIBLE_DATA_TO_TABLE and sub classes.
[SPARK-44367] [SC-137418][SQL][UI] Show error message on UI for each failed query
[SPARK-44474] [SC-137195][CONNECT] Reenable “Test observe response” at SparkConnectServiceSuite
[SPARK-44320] [SC-136446][SQL] Assign names to the error class _LEGACY_ERROR_TEMP_[1067,1150,1220,1265,1277]
[SPARK-44310] [SC-136055][CONNECT] The Connect Server startup log should display the hostname and port
[SPARK-44309] [SC-136193][UI] Display Add/Remove Time of Executors on Executors Tab
[SPARK-42898] [SC-137556][SQL] Mark that string/date casts do not need time zone id
[SPARK-44475] [SC-137422][SQL][CONNECT] Relocate DataType and Parser to sql/api
[SPARK-44484] [SC-137562][SS] Add batchDuration to StreamingQueryProgress json method
[SPARK-43966] [SC-137559][SQL][PYTHON] Support non-deterministic table-valued functions
[SPARK-44439] [SC-136973][CONNECT][SS] Fixed listListeners to only send ids back to client
[SPARK-44341] [SC-137054][SQL][PYTHON] Define the computing logic through PartitionEvaluator API and use it in WindowExec and WindowInPandasExec
[SPARK-43839] [SC-132680][SQL] Convert _LEGACY_ERROR_TEMP_1337 to UNSUPPORTED_FEATURE.TIME_TRAVEL
[SPARK-44244] [SC-135703][SQL] Assign names to the error class _LEGACY_ERROR_TEMP_[2305-2309]
[SPARK-44201] [SC-136778][CONNECT][SS] Add support for Streaming Listener in Scala for Spark Connect
[SPARK-44260] [SC-135618][SQL] Assign names to the error class _LEGACY_ERROR_TEMP_[1215-1245-2329] & Use checkError() to check Exception in CharVarcharSuite
[SPARK-42454] [SC-136913][SQL] SPJ: encapsulate all SPJ related parameters in BatchScanExec
[SPARK-44292] [SC-135844][SQL] Assign names to the error class _LEGACY_ERROR_TEMP_[2315-2319]
[SPARK-44396] [SC-137221][Connect] Direct Arrow Deserialization
[SPARK-44324] [SC-137172][SQL][CONNECT] Move CaseInsensitiveMap to sql/api
[SPARK-44395] [SC-136744][SQL] Add test back to StreamingTableSuite
[SPARK-44481] [SC-137401][CONNECT][PYTHON] Make pyspark.sql.is_remote an API
[SPARK-44278] [SC-137400][CONNECT] Implement a GRPC server interceptor that cleans up thread local properties
[SPARK-44264] [SC-137211][ML][PYTHON] Support Distributed Training of Functions Using Deepspeed
[SPARK-44430] [SC-136970][SQL] Add cause to AnalysisException when option is invalid
[SPARK-44264] [SC-137167][ML][PYTHON] Incorporating FunctionPickler Into TorchDistributor
[SPARK-44216] [SC-137046] [PYTHON] Make assertSchemaEqual API public
[SPARK-44398] [SC-136720][CONNECT] Scala foreachBatch API
[SPARK-43203] [SC-134528][SQL] Move all Drop Table case to DataSource V2
[SPARK-43755] [SC-137171][CONNECT][MINOR] Open AdaptiveSparkPlanHelper.allChildren instead of using copy in MetricGenerator
[SPARK-44264] [SC-137187][ML][PYTHON] Refactoring TorchDistributor To Allow for Custom “run_training_on_file” Function Pointer
[SPARK-43755] [SC-136838][CONNECT] Move execution out of SparkExecutePlanStreamHandler and to a different thread
[SPARK-44411] [SC-137198][SQL] Use PartitionEvaluator API in ArrowEvalPythonExec and BatchEvalPythonExec
[SPARK-44375] [SC-137197][SQL] Use PartitionEvaluator API in DebugExec
[SPARK-43967] [SC-137057][PYTHON] Support regular Python UDTFs with empty return values
[SPARK-43915] [SC-134766][SQL] Assign names to the error class _LEGACY_ERROR_TEMP_[2438-2445]
[SPARK-43965] [SC-136929][PYTHON][CONNECT] Support Python UDTF in Spark Connect
[SPARK-44154] [SC-137050][SQL] Added more unit tests to BitmapExpressionUtilsSuite and made minor improvements to Bitmap Aggregate Expressions
[SPARK-44169] [SC-135497][SQL] Assign names to the error class _LEGACY_ERROR_TEMP_[2300-2304]
[SPARK-44353] [SC-136578][CONNECT][SQL] Remove StructType.toAttributes
[SPARK-43964] [SC-136676][SQL][PYTHON] Support arrow-optimized Python UDTFs
[SPARK-44321] [SC-136308][CONNECT] Decouple ParseException from AnalysisException
[SPARK-44348] [SAS-1910][SC-136644][CORE][CONNECT][PYTHON] Reenable test_artifact with relevant changes
[SPARK-44145] [SC-136698][SQL] Callback when ready for execution
[SPARK-43983] [SC-136404][PYTHON][ML][CONNECT] Enable cross validator estimator test
[SPARK-44399] [SC-136669][PYTHON][CONNECT] Import SparkSession in Python UDF only when useArrow is None
[SPARK-43631] [SC-135300][CONNECT][PS] Enable Series.interpolate with Spark Connect
[SPARK-44374] [SC-136544][PYTHON][ML] Add example code for distributed ML for spark connect
[SPARK-44282] [SC-135948][CONNECT] Prepare DataType parsing for use in Spark Connect Scala Client
[SPARK-44052] [SC-134469][CONNECT][PS] Add util to get proper Column or DataFrame class for Spark Connect.
[SPARK-43983] [SC-136404][PYTHON][ML][CONNECT] Implement cross validator estimator
[SPARK-44290] [SC-136300][CONNECT] Session-based files and archives in Spark Connect
[SPARK-43710] [SC-134860][PS][CONNECT] Support functions.date_part for Spark Connect
[SPARK-44036] [SC-134036][CONNECT][PS] Cleanup & consolidate tickets to simplify the tasks.
[SPARK-44150] [SC-135790][PYTHON][CONNECT] Explicit Arrow casting for mismatched return type in Arrow Python UDF
[SPARK-43903] [SC-134754][PYTHON][CONNECT] Improve ArrayType input support in Arrow Python UDF
[SPARK-44250] [SC-135819][ML][PYTHON][CONNECT] Implement classification evaluator
[SPARK-44255] [SC-135704][SQL] Relocate StorageLevel to common/utils
[SPARK-42169] [SC-135735] [SQL] Implement code generation for to_csv function (StructsToCsv)
[SPARK-44249] [SC-135719][SQL][PYTHON] Refactor PythonUDTFRunner to send its return type separately
[SPARK-43353] [SC-132734][PYTHON] Migrate remaining session errors into error class
[SPARK-44133] [SC-134795][PYTHON] Upgrade MyPy from 0.920 to 0.982
[SPARK-42941] [SC-134707][SS][CONNECT][1/2] StreamingQueryListener - Event Serde in JSON format
[SPARK-43353] Revert “[SC-132734][ES-729763][PYTHON] Migrate remaining session errors into error class”
[SPARK-44100] [SC-134576][ML][CONNECT][PYTHON] Move namespace from pyspark.mlv2 to pyspark.ml.connect
[SPARK-44220] [SC-135484][SQL] Move StringConcat to sql/api
[SPARK-43992] [SC-133645][SQL][PYTHON][CONNECT] Add optional pattern for Catalog.listFunctions
[SPARK-43982] [SC-134529][ML][PYTHON][CONNECT] Implement pipeline estimator for ML on spark connect
[SPARK-43888] [SC-132893][CORE] Relocate Logging to common/utils
[SPARK-42941] Revert “[SC-134707][SS][CONNECT][1/2] StreamingQueryListener - Event Serde in JSON format”
[SPARK-43624] [SC-134557][PS][CONNECT] Add EWM to SparkConnectPlanner.
[SPARK-43981] [SC-134137][PYTHON][ML] Basic saving / loading implementation for ML on spark connect
[SPARK-43205] [SC-133371][SQL] fix SQLQueryTestSuite
[SPARK-43376] Revert “[SC-130433][SQL] Improve reuse subquery with table cache”
[SPARK-44040] [SC-134366][SQL] Fix compute stats when AggregateExec node above QueryStageExec
[SPARK-43919] [SC-133374][SQL] Extract JSON functionality out of Row
[SPARK-42618] [SC-134433][PYTHON][PS] Warning for the pandas-related behavior changes in next major release
[SPARK-43893] [SC-133381][PYTHON][CONNECT] Non-atomic data type support in Arrow-optimized Python UDF
[SPARK-43627] [SC-134290][SPARK-43626][PS][CONNECT] Enable pyspark.pandas.spark.functions.{kurt, skew} in Spark Connect.
[SPARK-43798] [SC-133990][SQL][PYTHON] Support Python user-defined table functions
[SPARK-43616] [SC-133849][PS][CONNECT] Enable pyspark.pandas.spark.functions.mode in Spark Connect
[SPARK-43133] [SC-133728] Scala Client DataStreamWriter Foreach support
[SPARK-43684] [SC-134107][SPARK-43685][SPARK-43686][SPARK-43691][CONNECT][PS] Fix (NullOps|NumOps).(eq|ne) for Spark Connect.
[SPARK-43645] [SC-134151][SPARK-43622][PS][CONNECT] Enable pyspark.pandas.spark.functions.{var, stddev} in Spark Connect
[SPARK-43617] [SC-133893][PS][CONNECT] Enable pyspark.pandas.spark.functions.product in Spark Connect
[SPARK-43610] [SC-133832][CONNECT][PS] Enable InternalFrame.attach_distributed_column in Spark Connect.
[SPARK-43621] [SC-133852][PS][CONNECT] Enable pyspark.pandas.spark.functions.repeat in Spark Connect
[SPARK-43921] [SC-133461][PROTOBUF] Generate Protobuf descriptor files at build time
[SPARK-43613] [SC-133727][PS][CONNECT] Enable pyspark.pandas.spark.functions.covar in Spark Connect
[SPARK-43376] [SC-130433][SQL] Improve reuse subquery with table cache
[SPARK-43612] [SC-132011][CONNECT][PYTHON] Implement SparkSession.addArtifact(s) in Python client
[SPARK-43920] [SC-133611][SQL][CONNECT] Create sql/api module
[SPARK-43097] [SC-133372][ML] New pyspark ML logistic regression estimator implemented on top of distributor
[SPARK-43783] [SC-133240][SPARK-43784][SPARK-43788][ML] Make MLv2 (ML on spark connect) supports pandas >= 2.0
[SPARK-43024] [SC-132716][PYTHON] Upgrade pandas to 2.0.0
[SPARK-43881] [SC-133140][SQL][PYTHON][CONNECT] Add optional pattern for Catalog.listDatabases
[SPARK-39281] [SC-131422][SQL] Speed up Timestamp type inference with legacy format in JSON/CSV data source
[SPARK-43792] [SC-132887][SQL][PYTHON][CONNECT] Add optional pattern for Catalog.listCatalogs
[SPARK-43132] [SC-131623] [SS] [CONNECT] Python Client DataStreamWriter foreach() API
[SPARK-43545] [SC-132378][SQL][PYTHON] Support nested timestamp type
[SPARK-43353] [SC-132734][PYTHON] Migrate remaining session errors into error class
[SPARK-43304] [SC-129969][CONNECT][PYTHON] Migrate NotImplementedError into PySparkNotImplementedError
[SPARK-43516] [SC-132202][ML][PYTHON][CONNECT] Base interfaces of sparkML for spark3.5: estimator/transformer/model/evaluator
[SPARK-43128] Revert “[SC-131628][CONNECT][SS] Make recentProgress and lastProgress return StreamingQueryProgress consistent with the native Scala Api”
[SPARK-43543] [SC-131839][PYTHON] Fix nested MapType behavior in Pandas UDF
[SPARK-38469] [SC-131425][CORE] Use error class in org.apache.spark.network
[SPARK-43309] [SC-129746][SPARK-38461][CORE] Extend INTERNAL_ERROR with categories and add error class INTERNAL_ERROR_BROADCAST
[SPARK-43265] [SC-129653] Move Error framework to a common utils module
[SPARK-43440] [SC-131229][PYTHON][CONNECT] Support registration of an Arrow-optimized Python UDF
[SPARK-43528] [SC-131531][SQL][PYTHON] Support duplicated field names in createDataFrame with pandas DataFrame
[SPARK-43412] [SC-130990][PYTHON][CONNECT] Introduce SQL_ARROW_BATCHED_UDF EvalType for Arrow-optimized Python UDFs
[SPARK-40912] [SC-130986][CORE] Overhead of Exceptions in KryoDeserializationStream
[SPARK-39280] [SC-131206][SQL] Speed up Timestamp type inference with user-provided format in JSON/CSV data source
[SPARK-43473] [SC-131372][PYTHON] Support struct type in createDataFrame from pandas DataFrame
[SPARK-43443] [SC-131024][SQL] Add benchmark for Timestamp type inference when use invalid value
[SPARK-41532] [SC-130523][CONNECT][CLIENT] Add check for operations that involve multiple data frames
[SPARK-43296] [SC-130627][CONNECT][PYTHON] Migrate Spark Connect session errors into error class
[SPARK-43324] [SC-130455][SQL] Handle UPDATE commands for delta-based sources
[SPARK-43347] [SC-130148][PYTHON] Remove Python 3.7 Support
[SPARK-43292] [SC-130525][CORE][CONNECT] Move ExecutorClassLoader to core module and simplify Executor#addReplClassLoaderIfNeeded
[SPARK-43081] [SC-129900] [ML] [CONNECT] Add torch distributor data loader that loads data from spark partition data
[SPARK-43331] [SC-130061][CONNECT] Add Spark Connect SparkSession.interruptAll
[SPARK-43306] [SC-130320][PYTHON] Migrate ValueError from Spark SQL types into error class
[SPARK-43261] [SC-129674][PYTHON] Migrate TypeError from Spark SQL types into error class.
[SPARK-42992] [SC-129465][PYTHON] Introduce PySparkRuntimeError
[SPARK-16484] [SC-129975][SQL] Add support for Datasketches HllSketch
[SPARK-43165] [SC-128823][SQL] Move canWrite to DataTypeUtils
[SPARK-43082] [SC-129112][CONNECT][PYTHON] Arrow-optimized Python UDFs in Spark Connect
[SPARK-43084] [SC-128654] [SS] Add applyInPandasWithState support for spark connect
[SPARK-42657] [SC-128621][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts
[SPARK-43098] [SC-77059][SQL] Fix correctness COUNT bug when scalar subquery has group by clause
[SPARK-42884] [SC-126662][CONNECT] Add Ammonite REPL integration
[SPARK-42994] [SC-128333][ML][CONNECT] PyTorch Distributor support Local Mode
[SPARK-41498] [SC-125343] Revert “Propagate metadata through Union”
[SPARK-42993] [SC-127829][ML][CONNECT] Make PyTorch Distributor compatible with Spark Connect
[SPARK-42683] [LC-75] Automatically rename conflicting metadata columns
[SPARK-42874] [SC-126442][SQL] Enable new golden file test framework for analysis for all input files
[SPARK-42779] [SC-126042][SQL] Allow V2 writes to indicate advisory shuffle partition size
[SPARK-42891] [SC-126458][CONNECT][PYTHON] Implement CoGrouped Map API
[SPARK-42791] [SC-126134][SQL] Create a new golden file test framework for analysis
[SPARK-42615] [SC-124237][CONNECT][PYTHON] Refactor the AnalyzePlan RPC and add session.version
[SPARK-41302] Revert “[ALL TESTS][SC-122423][SQL] Assign name to _LEGACY_ERROR_TEMP_1185”
[SPARK-40770] [SC-122652][PYTHON] Improved error messages for applyInPandas for schema mismatch
[SPARK-40770] Revert “[ALL TESTS][SC-122652][PYTHON] Improved error messages for applyInPandas for schema mismatch”
[SPARK-42398] [SC-123500][SQL] Refine default column value DS v2 interface
[SPARK-40770] [ALL TESTS][SC-122652][PYTHON] Improved error messages for applyInPandas for schema mismatch
[SPARK-40770] Revert “[SC-122652][PYTHON] Improved error messages for applyInPandas for schema mismatch”
[SPARK-40770] [SC-122652][PYTHON] Improved error messages for applyInPandas for schema mismatch
[SPARK-42038] [ALL TESTS] Revert “Revert “[SC-122533][SQL] SPJ: Support partially clustered distribution””
[SPARK-42038] Revert “[SC-122533][SQL] SPJ: Support partially clustered distribution”
[SPARK-42038] [SC-122533][SQL] SPJ: Support partially clustered distribution
[SPARK-40550] [SC-120989][SQL] DataSource V2: Handle DELETE commands for delta-based sources
[SPARK-40770] Revert “[SC-122652][PYTHON] Improved error messages for applyInPandas for schema mismatch”
[SPARK-40770] [SC-122652][PYTHON] Improved error messages for applyInPandas for schema mismatch
[SPARK-41302] Revert “[SC-122423][SQL] Assign name to _LEGACY_ERROR_TEMP_1185”
[SPARK-40550] Revert “[SC-120989][SQL] DataSource V2: Handle DELETE commands for delta-based sources”
[SPARK-42123] Revert “[SC-121453][SQL] Include column default values in DESCRIBE and SHOW CREATE TABLE output”
[SPARK-42146] [SC-121172][CORE] Refactor Utils#setStringField to make maven build pass when sql module use this method
[SPARK-42119] Revert “[SC-121342][SQL] Add built-in table-valued functions inline and inline_outer”
Highlights
Fix aes_decrypt and ln functions in Connect SPARK-45109
Fix inherited named tuples to work in createDataFrame SPARK-44980
CodeGenerator Cache is now classloader-specific [SPARK-44795]
Added SparkListenerConnectOperationStarted.planRequest [SPARK-44861]
Make Streaming Queries work with Connect’s artifact management [SPARK-44794]
ArrowDeserializer works with REPL generated classes [SPARK-44791]
Fixed Arrow-optimized Python UDF on Spark Connect [SPARK-44876]
Scala and Go client support in Spark Connect SPARK-42554 SPARK-43351
PyTorch-based distributed ML Support for Spark Connect SPARK-42471
Structured Streaming support for Spark Connect in Python and Scala SPARK-42938
Pandas API support for the Python Spark Connect Client SPARK-42497
Introduce Arrow Python UDFs SPARK-40307
Support Python user-defined table functions SPARK-43798
Migrate PySpark errors onto error classes SPARK-42986
PySpark Test Framework SPARK-44042
Add support for Datasketches HllSketch SPARK-16484
Built-in SQL Function Improvement SPARK-41231
IDENTIFIER clause SPARK-43205
Add SQL functions into Scala, Python and R API SPARK-43907
Add named argument support for SQL functions SPARK-43922
Avoid unnecessary task rerun on decommissioned executor lost if shuffle data migrated SPARK-41469
Distributed ML <> spark connect SPARK-42471
DeepSpeed Distributor SPARK-44264
Implement changelog checkpointing for RocksDB state store SPARK-43421
Introduce watermark propagation among operators SPARK-42376
Introduce dropDuplicatesWithinWatermark SPARK-42931
RocksDB state store provider memory management enhancements SPARK-43311
Spark Connect
Refactoring of the sql module into sql and sql-api to produce a minimum set of dependencies that can be shared between the Scala Spark Connect client and Spark, and to avoid pulling in all of the Spark transitive dependencies. SPARK-44273
Introducing the Scala client for Spark Connect SPARK-42554
Pandas API support for the Python Spark Connect Client SPARK-42497
PyTorch-based distributed ML Support for Spark Connect SPARK-42471
Structured Streaming support for Spark Connect in Python and Scala SPARK-42938
Initial version of the Go client SPARK-43351
Lots of compatibility improvements between Spark native and the Spark Connect clients across Python and Scala
Improved debuggability and request handling for client applications (asynchronous processing, retries, long-lived queries)
Spark SQL
Features
Add metadata column file block start and length SPARK-42423
Support positional parameters in Scala/Java sql() SPARK-44066
Add named parameter support in parser for function calls SPARK-43922
Support SELECT DEFAULT with ORDER BY, LIMIT, OFFSET for INSERT source relation SPARK-43071
Add SQL grammar for PARTITION BY and ORDER BY clause after TABLE arguments for TVF calls SPARK-44503
Include column default values in DESCRIBE and SHOW CREATE TABLE output SPARK-42123
Add optional pattern for Catalog.listCatalogs SPARK-43792
Add optional pattern for Catalog.listDatabases SPARK-43881
Callback when ready for execution SPARK-44145
Support Insert By Name statement SPARK-42750
Add call_function for Scala API SPARK-44131
Stable derived column aliases SPARK-40822
Support general constant expressions as CREATE/REPLACE TABLE OPTIONS values SPARK-43529
Support subqueries with correlation through INTERSECT/EXCEPT SPARK-36124
IDENTIFIER clause SPARK-43205
ANSI MODE: Conv should return an error if the internal conversion overflows SPARK-42427
Functions
Add support for Datasketches HllSketch SPARK-16484
Support the CBC mode by aes_encrypt()/aes_decrypt() SPARK-43038
Support TABLE argument parser rule for TableValuedFunction SPARK-44200
Implement bitmap functions SPARK-44154
Add the try_aes_decrypt() function SPARK-42701
array_insert should fail with 0 index SPARK-43011
Add to_varchar alias for to_char SPARK-43815
High-order function: array_compact implementation SPARK-41235
Add analyzer support of named arguments for built-in functions SPARK-44059
Add NULLs for INSERTs with user-specified lists of fewer columns than the target table SPARK-42521
Adds support for aes_encrypt IVs and AAD SPARK-43290
DECODE function returns wrong results when passed NULL SPARK-41668
Support udf ‘luhn_check’ SPARK-42191
Support implicit lateral column alias resolution on Aggregate SPARK-41631
Support implicit lateral column alias in queries with Window SPARK-42217
Add 3-args function aliases DATE_ADD and DATE_DIFF SPARK-43492
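For readers unfamiliar with the checksum behind the luhn_check entry above, the following is a plain-Python sketch of the Luhn algorithm. It is an illustration only and not Spark's implementation; the helper name `luhn_valid` is hypothetical, and how the SQL function treats edge cases such as empty or non-digit input is not stated in these notes.

```python
def luhn_valid(digits: str) -> bool:
    """Return True if `digits` passes the Luhn checksum (illustrative helper)."""
    # Assumption for this sketch: empty or non-ASCII-digit input is invalid.
    if not digits or not (digits.isascii() and digits.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9  # same as summing the two digits of the product
        total += value
    return total % 10 == 0


print(luhn_valid("79927398713"))
print(luhn_valid("79927398710"))
```

The first call uses a commonly cited valid test number; the second changes only its final check digit.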
Data Sources
Char/Varchar Support for JDBC Catalog SPARK-42904
Support Get SQL Keywords Dynamically Thru JDBC API and TVF SPARK-43119
DataSource V2: Handle MERGE commands for delta-based sources SPARK-43885
DataSource V2: Handle MERGE commands for group-based sources SPARK-43963
DataSource V2: Handle UPDATE commands for group-based sources SPARK-43975
DataSource V2: Allow representing updates as deletes and inserts SPARK-43775
Allow jdbc dialects to override the query used to create a table SPARK-41516
SPJ: Support partially clustered distribution SPARK-42038
DSv2 allows CTAS/RTAS to reserve schema nullability SPARK-43390
Add spark.sql.files.maxPartitionNum SPARK-44021
Handle UPDATE commands for delta-based sources SPARK-43324
Allow V2 writes to indicate advisory shuffle partition size SPARK-42779
Support lz4raw compression codec for Parquet SPARK-43273
Avro: writing complex unions SPARK-25050
Speed up Timestamp type inference with user-provided format in JSON/CSV data source SPARK-39280
Avro to Support custom decimal type backed by Long SPARK-43901
Avoid shuffle in Storage-Partitioned Join when partition keys mismatch, but join expressions are compatible SPARK-41413
Change binary to unsupported dataType in CSV format SPARK-42237
Allow Avro to convert union type to SQL with field name stable with type SPARK-43333
Speed up Timestamp type inference with legacy format in JSON/CSV data source SPARK-39281
Query Optimization
Subexpression elimination support shortcut expression SPARK-42815
Improve join stats estimation if one side can keep uniqueness SPARK-39851
Introduce the group limit of Window for rank-based filter to optimize top-k computation SPARK-37099
Fix behavior of null IN (empty list) in optimization rules SPARK-44431
Infer and push down window limit through window if partitionSpec is empty SPARK-41171
Remove the outer join if they are all distinct aggregate functions SPARK-42583
Collapse two adjacent windows with the same partition/order in subquery SPARK-42525
Push down limit through Python UDFs SPARK-42115
Optimize the order of filtering predicates SPARK-40045
Code Generation and Query Execution
Runtime filter should support multi-level shuffle join side as filter creation side SPARK-41674
Codegen Support for HiveSimpleUDF SPARK-42052
Codegen Support for HiveGenericUDF SPARK-42051
Codegen Support for build side outer shuffled hash join SPARK-44060
Implement code generation for to_csv function (StructsToCsv) SPARK-42169
Make AQE support InMemoryTableScanExec SPARK-42101
Support left outer join build left or right outer join build right in shuffled hash join SPARK-36612
Respect RequiresDistributionAndOrdering in CTAS/RTAS SPARK-43088
Coalesce buckets in join applied on broadcast join stream side SPARK-43107
Set nullable correctly on coalesced join key in full outer USING join SPARK-44251
Fix IN subquery ListQuery nullability SPARK-43413
Other Notable Changes
Set nullable correctly for keys in USING joins SPARK-43718
Fix COUNT(*) is null bug in correlated scalar subquery SPARK-43156
Dataframe.joinWith outer-join should return a null value for unmatched row SPARK-37829
Automatically rename conflicting metadata columns SPARK-42683
Document the Spark SQL error classes in user-facing documentation SPARK-42706
PySpark
Features
Support positional parameters in Python sql() SPARK-44140
Support parameterized SQL by sql() SPARK-41666
Support Python user-defined table functions SPARK-43797
Support to set Python executable for UDF and pandas function APIs in workers during runtime SPARK-43574
Add DataFrame.offset to PySpark SPARK-43213
Implement dir() in pyspark.sql.dataframe.DataFrame to include columns SPARK-43270
Add option to use large variable width vectors for arrow UDF operations SPARK-39979
Make mapInPandas / mapInArrow support barrier mode execution SPARK-42896
Add JobTag APIs to PySpark SparkContext SPARK-44194
Support for Python UDTF to analyze in Python SPARK-44380
Expose TimestampNTZType in pyspark.sql.types SPARK-43759
Support nested timestamp type SPARK-43545
Support UserDefinedType in createDataFrame from pandas DataFrame and toPandas SPARK-43817 SPARK-43702
Add descriptor binary option to Pyspark Protobuf API SPARK-43799
Accept generics tuple as typing hints of Pandas UDF SPARK-43886
Add array_prepend function SPARK-41233
Add assertDataFrameEqual util function SPARK-44061
Support arrow-optimized Python UDTFs SPARK-43964
Allow custom precision for fp approx equality SPARK-44217
Make assertSchemaEqual API public SPARK-44216
Support fill_value for ps.Series SPARK-42094
Support struct type in createDataFrame from pandas DataFrame SPARK-43473
Other Notable Changes
Add autocomplete support for df[|] in pyspark.sql.dataframe.DataFrame [SPARK-43892]
Deprecate & remove the APIs that will be removed in pandas 2.0 [SPARK-42593]
Make Python the first tab for code examples - Spark SQL, DataFrames and Datasets Guide SPARK-42493
Updating remaining Spark documentation code examples to show Python by default SPARK-42642
Use deduplicated field names when creating Arrow RecordBatch [SPARK-41971]
Support duplicated field names in createDataFrame with pandas DataFrame [SPARK-43528]
Allow columns parameter when creating DataFrame with Series [SPARK-42194]
Core
Schedule mergeFinalize when push merge shuffleMapStage retry but no running tasks SPARK-40082
Introduce PartitionEvaluator for SQL operator execution SPARK-43061
Allow ShuffleDriverComponent to declare if shuffle data is reliably stored SPARK-42689
Add max attempts limitation for stages to avoid potential infinite retry SPARK-42577
Support log level configuration with static Spark conf SPARK-43782
Optimize PercentileHeap SPARK-42528
Add reason argument to TaskScheduler.cancelTasks SPARK-42602
Avoid unnecessary task rerun on decommissioned executor lost if shuffle data migrated SPARK-41469
Fixing accumulator undercount in the case of the retry task with rdd cache SPARK-41497
Use RocksDB for spark.history.store.hybridStore.diskBackend by default SPARK-42277
NonFateSharingCache wrapper for Guava Cache SPARK-43300
Improve the performance of MapOutputTracker.updateMapOutput SPARK-43043
Allowing apps to control whether their metadata gets saved in the db by the External Shuffle Service SPARK-43179
Add SPARK_DRIVER_POD_IP env variable to executor pods SPARK-42769
Mounts the hadoop config map on the executor pod SPARK-43504
Structured Streaming
Add support for tracking pinned blocks memory usage for RocksDB state store SPARK-43120
Add RocksDB state store provider memory management enhancements SPARK-43311
Introduce dropDuplicatesWithinWatermark SPARK-42931
Introduce a new callback onQueryIdle() to StreamingQueryListener SPARK-43183
Add option to skip commit coordinator as part of StreamingWrite API for DSv2 sources/sinks SPARK-42968
Implement Changelog based Checkpointing for RocksDB State Store Provider SPARK-43421
Add support for WRITE_FLUSH_BYTES for RocksDB used in streaming stateful operators SPARK-42792
Add support for setting max_write_buffer_number and write_buffer_size for RocksDB used in streaming SPARK-42819
RocksDB StateStore lock acquisition should happen after getting input iterator from inputRDD SPARK-42566
Introduce watermark propagation among operators SPARK-42376
Cleanup orphan sst and log files in RocksDB checkpoint directory SPARK-42353
Expand QueryTerminatedEvent to contain error class if it exists in exception SPARK-43482
ML
Support Distributed Training of Functions Using Deepspeed SPARK-44264
Base interfaces of sparkML for spark3.5: estimator/transformer/model/evaluator SPARK-43516
Make MLv2 (ML on spark connect) supports pandas >= 2.0 SPARK-43783
Update MLv2 Transformer interfaces SPARK-43516
New pyspark ML logistic regression estimator implemented on top of distributor SPARK-43097
Add Classifier.getNumClasses back SPARK-42526
Write a Deepspeed Distributed Learning Class DeepspeedTorchDistributor SPARK-44264
Basic saving / loading implementation for ML on spark connect SPARK-43981
Improve logistic regression model saving SPARK-43097
Implement pipeline estimator for ML on spark connect SPARK-43982
Implement cross validator estimator SPARK-43983
Implement classification evaluator SPARK-44250
Make PyTorch Distributor compatible with Spark Connect SPARK-42993
UI
Add a Spark UI page for Spark Connect SPARK-44394
Support Heap Histogram column in Executors tab SPARK-44153
Show error message on UI for each failed query SPARK-44367
Display Add/Remove Time of Executors on Executors Tab SPARK-44309
Build and Others
Remove Python 3.7 Support SPARK-43347
Increase PyArrow minimum version to 4.0.0 SPARK-44183
Support R 4.3.1 SPARK-43447 SPARK-44192
Add JobTag APIs to SparkR SparkContext SPARK-44195
Add math functions to SparkR SPARK-44349
Upgrade Parquet to 1.13.1 SPARK-43519
Upgrade ASM to 9.5 SPARK-43537 SPARK-43588
Upgrade rocksdbjni to 8.3.2 SPARK-41569 SPARK-42718 SPARK-43007 SPARK-43436 SPARK-44256
Upgrade Netty to 4.1.93 SPARK-42218 SPARK-42417 SPARK-42487 SPARK-43609 SPARK-44128
Upgrade zstd-jni to 1.5.5-5 SPARK-42409 SPARK-42625 SPARK-43080 SPARK-43294 SPARK-43737 SPARK-43994 SPARK-44465
Upgrade dropwizard metrics 4.2.19 SPARK-42654 SPARK-43738 SPARK-44296
Upgrade gcs-connector to 2.2.14 SPARK-42888 SPARK-43842
Upgrade commons-crypto to 1.2.0 SPARK-42488
Upgrade scala-parser-combinators from 2.1.1 to 2.2.0 SPARK-42489
Upgrade protobuf-java to 3.23.4 SPARK-41711 SPARK-42490 SPARK-42798 SPARK-43899 SPARK-44382
Upgrade commons-codec to 1.16.0 SPARK-44151
Upgrade Apache Kafka to 3.4.1 SPARK-42396 SPARK-44181
Upgrade RoaringBitmap to 0.9.45 SPARK-42385 SPARK-43495 SPARK-44221
Update ORC to 1.9.0 SPARK-42820 SPARK-44053 SPARK-44231
Upgrade to Avro 1.11.2 SPARK-44277
Upgrade commons-compress to 1.23.0 SPARK-43102
Upgrade joda-time from 2.12.2 to 2.12.5 SPARK-43008
Upgrade snappy-java to 1.1.10.3 SPARK-42242 SPARK-43758 SPARK-44070 SPARK-44415 SPARK-44513
Upgrade mysql-connector-java from 8.0.31 to 8.0.32 SPARK-42717
Upgrade Apache Arrow to 12.0.1 SPARK-42161 SPARK-43446 SPARK-44094
Upgrade commons-io to 2.12.0 SPARK-43739
Upgrade Apache commons-io to 2.13.0 SPARK-43739 SPARK-44028
Upgrade FasterXML jackson to 2.15.2 SPARK-42354 SPARK-43774 SPARK-43904
Upgrade log4j2 to 2.20.0 SPARK-42536
Upgrade slf4j to 2.0.7 SPARK-42871
Upgrade numpy and pandas in the release Dockerfile SPARK-42524
Upgrade Jersey to 2.40 SPARK-44316
Upgrade H2 from 2.1.214 to 2.2.220 SPARK-44393
Upgrade optionator to ^0.9.3 SPARK-44279
Upgrade bcprov-jdk15on and bcpkix-jdk15on to 1.70 SPARK-44441
Upgrade mlflow to 2.3.1 SPARK-43344
Upgrade Tink to 1.9.0 SPARK-42780
Upgrade silencer to 1.7.13 SPARK-41787 SPARK-44031
Upgrade Ammonite to 2.5.9 SPARK-44041
Upgrade Scala to 2.12.18 SPARK-43832
Upgrade org.scalatestplus:selenium-4-4 to org.scalatestplus:selenium-4-7 SPARK-41587
Upgrade minimatch to 3.1.2 SPARK-41634
Upgrade sbt-assembly from 2.0.0 to 2.1.0 SPARK-41704
Update maven-checkstyle-plugin from 3.1.2 to 3.2.0 SPARK-41714
Upgrade dev.ludovic.netlib to 3.0.3 SPARK-41750
Upgrade hive-storage-api to 2.8.1 SPARK-41798
Upgrade Apache httpcore to 4.4.16 SPARK-41802
Upgrade jetty to 9.4.52.v20230823 SPARK-45052
Upgrade compress-lzf to 1.1.2 SPARK-42274
Databricks ODBC/JDBC driver support
Databricks supports ODBC/JDBC drivers released in the past 2 years. Please download the recently released drivers and upgrade (download ODBC, download JDBC).
System environment
Operating System: Ubuntu 22.04.3 LTS
Java: Zulu 8.70.0.23-CA-linux64
Scala: 2.12.15
Python: 3.10.12
R: 4.3.1
Delta Lake: 2.4.0
Installed Python libraries
Library | Version | Library | Version | Library | Version |
---|---|---|---|---|---|
anyio | 3.5.0 | argon2-cffi | 21.3.0 | argon2-cffi-bindings | 21.2.0 |
asttokens | 2.0.5 | attrs | 22.1.0 | backcall | 0.2.0 |
beautifulsoup4 | 4.11.1 | black | 22.6.0 | bleach | 4.1.0 |
blinker | 1.4 | boto3 | 1.24.28 | botocore | 1.27.96 |
certifi | 2022.12.7 | cffi | 1.15.1 | chardet | 4.0.0 |
charset-normalizer | 2.0.4 | click | 8.0.4 | comm | 0.1.2 |
contourpy | 1.0.5 | cryptography | 39.0.1 | cycler | 0.11.0 |
Cython | 0.29.32 | databricks-sdk | 0.1.6 | dbus-python | 1.2.18 |
debugpy | 1.6.7 | decorator | 5.1.1 | defusedxml | 0.7.1 |
distlib | 0.3.7 | docstring-to-markdown | 0.11 | entrypoints | 0.4 |
executing | 0.8.3 | facets-overview | 1.1.1 | fastjsonschema | 2.18.0 |
filelock | 3.12.2 | fonttools | 4.25.0 | GCC runtime library | 1.10.0 |
googleapis-common-protos | 1.60.0 | grpcio | 1.48.2 | grpcio-status | 1.48.1 |
httplib2 | 0.20.2 | idna | 3.4 | importlib-metadata | 4.6.4 |
ipykernel | 6.25.0 | ipython | 8.14.0 | ipython-genutils | 0.2.0 |
ipywidgets | 7.7.2 | jedi | 0.18.1 | jeepney | 0.7.1 |
Jinja2 | 3.1.2 | jmespath | 0.10.0 | joblib | 1.2.0 |
jsonschema | 4.17.3 | jupyter-client | 7.3.4 | jupyter-server | 1.23.4 |
jupyter_core | 5.2.0 | jupyterlab-pygments | 0.1.2 | jupyterlab-widgets | 1.0.0 |
keyring | 23.5.0 | kiwisolver | 1.4.4 | launchpadlib | 1.10.16 |
lazr.restfulclient | 0.14.4 | lazr.uri | 1.0.6 | lxml | 4.9.1 |
MarkupSafe | 2.1.1 | matplotlib | 3.7.0 | matplotlib-inline | 0.1.6 |
mccabe | 0.7.0 | mistune | 0.8.4 | more-itertools | 8.10.0 |
mypy-extensions | 0.4.3 | nbclassic | 0.5.2 | nbclient | 0.5.13 |
nbconvert | 6.5.4 | nbformat | 5.7.0 | nest-asyncio | 1.5.6 |
nodeenv | 1.8.0 | notebook | 6.5.2 | notebook_shim | 0.2.2 |
numpy | 1.23.5 | oauthlib | 3.2.0 | packaging | 22.0 |
pandas | 1.5.3 | pandocfilters | 1.5.0 | parso | 0.8.3 |
pathspec | 0.10.3 | patsy | 0.5.3 | pexpect | 4.8.0 |
pickleshare | 0.7.5 | Pillow | 9.4.0 | pip | 22.3.1 |
platformdirs | 2.5.2 | plotly | 5.9.0 | pluggy | 1.0.0 |
prometheus-client | 0.14.1 | prompt-toolkit | 3.0.36 | protobuf | 4.24.0 |
psutil | 5.9.0 | psycopg2 | 2.9.3 | ptyprocess | 0.7.0 |
pure-eval | 0.2.2 | pyarrow | 8.0.0 | pycparser | 2.21 |
pydantic | 1.10.6 | pyflakes | 3.0.1 | Pygments | 2.11.2 |
PyGObject | 3.42.1 | PyJWT | 2.3.0 | pyodbc | 4.0.32 |
pyparsing | 3.0.9 | pyright | 1.1.294 | pyrsistent | 0.18.0 |
python-dateutil | 2.8.2 | python-lsp-jsonrpc | 1.0.0 | python-lsp-server | 1.7.1 |
pytoolconfig | 1.2.5 | pytz | 2022.7 | pyzmq | 23.2.0 |
requests | 2.28.1 | rope | 1.7.0 | s3transfer | 0.6.1 |
scikit-learn | 1.1.1 | seaborn | 0.12.2 | SecretStorage | 3.3.1 |
Send2Trash | 1.8.0 | setuptools | 65.6.3 | six | 1.16.0 |
sniffio | 1.2.0 | soupsieve | 2.3.2.post1 | ssh-import-id | 5.11 |
stack-data | 0.2.0 | statsmodels | 0.13.5 | tenacity | 8.1.0 |
terminado | 0.17.1 | threadpoolctl | 2.2.0 | tinycss2 | 1.2.1 |
tokenize-rt | 4.2.1 | tomli | 2.0.1 | tornado | 6.1 |
traitlets | 5.7.1 | typing_extensions | 4.4.0 | ujson | 5.4.0 |
unattended-upgrades | 0.1 | urllib3 | 1.26.14 | virtualenv | 20.16.7 |
wadllib | 1.3.6 | wcwidth | 0.2.5 | webencodings | 0.5.1 |
websocket-client | 0.58.0 | whatthepatch | 1.0.2 | wheel | 0.38.4 |
widgetsnbextension | 3.6.1 | yapf | 0.31.0 | zipp | 1.0.0 |
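To compare a running environment against the versions listed above, a small standard-library check can help. The following is a minimal sketch, not a Databricks utility: the helper name `find_version_mismatches` is hypothetical, and the `expected` dictionary holds only three rows copied from the table as a sample.

```python
from importlib import metadata


def find_version_mismatches(expected):
    """Map each package whose installed version differs to (expected, installed).

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


# Sample rows copied from the table above; extend as needed.
expected = {"numpy": "1.23.5", "pandas": "1.5.3", "pyarrow": "8.0.0"}
for name, (wanted, installed) in find_version_mismatches(expected).items():
    print(f"{name}: table lists {wanted}, environment has {installed}")
```

The comparison is an exact string match, so it reports any difference, including later patch releases installed on top of the runtime.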
Installed R libraries
R libraries are installed from the Posit Package Manager CRAN snapshot on 2023-07-13.
Library | Version | Library | Version | Library | Version |
---|---|---|---|---|---|
arrow | 12.0.1 | askpass | 1.1 | assertthat | 0.2.1 |
backports | 1.4.1 | base | 4.3.1 | base64enc | 0.1-3 |
bit | 4.0.5 | bit64 | 4.0.5 | blob | 1.2.4 |
boot | 1.3-28 | brew | 1.0-8 | brio | 1.1.3 |
broom | 1.0.5 | bslib | 0.5.0 | cachem | 1.0.8 |
callr | 3.7.3 | caret | 6.0-94 | cellranger | 1.1.0 |
chron | 2.3-61 | class | 7.3-22 | cli | 3.6.1 |
clipr | 0.8.0 | clock | 0.7.0 | cluster | 2.1.4 |
codetools | 0.2-19 | colorspace | 2.1-0 | commonmark | 1.9.0 |
compiler | 4.3.1 | config | 0.3.1 | conflicted | 1.2.0 |
cpp11 | 0.4.4 | crayon | 1.5.2 | credentials | 1.3.2 |
curl | 5.0.1 | data.table | 1.14.8 | datasets | 4.3.1 |
DBI | 1.1.3 | dbplyr | 2.3.3 | desc | 1.4.2 |
devtools | 2.4.5 | diagram | 1.6.5 | diffobj | 0.3.5 |
digest | 0.6.33 | downlit | 0.4.3 | dplyr | 1.1.2 |
dtplyr | 1.3.1 | e1071 | 1.7-13 | ellipsis | 0.3.2 |
evaluate | 0.21 | fansi | 1.0.4 | farver | 2.1.1 |
fastmap | 1.1.1 | fontawesome | 0.5.1 | forcats | 1.0.0 |
foreach | 1.5.2 | foreign | 0.8-82 | forge | 0.2.0 |
fs | 1.6.2 | future | 1.33.0 | future.apply | 1.11.0 |
gargle | 1.5.1 | generics | 0.1.3 | gert | 1.9.2 |
ggplot2 | 3.4.2 | gh | 1.4.0 | gitcreds | 0.1.2 |
glmnet | 4.1-7 | globals | 0.16.2 | glue | 1.6.2 |
googledrive | 2.1.1 | googlesheets4 | 1.1.1 | gower | 1.0.1 |
graphics | 4.3.1 | grDevices | 4.3.1 | grid | 4.3.1 |
gridExtra | 2.3 | gsubfn | 0.7 | gtable | 0.3.3 |
hardhat | 1.3.0 | haven | 2.5.3 | highr | 0.10 |
hms | 1.1.3 | htmltools | 0.5.5 | htmlwidgets | 1.6.2 |
httpuv | 1.6.11 | httr | 1.4.6 | httr2 | 0.2.3 |
ids | 1.0.1 | ini | 0.3.1 | ipred | 0.9-14 |
isoband | 0.2.7 | iterators | 1.0.14 | jquerylib | 0.1.4 |
jsonlite | 1.8.7 | KernSmooth | 2.23-21 | knitr | 1.43 |
labeling | 0.4.2 | later | 1.3.1 | lattice | 0.21-8 |
lava | 1.7.2.1 | lifecycle | 1.0.3 | listenv | 0.9.0 |
lubridate | 1.9.2 | magrittr | 2.0.3 | markdown | 1.7 |
MASS | 7.3-60 | Matrix | 1.5-4.1 | memoise | 2.0.1 |
methods | 4.3.1 | mgcv | 1.8-42 | mime | 0.12 |
miniUI | 0.1.1.1 | ModelMetrics | 1.2.2.2 | modelr | 0.1.11 |
munsell | 0.5.0 | nlme | 3.1-162 | nnet | 7.3-19 |
numDeriv | 2016.8-1.1 | openssl | 2.0.6 | parallel | 4.3.1 |
parallelly | 1.36.0 | pillar | 1.9.0 | pkgbuild | 1.4.2 |
pkgconfig | 2.0.3 | pkgdown | 2.0.7 | pkgload | 1.3.2.1 |
plogr | 0.2.0 | plyr | 1.8.8 | praise | 1.0.0 |
prettyunits | 1.1.1 | pROC | 1.18.4 | processx | 3.8.2 |
prodlim | 2023.03.31 | profvis | 0.3.8 | progress | 1.2.2 |
progressr | 0.13.0 | promises | 1.2.0.1 | proto | 1.0.0 |
proxy | 0.4-27 | ps | 1.7.5 | purrr | 1.0.1 |
r2d3 | 0.2.6 | R6 | 2.5.1 | ragg | 1.2.5 |
randomForest | 4.7-1.1 | rappdirs | 0.3.3 | rcmdcheck | 1.4.0 |
RColorBrewer | 1.1-3 | Rcpp | 1.0.11 | RcppEigen | 0.3.3.9.3 |
readr | 2.1.4 | readxl | 1.4.3 | recipes | 1.0.6 |
rematch | 1.0.1 | rematch2 | 2.1.2 | remotes | 2.4.2 |
reprex | 2.0.2 | reshape2 | 1.4.4 | rlang | 1.1.1 |
rmarkdown | 2.23 | RODBC | 1.3-20 | roxygen2 | 7.2.3 |
rpart | 4.1.19 | rprojroot | 2.0.3 | Rserve | 1.8-11 |
RSQLite | 2.3.1 | rstudioapi | 0.15.0 | rversions | 2.1.2 |
rvest | 1.0.3 | sass | 0.4.6 | scales | 1.2.1 |
selectr | 0.4-2 | sessioninfo | 1.2.2 | shape | 1.4.6 |
shiny | 1.7.4.1 | sourcetools | 0.1.7-1 | sparklyr | 1.8.1 |
SparkR | 3.5.0 | spatial | 7.3-15 | splines | 4.3.1 |
sqldf | 0.4-11 | SQUAREM | 2021.1 | stats | 4.3.1 |
stats4 | 4.3.1 | stringi | 1.7.12 | stringr | 1.5.0 |
survival | 3.5-5 | sys | 3.4.2 | systemfonts | 1.0.4 |
tcltk | 4.3.1 | testthat | 3.1.10 | textshaping | 0.3.6 |
tibble | 3.2.1 | tidyr | 1.3.0 | tidyselect | 1.2.0 |
tidyverse | 2.0.0 | timechange | 0.2.0 | timeDate | 4022.108 |
tinytex | 0.45 | tools | 4.3.1 | tzdb | 0.4.0 |
urlchecker | 1.0.1 | usethis | 2.2.2 | utf8 | 1.2.3 |
utils | 4.3.1 | uuid | 1.1-0 | vctrs | 0.6.3 |
viridisLite | 0.4.2 | vroom | 1.6.3 | waldo | 0.5.1 |
whisker | 0.4.1 | withr | 2.5.0 | xfun | 0.39 |
xml2 | 1.3.5 | xopen | 1.0.0 | xtable | 1.8-4 |
yaml | 2.3.7 | zip | 2.3.0 | | |
Installed Java and Scala libraries (Scala 2.12 cluster version)
Group ID | Artifact ID | Version |
---|---|---|
antlr | antlr | 2.7.7 |
com.amazonaws | amazon-kinesis-client | 1.12.0 |
com.amazonaws | aws-java-sdk-autoscaling | 1.12.390 |
com.amazonaws | aws-java-sdk-cloudformation | 1.12.390 |
com.amazonaws | aws-java-sdk-cloudfront | 1.12.390 |
com.amazonaws | aws-java-sdk-cloudhsm | 1.12.390 |
com.amazonaws | aws-java-sdk-cloudsearch | 1.12.390 |
com.amazonaws | aws-java-sdk-cloudtrail | 1.12.390 |
com.amazonaws | aws-java-sdk-cloudwatch | 1.12.390 |
com.amazonaws | aws-java-sdk-cloudwatchmetrics | 1.12.390 |
com.amazonaws | aws-java-sdk-codedeploy | 1.12.390 |
com.amazonaws | aws-java-sdk-cognitoidentity | 1.12.390 |
com.amazonaws | aws-java-sdk-cognitosync | 1.12.390 |
com.amazonaws | aws-java-sdk-config | 1.12.390 |
com.amazonaws | aws-java-sdk-core | 1.12.390 |
com.amazonaws | aws-java-sdk-datapipeline | 1.12.390 |
com.amazonaws | aws-java-sdk-directconnect | 1.12.390 |
com.amazonaws | aws-java-sdk-directory | 1.12.390 |
com.amazonaws | aws-java-sdk-dynamodb | 1.12.390 |
com.amazonaws | aws-java-sdk-ec2 | 1.12.390 |
com.amazonaws | aws-java-sdk-ecs | 1.12.390 |
com.amazonaws | aws-java-sdk-efs | 1.12.390 |
com.amazonaws | aws-java-sdk-elasticache | 1.12.390 |
com.amazonaws | aws-java-sdk-elasticbeanstalk | 1.12.390 |
com.amazonaws | aws-java-sdk-elasticloadbalancing | 1.12.390 |
com.amazonaws | aws-java-sdk-elastictranscoder | 1.12.390 |
com.amazonaws | aws-java-sdk-emr | 1.12.390 |
com.amazonaws | aws-java-sdk-glacier | 1.12.390 |
com.amazonaws | aws-java-sdk-glue | 1.12.390 |
com.amazonaws | aws-java-sdk-iam | 1.12.390 |
com.amazonaws | aws-java-sdk-importexport | 1.12.390 |
com.amazonaws | aws-java-sdk-kinesis | 1.12.390 |
com.amazonaws | aws-java-sdk-kms | 1.12.390 |
com.amazonaws | aws-java-sdk-lambda | 1.12.390 |
com.amazonaws | aws-java-sdk-logs | 1.12.390 |
com.amazonaws | aws-java-sdk-machinelearning | 1.12.390 |
com.amazonaws | aws-java-sdk-opsworks | 1.12.390 |
com.amazonaws | aws-java-sdk-rds | 1.12.390 |
com.amazonaws | aws-java-sdk-redshift | 1.12.390 |
com.amazonaws | aws-java-sdk-route53 | 1.12.390 |
com.amazonaws | aws-java-sdk-s3 | 1.12.390 |
com.amazonaws | aws-java-sdk-ses | 1.12.390 |
com.amazonaws | aws-java-sdk-simpledb | 1.12.390 |
com.amazonaws | aws-java-sdk-simpleworkflow | 1.12.390 |
com.amazonaws | aws-java-sdk-sns | 1.12.390 |
com.amazonaws | aws-java-sdk-sqs | 1.12.390 |
com.amazonaws | aws-java-sdk-ssm | 1.12.390 |
com.amazonaws | aws-java-sdk-storagegateway | 1.12.390 |
com.amazonaws | aws-java-sdk-sts | 1.12.390 |
com.amazonaws | aws-java-sdk-support | 1.12.390 |
com.amazonaws | aws-java-sdk-swf-libraries | 1.11.22 |
com.amazonaws | aws-java-sdk-workspaces | 1.12.390 |
com.amazonaws | jmespath-java | 1.12.390 |
com.clearspring.analytics | stream | 2.9.6 |
com.databricks | Rserve | 1.8-3 |
com.databricks | databricks-sdk-java | 0.2.0 |
com.databricks | jets3t | 0.7.1-0 |
com.databricks.scalapb | compilerplugin_2.12 | 0.4.15-10 |
com.databricks.scalapb | scalapb-runtime_2.12 | 0.4.15-10 |
com.esotericsoftware | kryo-shaded | 4.0.2 |
com.esotericsoftware | minlog | 1.3.0 |
com.fasterxml | classmate | 1.3.4 |
com.fasterxml.jackson.core | jackson-annotations | 2.15.2 |
com.fasterxml.jackson.core | jackson-core | 2.15.2 |
com.fasterxml.jackson.core | jackson-databind | 2.15.2 |
com.fasterxml.jackson.dataformat | jackson-dataformat-cbor | 2.15.2 |
com.fasterxml.jackson.datatype | jackson-datatype-joda | 2.15.2 |
com.fasterxml.jackson.datatype | jackson-datatype-jsr310 | 2.15.1 |
com.fasterxml.jackson.module | jackson-module-paranamer | 2.15.2 |
com.fasterxml.jackson.module | jackson-module-scala_2.12 | 2.15.2 |
com.github.ben-manes.caffeine | caffeine | 2.9.3 |
com.github.fommil | jniloader | 1.1 |
com.github.fommil.netlib | native_ref-java | 1.1 |
com.github.fommil.netlib | native_ref-java | 1.1-natives |
com.github.fommil.netlib | native_system-java | 1.1 |
com.github.fommil.netlib | native_system-java | 1.1-natives |
com.github.fommil.netlib | netlib-native_ref-linux-x86_64 | 1.1-natives |
com.github.fommil.netlib | netlib-native_system-linux-x86_64 | 1.1-natives |
com.github.luben | zstd-jni | 1.5.5-4 |
com.github.wendykierp | JTransforms | 3.1 |
com.google.code.findbugs | jsr305 | 3.0.0 |
com.google.code.gson | gson | 2.10.1 |
com.google.crypto.tink | tink | 1.9.0 |
com.google.errorprone | error_prone_annotations | 2.10.0 |
com.google.flatbuffers | flatbuffers-java | 1.12.0 |
com.google.guava | guava | 15.0 |
com.google.protobuf | protobuf-java | 2.6.1 |
com.helger | profiler | 1.1.1 |
com.jcraft | jsch | 0.1.55 |
com.jolbox | bonecp | 0.8.0.RELEASE |
com.lihaoyi | sourcecode_2.12 | 0.1.9 |
com.microsoft.azure | azure-data-lake-store-sdk | 2.3.9 |
com.microsoft.sqlserver | mssql-jdbc | 11.2.2.jre8 |
com.ning | compress-lzf | 1.1.2 |
com.sun.mail | javax.mail | 1.5.2 |
com.sun.xml.bind | jaxb-core | 2.2.11 |
com.sun.xml.bind | jaxb-impl | 2.2.11 |
com.tdunning | json | 1.8 |
com.thoughtworks.paranamer | paranamer | 2.8 |
com.trueaccord.lenses | lenses_2.12 | 0.4.12 |
com.twitter | chill-java | 0.10.0 |
com.twitter | chill_2.12 | 0.10.0 |
com.twitter | util-app_2.12 | 7.1.0 |
com.twitter | util-core_2.12 | 7.1.0 |
com.twitter | util-function_2.12 | 7.1.0 |
com.twitter | util-jvm_2.12 | 7.1.0 |
com.twitter | util-lint_2.12 | 7.1.0 |
com.twitter | util-registry_2.12 | 7.1.0 |
com.twitter | util-stats_2.12 | 7.1.0 |
com.typesafe | config | 1.2.1 |
com.typesafe.scala-logging | scala-logging_2.12 | 3.7.2 |
com.uber | h3 | 3.7.0 |
com.univocity | univocity-parsers | 2.9.1 |
com.zaxxer | HikariCP | 4.0.3 |
commons-cli | commons-cli | 1.5.0 |
commons-codec | commons-codec | 1.16.0 |
commons-collections | commons-collections | 3.2.2 |
commons-dbcp | commons-dbcp | 1.4 |
commons-fileupload | commons-fileupload | 1.5 |
commons-httpclient | commons-httpclient | 3.1 |
commons-io | commons-io | 2.13.0 |
commons-lang | commons-lang | 2.6 |
commons-logging | commons-logging | 1.1.3 |
commons-pool | commons-pool | 1.5.4 |
dev.ludovic.netlib | arpack | 3.0.3 |
dev.ludovic.netlib | blas | 3.0.3 |
dev.ludovic.netlib | lapack | 3.0.3 |
info.ganglia.gmetric4j | gmetric4j | 1.0.10 |
io.airlift | aircompressor | 0.24 |
io.delta | delta-sharing-spark_2.12 | 0.7.1 |
io.dropwizard.metrics | metrics-annotation | 4.2.19 |
io.dropwizard.metrics | metrics-core | 4.2.19 |
io.dropwizard.metrics | metrics-graphite | 4.2.19 |
io.dropwizard.metrics | metrics-healthchecks | 4.2.19 |
io.dropwizard.metrics | metrics-jetty9 | 4.2.19 |
io.dropwizard.metrics | metrics-jmx | 4.2.19 |
io.dropwizard.metrics | metrics-json | 4.2.19 |
io.dropwizard.metrics | metrics-jvm | 4.2.19 |
io.dropwizard.metrics | metrics-servlets | 4.2.19 |
io.netty | netty-all | 4.1.93.Final |
io.netty | netty-buffer | 4.1.93.Final |
io.netty | netty-codec | 4.1.93.Final |
io.netty | netty-codec-http | 4.1.93.Final |
io.netty | netty-codec-http2 | 4.1.93.Final |
io.netty | netty-codec-socks | 4.1.93.Final |
io.netty | netty-common | 4.1.93.Final |
io.netty | netty-handler | 4.1.93.Final |
io.netty | netty-handler-proxy | 4.1.93.Final |
io.netty | netty-resolver | 4.1.93.Final |
io.netty | netty-transport | 4.1.93.Final |
io.netty | netty-transport-classes-epoll | 4.1.93.Final |
io.netty | netty-transport-classes-kqueue | 4.1.93.Final |
io.netty | netty-transport-native-epoll | 4.1.93.Final |
io.netty | netty-transport-native-epoll | 4.1.93.Final-linux-aarch_64 |
io.netty | netty-transport-native-epoll | 4.1.93.Final-linux-x86_64 |
io.netty | netty-transport-native-kqueue | 4.1.93.Final-osx-aarch_64 |
io.netty | netty-transport-native-kqueue | 4.1.93.Final-osx-x86_64 |
io.netty |
netty-transport-native-unix-common |
4.1.93.Final |
io.prometheus |
simpleclient |
0.7.0 |
io.prometheus |
simpleclient_common |
0.7.0 |
io.prometheus |
simpleclient_dropwizard |
0.7.0 |
io.prometheus |
simpleclient_pushgateway |
0.7.0 |
io.prometheus |
simpleclient_servlet |
0.7.0 |
io.prometheus.jmx |
collector |
0.12.0 |
jakarta.annotation |
jakarta.annotation-api |
1.3.5 |
jakarta.servlet |
jakarta.servlet-api |
4.0.3 |
jakarta.validation |
jakarta.validation-api |
2.0.2 |
jakarta.ws.rs |
jakarta.ws.rs-api |
2.1.6 |
javax.activation |
activation |
1.1.1 |
javax.el |
javax.el-api |
2.2.4 |
javax.jdo |
jdo-api |
3.0.1 |
javax.transaction |
jta |
1.1 |
javax.transaction |
transaction-api |
1.1 |
javax.xml.bind |
jaxb-api |
2.2.11 |
javolution |
javolution |
5.5.1 |
jline |
jline |
2.14.6 |
joda-time |
joda-time |
2.12.1 |
net.java.dev.jna |
jna |
5.8.0 |
net.razorvine |
pickle |
1.3 |
net.sf.jpam |
jpam |
1.1 |
net.sf.opencsv |
opencsv |
2.3 |
net.sf.supercsv |
super-csv |
2.2.0 |
net.snowflake |
snowflake-ingest-sdk |
0.9.6 |
net.snowflake |
snowflake-jdbc |
3.13.29 |
net.sourceforge.f2j |
arpack_combined_all |
0.1 |
org.acplt.remotetea |
remotetea-oncrpc |
1.1.2 |
org.antlr |
ST4 |
4.0.4 |
org.antlr |
antlr-runtime |
3.5.2 |
org.antlr |
antlr4-runtime |
4.9.3 |
org.antlr |
stringtemplate |
3.2.1 |
org.apache.ant |
ant |
1.9.16 |
org.apache.ant |
ant-jsch |
1.9.16 |
org.apache.ant |
ant-launcher |
1.9.16 |
org.apache.arrow |
arrow-format |
12.0.1 |
org.apache.arrow |
arrow-memory-core |
12.0.1 |
org.apache.arrow |
arrow-memory-netty |
12.0.1 |
org.apache.arrow |
arrow-vector |
12.0.1 |
org.apache.avro |
avro |
1.11.2 |
org.apache.avro |
avro-ipc |
1.11.2 |
org.apache.avro |
avro-mapred |
1.11.2 |
org.apache.commons |
commons-collections4 |
4.4 |
org.apache.commons |
commons-compress |
1.23.0 |
org.apache.commons |
commons-crypto |
1.1.0 |
org.apache.commons |
commons-lang3 |
3.12.0 |
org.apache.commons |
commons-math3 |
3.6.1 |
org.apache.commons |
commons-text |
1.10.0 |
org.apache.curator |
curator-client |
2.13.0 |
org.apache.curator |
curator-framework |
2.13.0 |
org.apache.curator |
curator-recipes |
2.13.0 |
org.apache.datasketches |
datasketches-java |
3.1.0 |
org.apache.datasketches |
datasketches-memory |
2.0.0 |
org.apache.derby |
derby |
10.14.2.0 |
org.apache.hadoop |
hadoop-client-runtime |
3.3.6 |
org.apache.hive |
hive-beeline |
2.3.9 |
org.apache.hive |
hive-cli |
2.3.9 |
org.apache.hive |
hive-jdbc |
2.3.9 |
org.apache.hive |
hive-llap-client |
2.3.9 |
org.apache.hive |
hive-llap-common |
2.3.9 |
org.apache.hive |
hive-serde |
2.3.9 |
org.apache.hive |
hive-shims |
2.3.9 |
org.apache.hive |
hive-storage-api |
2.8.1 |
org.apache.hive.shims |
hive-shims-0.23 |
2.3.9 |
org.apache.hive.shims |
hive-shims-common |
2.3.9 |
org.apache.hive.shims |
hive-shims-scheduler |
2.3.9 |
org.apache.httpcomponents |
httpclient |
4.5.14 |
org.apache.httpcomponents |
httpcore |
4.4.16 |
org.apache.ivy |
ivy |
2.5.1 |
org.apache.logging.log4j |
log4j-1.2-api |
2.20.0 |
org.apache.logging.log4j |
log4j-api |
2.20.0 |
org.apache.logging.log4j |
log4j-core |
2.20.0 |
org.apache.logging.log4j |
log4j-slf4j2-impl |
2.20.0 |
org.apache.mesos |
mesos |
1.11.0-shaded-protobuf |
org.apache.orc |
orc-core |
1.9.0-shaded-protobuf |
org.apache.orc |
orc-mapreduce |
1.9.0-shaded-protobuf |
org.apache.orc |
orc-shims |
1.9.0 |
org.apache.thrift |
libfb303 |
0.9.3 |
org.apache.thrift |
libthrift |
0.12.0 |
org.apache.xbean |
xbean-asm9-shaded |
4.23 |
org.apache.yetus |
audience-annotations |
0.13.0 |
org.apache.zookeeper |
zookeeper |
3.6.3 |
org.apache.zookeeper |
zookeeper-jute |
3.6.3 |
org.checkerframework |
checker-qual |
3.31.0 |
org.codehaus.jackson |
jackson-core-asl |
1.9.13 |
org.codehaus.jackson |
jackson-mapper-asl |
1.9.13 |
org.codehaus.janino |
commons-compiler |
3.0.16 |
org.codehaus.janino |
janino |
3.0.16 |
org.datanucleus |
datanucleus-api-jdo |
4.2.4 |
org.datanucleus |
datanucleus-core |
4.1.17 |
org.datanucleus |
datanucleus-rdbms |
4.1.19 |
org.datanucleus |
javax.jdo |
3.2.0-m3 |
org.eclipse.jetty |
jetty-client |
9.4.51.v20230217 |
org.eclipse.jetty |
jetty-continuation |
9.4.51.v20230217 |
org.eclipse.jetty |
jetty-http |
9.4.51.v20230217 |
org.eclipse.jetty |
jetty-io |
9.4.51.v20230217 |
org.eclipse.jetty |
jetty-jndi |
9.4.51.v20230217 |
org.eclipse.jetty |
jetty-plus |
9.4.51.v20230217 |
org.eclipse.jetty |
jetty-proxy |
9.4.51.v20230217 |
org.eclipse.jetty |
jetty-security |
9.4.51.v20230217 |
org.eclipse.jetty |
jetty-server |
9.4.51.v20230217 |
org.eclipse.jetty |
jetty-servlet |
9.4.51.v20230217 |
org.eclipse.jetty |
jetty-servlets |
9.4.51.v20230217 |
org.eclipse.jetty |
jetty-util |
9.4.51.v20230217 |
org.eclipse.jetty |
jetty-util-ajax |
9.4.51.v20230217 |
org.eclipse.jetty |
jetty-webapp |
9.4.51.v20230217 |
org.eclipse.jetty |
jetty-xml |
9.4.51.v20230217 |
org.eclipse.jetty.websocket |
websocket-api |
9.4.51.v20230217 |
org.eclipse.jetty.websocket |
websocket-client |
9.4.51.v20230217 |
org.eclipse.jetty.websocket |
websocket-common |
9.4.51.v20230217 |
org.eclipse.jetty.websocket |
websocket-server |
9.4.51.v20230217 |
org.eclipse.jetty.websocket |
websocket-servlet |
9.4.51.v20230217 |
org.fusesource.leveldbjni |
leveldbjni-all |
1.8 |
org.glassfish.hk2 |
hk2-api |
2.6.1 |
org.glassfish.hk2 |
hk2-locator |
2.6.1 |
org.glassfish.hk2 |
hk2-utils |
2.6.1 |
org.glassfish.hk2 |
osgi-resource-locator |
1.0.3 |
org.glassfish.hk2.external |
aopalliance-repackaged |
2.6.1 |
org.glassfish.hk2.external |
jakarta.inject |
2.6.1 |
org.glassfish.jersey.containers |
jersey-container-servlet |
2.40 |
org.glassfish.jersey.containers |
jersey-container-servlet-core |
2.40 |
org.glassfish.jersey.core |
jersey-client |
2.40 |
org.glassfish.jersey.core |
jersey-common |
2.40 |
org.glassfish.jersey.core |
jersey-server |
2.40 |
org.glassfish.jersey.inject |
jersey-hk2 |
2.40 |
org.hibernate.validator |
hibernate-validator |
6.1.7.Final |
org.ini4j |
ini4j |
0.5.4 |
org.javassist |
javassist |
3.29.2-GA |
org.jboss.logging |
jboss-logging |
3.3.2.Final |
org.jdbi |
jdbi |
2.63.1 |
org.jetbrains |
annotations |
17.0.0 |
org.joda |
joda-convert |
1.7 |
org.jodd |
jodd-core |
3.5.2 |
org.json4s |
json4s-ast_2.12 |
3.7.0-M11 |
org.json4s |
json4s-core_2.12 |
3.7.0-M11 |
org.json4s |
json4s-jackson_2.12 |
3.7.0-M11 |
org.json4s |
json4s-scalap_2.12 |
3.7.0-M11 |
org.lz4 |
lz4-java |
1.8.0 |
org.mariadb.jdbc |
mariadb-java-client |
2.7.9 |
org.mlflow |
mlflow-spark |
2.2.0 |
org.objenesis |
objenesis |
2.5.1 |
org.postgresql |
postgresql |
42.6.0 |
org.roaringbitmap |
RoaringBitmap |
0.9.45 |
org.roaringbitmap |
shims |
0.9.45 |
org.rocksdb |
rocksdbjni |
8.3.2 |
org.rosuda.REngine |
REngine |
2.1.0 |
org.scala-lang |
scala-compiler_2.12 |
2.12.15 |
org.scala-lang |
scala-library_2.12 |
2.12.15 |
org.scala-lang |
scala-reflect_2.12 |
2.12.15 |
org.scala-lang.modules |
scala-collection-compat_2.12 |
2.9.0 |
org.scala-lang.modules |
scala-parser-combinators_2.12 |
1.1.2 |
org.scala-lang.modules |
scala-xml_2.12 |
1.2.0 |
org.scala-sbt |
test-interface |
1.0 |
org.scalacheck |
scalacheck_2.12 |
1.14.2 |
org.scalactic |
scalactic_2.12 |
3.2.15 |
org.scalanlp |
breeze-macros_2.12 |
2.1.0 |
org.scalanlp |
breeze_2.12 |
2.1.0 |
org.scalatest |
scalatest-compatible |
3.2.15 |
org.scalatest |
scalatest-core_2.12 |
3.2.15 |
org.scalatest |
scalatest-diagrams_2.12 |
3.2.15 |
org.scalatest |
scalatest-featurespec_2.12 |
3.2.15 |
org.scalatest |
scalatest-flatspec_2.12 |
3.2.15 |
org.scalatest |
scalatest-freespec_2.12 |
3.2.15 |
org.scalatest |
scalatest-funspec_2.12 |
3.2.15 |
org.scalatest |
scalatest-funsuite_2.12 |
3.2.15 |
org.scalatest |
scalatest-matchers-core_2.12 |
3.2.15 |
org.scalatest |
scalatest-mustmatchers_2.12 |
3.2.15 |
org.scalatest |
scalatest-propspec_2.12 |
3.2.15 |
org.scalatest |
scalatest-refspec_2.12 |
3.2.15 |
org.scalatest |
scalatest-shouldmatchers_2.12 |
3.2.15 |
org.scalatest |
scalatest-wordspec_2.12 |
3.2.15 |
org.scalatest |
scalatest_2.12 |
3.2.15 |
org.slf4j |
jcl-over-slf4j |
2.0.7 |
org.slf4j |
jul-to-slf4j |
2.0.7 |
org.slf4j |
slf4j-api |
2.0.7 |
org.threeten |
threeten-extra |
1.7.1 |
org.tukaani |
xz |
1.9 |
org.typelevel |
algebra_2.12 |
2.0.1 |
org.typelevel |
cats-kernel_2.12 |
2.1.1 |
org.typelevel |
spire-macros_2.12 |
0.17.0 |
org.typelevel |
spire-platform_2.12 |
0.17.0 |
org.typelevel |
spire-util_2.12 |
0.17.0 |
org.typelevel |
spire_2.12 |
0.17.0 |
org.wildfly.openssl |
wildfly-openssl |
1.1.3.Final |
org.xerial |
sqlite-jdbc |
3.42.0.0 |
org.xerial.snappy |
snappy-java |
1.1.10.3 |
org.yaml |
snakeyaml |
2.0 |
oro |
oro |
2.0.8 |
pl.edu.icm |
JLargeArrays |
1.5 |
software.amazon.cryptools |
AmazonCorrettoCryptoProvider |
1.6.1-linux-x86_64 |
software.amazon.ion |
ion-java |
1.0.2 |
stax |
stax-api |
1.0.1 |