Databricks Runtime 12.1

The following release notes provide information about Databricks Runtime 12.1, powered by Apache Spark 3.3.1.

Databricks released these images in January 2023.

New features and improvements

Support for protocol buffers is in Public Preview

You can use the from_protobuf and to_protobuf functions to exchange data between binary and struct types. See Read and write protocol buffers.

Support for Confluent Schema Registry authentication

Databricks integration with Confluent Schema Registry now supports external schema registry addresses with authentication. This feature is available for from_avro, to_avro, from_protobuf, and to_protobuf functions. See Protobuf or Avro.

Support for Delta Sharing Streaming

Spark Structured Streaming can now use a Delta Sharing table as a streaming source through the deltasharing format.
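As a rough sketch of how this might look from PySpark, the helpers below build the table URL that the Delta Sharing connector expects (`<profile-file>#<share>.<schema>.<table>`) and open a streaming read with the `deltasharing` format. The function names, the profile path, and the share, schema, and table names are placeholders invented for illustration. `spark` is assumed to be an existing SparkSession on a cluster where the connector is available.

```python
def sharing_table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the '<profile-file>#<share>.<schema>.<table>' URL for a shared table."""
    return f"{profile_path}#{share}.{schema}.{table}"


def read_shared_stream(spark, table_url: str):
    """Open a Structured Streaming read on a shared table.

    `spark` is an existing SparkSession (assumed; it is not created here).
    """
    return spark.readStream.format("deltasharing").load(table_url)


# Hypothetical profile file and table names, for illustration only.
url = sharing_table_url("/dbfs/config/open.share", "sales_share", "retail", "orders")
```

On a cluster, `read_shared_stream(spark, url)` would return a streaming DataFrame that can be passed to `writeStream` like any other source.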

Table version using timestamp now supported for Delta Sharing tables in catalogs

You can now use the SQL syntax TIMESTAMP AS OF in SELECT statements to specify the version of a Delta Sharing table that’s mounted in a catalog.
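A minimal sketch of such a query, built as a string in Python. The helper name `select_as_of` and the three-level table name are hypothetical; the statement would be run with `spark.sql(...)` on a cluster where the shared table is mounted in a catalog.

```python
from datetime import datetime


def select_as_of(table: str, ts: datetime) -> str:
    """Return a SELECT that reads `table` as it was at timestamp `ts`."""
    stamp = ts.strftime("%Y-%m-%d %H:%M:%S")
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{stamp}'"


# Hypothetical catalog.schema.table name, for illustration only.
query = select_as_of("shared_catalog.retail.orders", datetime(2023, 1, 15, 8, 30))
# On a cluster: spark.sql(query).show()
```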

Support for WHEN NOT MATCHED BY SOURCE for MERGE INTO

You can now add WHEN NOT MATCHED BY SOURCE clauses to MERGE INTO to update or delete rows in the target table that have no match in the source table under the merge condition. The new clause is available in SQL, Python, Scala, and Java. See MERGE INTO.
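To make the new clause concrete, here is a small sketch. `MERGE_SQL` shows one way the clause might appear alongside the existing ones; the table names `target` and `source` and the columns `id` and `value` are hypothetical. The `merge_model` function is a plain-Python model of that statement's row-level outcome on dictionaries, written for illustration only. It is not how Spark executes a merge.

```python
MERGE_SQL = """
MERGE INTO target AS t
USING source AS s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET t.value = s.value
WHEN NOT MATCHED THEN INSERT (id, value) VALUES (s.id, s.value)
WHEN NOT MATCHED BY SOURCE THEN DELETE
"""


def merge_model(target: dict, source: dict) -> dict:
    """Model the outcome of MERGE_SQL on {id: value} dictionaries."""
    result = {}
    for key in target:
        if key in source:
            result[key] = source[key]  # WHEN MATCHED: update from source
        # Otherwise WHEN NOT MATCHED BY SOURCE applies: the row is deleted,
        # so it is simply not copied into the result.
    for key, value in source.items():
        if key not in target:
            result[key] = value  # WHEN NOT MATCHED: insert the new row
    return result


merged = merge_model({1: "a", 2: "b"}, {2: "B", 3: "c"})
```

With these sample inputs, id 1 exists only in the target and is removed, id 2 is updated, and id 3 is inserted.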

Optimized statistics collection for CONVERT TO DELTA

Statistics collection for the CONVERT TO DELTA operation is now much faster. As a result, fewer workloads should need the NO STATISTICS option to keep conversion fast.
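For reference, the sketch below assembles a CONVERT TO DELTA statement with and without the NO STATISTICS option. The helper name `convert_to_delta_sql`, the path, and the partition column are hypothetical; the resulting string would be passed to `spark.sql(...)` on a cluster.

```python
from typing import Optional


def convert_to_delta_sql(path: str, collect_stats: bool = True,
                         partitioned_by: Optional[str] = None) -> str:
    """Build a CONVERT TO DELTA statement for a Parquet directory."""
    parts = [f"CONVERT TO DELTA parquet.`{path}`"]
    if not collect_stats:
        parts.append("NO STATISTICS")  # skip statistics collection
    if partitioned_by:
        parts.append(f"PARTITIONED BY ({partitioned_by})")
    return " ".join(parts)


# Hypothetical path and partition column, for illustration only.
with_stats = convert_to_delta_sql("/mnt/data/events")
without_stats = convert_to_delta_sql("/mnt/data/events", collect_stats=False,
                                     partitioned_by="date DATE")
```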

Library upgrades

  • Upgraded Python libraries:

    • filelock from 3.8.0 to 3.8.2

    • platformdirs from 2.5.4 to 2.6.0

    • setuptools from 58.0.4 to 61.2.0

  • Upgraded R libraries:

  • Upgraded Java libraries:

    • io.delta.delta-sharing-spark_2.12 from 0.5.2 to 0.6.2

    • org.apache.hive.hive-storage-api from 2.7.2 to 2.8.1

    • org.apache.parquet.parquet-column from 1.12.3-databricks-0001 to 1.12.3-databricks-0002

    • org.apache.parquet.parquet-common from 1.12.3-databricks-0001 to 1.12.3-databricks-0002

    • org.apache.parquet.parquet-encoding from 1.12.3-databricks-0001 to 1.12.3-databricks-0002

    • org.apache.parquet.parquet-format-structures from 1.12.3-databricks-0001 to 1.12.3-databricks-0002

    • org.apache.parquet.parquet-hadoop from 1.12.3-databricks-0001 to 1.12.3-databricks-0002

    • org.apache.parquet.parquet-jackson from 1.12.3-databricks-0001 to 1.12.3-databricks-0002

    • org.tukaani.xz from 1.8 to 1.9

Apache Spark

Databricks Runtime 12.1 includes Apache Spark 3.3.1. This release includes all Spark fixes and improvements included in Databricks Runtime 12.0, as well as the following additional bug fixes and improvements made to Spark:

  • [SPARK-41405] [SC-119769][12.1.0] Revert “[SC-119411][SQL] Centralize the column resolution logic” and “[SC-117170][SPARK-41338][SQL] Resolve outer references and normal columns in the same analyzer batch”

  • [SPARK-41405] [SC-119411][SQL] Centralize the column resolution logic

  • [SPARK-41859] [SC-119514][SQL] CreateHiveTableAsSelectCommand should set the overwrite flag correctly

  • [SPARK-41659] [SC-119526][CONNECT][12.X] Enable doctests in pyspark.sql.connect.readwriter

  • [SPARK-41858] [SC-119427][SQL] Fix ORC reader perf regression due to DEFAULT value feature

  • [SPARK-41807] [SC-119399][CORE] Remove non-existent error class: UNSUPPORTED_FEATURE.DISTRIBUTE_BY

  • [SPARK-41578] [12.x][SC-119273][SQL] Assign name to _LEGACY_ERROR_TEMP_2141

  • [SPARK-41571] [SC-119362][SQL] Assign name to _LEGACY_ERROR_TEMP_2310

  • [SPARK-41810] [SC-119373][CONNECT] Infer names from a list of dictionaries in SparkSession.createDataFrame

  • [SPARK-40993] [SC-119504][SPARK-41705][CONNECT][12.X] Move Spark Connect documentation and script to dev/ and Python documentation

  • [SPARK-41534] [SC-119456][CONNECT][SQL][12.x] Setup initial client module for Spark Connect

  • [SPARK-41365] [SC-118498][UI][3.3] Stages UI page fails to load for proxy in specific yarn environment

  • [SPARK-41481] [SC-118150][CORE][SQL] Reuse INVALID_TYPED_LITERAL instead of _LEGACY_ERROR_TEMP_0020

  • [SPARK-41049] [SC-119305][SQL] Revisit stateful expression handling

  • [SPARK-41726] [SC-119248][SQL] Remove OptimizedCreateHiveTableAsSelectCommand

  • [SPARK-41271] [SC-118648][SC-118348][SQL] Support parameterized SQL queries by sql()

  • [SPARK-41066] [SC-119344][CONNECT][PYTHON] Implement DataFrame.sampleBy and DataFrame.stat.sampleBy

  • [SPARK-41407] [SC-119402][SC-119012][SQL][ALL TESTS] Pull out v1 write to WriteFiles

  • [SPARK-41565] [SC-118868][SQL] Add the error class UNRESOLVED_ROUTINE

  • [SPARK-41668] [SC-118925][SQL] DECODE function returns wrong results when passed NULL

  • [SPARK-41554] [SC-119274] fix changing of Decimal scale when scale decreased by m…

  • [SPARK-41065] [SC-119324][CONNECT][PYTHON] Implement DataFrame.freqItems and DataFrame.stat.freqItems

  • [SPARK-41742] [SC-119404][SPARK-41745][CONNECT][12.X] Reenable doc tests and add missing column alias to count()

  • [SPARK-41069] [SC-119310][CONNECT][PYTHON] Implement DataFrame.approxQuantile and DataFrame.stat.approxQuantile

  • [SPARK-41809] [SC-119367][CONNECT][PYTHON] Make function from_json support DataType Schema

  • [SPARK-41804] [SC-119382][SQL] Choose correct element size in InterpretedUnsafeProjection for array of UDTs

  • [SPARK-41786] [SC-119308][CONNECT][PYTHON] Deduplicate helper functions

  • [SPARK-41745] [SC-119378][SPARK-41789][12.X] Make createDataFrame support list of Rows

  • [SPARK-41344] [SC-119217][SQL] Make error clearer when table not found in SupportsCatalogOptions catalog

  • [SPARK-41803] [SC-119380][CONNECT][PYTHON] Add missing function log(arg1, arg2)

  • [SPARK-41808] [SC-119356][CONNECT][PYTHON] Make JSON functions support options

  • [SPARK-41779] [SC-119275][SPARK-41771][CONNECT][PYTHON] Make __getitem__ support filter and select

  • [SPARK-41783] [SC-119288][SPARK-41770][CONNECT][PYTHON] Make column op support None

  • [SPARK-41440] [SC-119279][CONNECT][PYTHON] Avoid the cache operator for general Sample.

  • [SPARK-41785] [SC-119290][CONNECT][PYTHON] Implement GroupedData.mean

  • [SPARK-41629] [SC-119276][CONNECT] Support for Protocol Extensions in Relation and Expression

  • [SPARK-41417] [SC-118000][CORE][SQL] Rename _LEGACY_ERROR_TEMP_0019 to INVALID_TYPED_LITERAL

  • [SPARK-41533] [SC-119342][CONNECT][12.X] Proper Error Handling for Spark Connect Server / Client

  • [SPARK-41292] [SC-119357][CONNECT][12.X] Support Window in pyspark.sql.window namespace

  • [SPARK-41493] [SC-119339][CONNECT][PYTHON] Make csv functions support options

  • [SPARK-39591] [SC-118675][SS] Async Progress Tracking

  • [SPARK-41767] [SC-119337][CONNECT][PYTHON][12.X] Implement Column.{withField, dropFields}

  • [SPARK-41068] [SC-119268][CONNECT][PYTHON] Implement DataFrame.stat.corr

  • [SPARK-41655] [SC-119323][CONNECT][12.X] Enable doctests in pyspark.sql.connect.column

  • [SPARK-41738] [SC-119170][CONNECT] Mix ClientId in SparkSession cache

  • [SPARK-41354] [SC-119194][CONNECT] Add RepartitionByExpression to proto

  • [SPARK-41784] [SC-119289][CONNECT][PYTHON] Add missing __rmod__ in Column

  • [SPARK-41778] [SC-119262][SQL] Add an alias “reduce” to ArrayAggregate

  • [SPARK-41067] [SC-119171][CONNECT][PYTHON] Implement DataFrame.stat.cov

  • [SPARK-41764] [SC-119216][CONNECT][PYTHON] Make the internal string op name consistent with FunctionRegistry

  • [SPARK-41734] [SC-119160][CONNECT] Add a parent message for Catalog

  • [SPARK-41742] [SC-119263] Support df.groupBy().agg({"*":"count"})

  • [SPARK-41761] [SC-119213][CONNECT][PYTHON] Fix arithmetic ops: __neg__, __pow__, __rpow__

  • [SPARK-41062] [SC-118182][SQL] Rename UNSUPPORTED_CORRELATED_REFERENCE to CORRELATED_REFERENCE

  • [SPARK-41751] [SC-119211][CONNECT][PYTHON] Fix Column.{isNull, isNotNull, eqNullSafe}

  • [SPARK-41728] [SC-119164][CONNECT][PYTHON][12.X] Implement unwrap_udt function

  • [SPARK-41333] [SC-119195][SPARK-41737] Implement GroupedData.{min, max, avg, sum}

  • [SPARK-41751] [SC-119206][CONNECT][PYTHON] Fix Column.{bitwiseAND, bitwiseOR, bitwiseXOR}

  • [SPARK-41631] [SC-101081][SQL] Support implicit lateral column alias resolution on Aggregate

  • [SPARK-41529] [SC-119207][CONNECT][12.X] Implement SparkSession.stop

  • [SPARK-41729] [SC-119205][CORE][SQL][12.X] Rename _LEGACY_ERROR_TEMP_0011 to UNSUPPORTED_FEATURE.COMBINATION_QUERY_RESULT_CLAUSES

  • [SPARK-41717] [SC-119078][CONNECT][12.X] Deduplicate print and _repr_html_ at LogicalPlan

  • [SPARK-41740] [SC-119169][CONNECT][PYTHON] Implement Column.name

  • [SPARK-41733] [SC-119163][SQL][SS] Apply tree-pattern based pruning for the rule ResolveWindowTime

  • [SPARK-41732] [SC-119157][SQL][SS] Apply tree-pattern based pruning for the rule SessionWindowing

  • [SPARK-41498] [SC-119018] Propagate metadata through Union

  • [SPARK-41731] [SC-119166][CONNECT][PYTHON][12.X] Implement the column accessor

  • [SPARK-41736] [SC-119161][CONNECT][PYTHON] pyspark_types_to_proto_types should support ArrayType

  • [SPARK-41473] [SC-119092][CONNECT][PYTHON] Implement format_number function

  • [SPARK-41707] [SC-119141][CONNECT][12.X] Implement Catalog API in Spark Connect

  • [SPARK-41710] [SC-119062][CONNECT][PYTHON] Implement Column.between

  • [SPARK-41235] [SC-119088][SQL][PYTHON] High-order function: array_compact implementation

  • [SPARK-41518] [SC-118453][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2422

  • [SPARK-41723] [SC-119091][CONNECT][PYTHON] Implement sequence function

  • [SPARK-41703] [SC-119060][CONNECT][PYTHON] Combine NullType and typed_null in Literal

  • [SPARK-41722] [SC-119090][CONNECT][PYTHON] Implement 3 missing time window functions

  • [SPARK-41503] [SC-119043][CONNECT][PYTHON] Implement Partition Transformation Functions

  • [SPARK-41413] [SC-118968][SQL] Avoid shuffle in Storage-Partitioned Join when partition keys mismatch, but join expressions are compatible

  • [SPARK-41700] [SC-119046][CONNECT][PYTHON] Remove FunctionBuilder

  • [SPARK-41706] [SC-119094][CONNECT][PYTHON] pyspark_types_to_proto_types should support MapType

  • [SPARK-41702] [SC-119049][CONNECT][PYTHON] Add invalid column ops

  • [SPARK-41660] [SC-118866][SQL] Only propagate metadata columns if they are used

  • [SPARK-41637] [SC-119003][SQL] ORDER BY ALL

  • [SPARK-41513] [SC-118945][SQL] Implement an accumulator to collect per mapper row count metrics

  • [SPARK-41647] [SC-119064][CONNECT][12.X] Deduplicate docstrings in pyspark.sql.connect.functions

  • [SPARK-41701] [SC-119048][CONNECT][PYTHON] Make column op support decimal

  • [SPARK-41383] [SC-119015][SPARK-41692][SPARK-41693] Implement rollup, cube and pivot

  • [SPARK-41635] [SC-118944][SQL] GROUP BY ALL

  • [SPARK-41645] [SC-119057][CONNECT][12.X] Deduplicate docstrings in pyspark.sql.connect.dataframe

  • [SPARK-41688] [SC-118951][CONNECT][PYTHON] Move Expressions to expressions.py

  • [SPARK-41687] [SC-118949][CONNECT] Deduplicate docstrings in pyspark.sql.connect.group

  • [SPARK-41649] [SC-118950][CONNECT] Deduplicate docstrings in pyspark.sql.connect.window

  • [SPARK-41681] [SC-118939][CONNECT] Factor GroupedData out to group.py

  • [SPARK-41292] [SC-119038][SPARK-41640][SPARK-41641][CONNECT][PYTHON][12.X] Implement Window functions

  • [SPARK-41675] [SC-119031][SC-118934][CONNECT][PYTHON][12.X] Make Column op support datetime

  • [SPARK-41672] [SC-118929][CONNECT][PYTHON] Enable the deprecated functions

  • [SPARK-41673] [SC-118932][CONNECT][PYTHON] Implement Column.astype

  • [SPARK-41364] [SC-118865][CONNECT][PYTHON] Implement broadcast function

  • [SPARK-41648] [SC-118914][CONNECT][12.X] Deduplicate docstrings in pyspark.sql.connect.readwriter

  • [SPARK-41646] [SC-118915][CONNECT][12.X] Deduplicate docstrings in pyspark.sql.connect.session

  • [SPARK-41643] [SC-118862][CONNECT][12.X] Deduplicate docstrings in pyspark.sql.connect.column

  • [SPARK-41663] [SC-118936][CONNECT][PYTHON][12.X] Implement the rest of Lambda functions

  • [SPARK-41441] [SC-118557][SQL] Support Generate with no required child output to host outer references

  • [SPARK-41669] [SC-118923][SQL] Early pruning in canCollapseExpressions

  • [SPARK-41639] [SC-118927][SQL][PROTOBUF] Remove ScalaReflectionLock from SchemaConverters

  • [SPARK-41464] [SC-118861][CONNECT][PYTHON] Implement DataFrame.to

  • [SPARK-41434] [SC-118857][CONNECT][PYTHON] Initial LambdaFunction implementation

  • [SPARK-41539] [SC-118802][SQL] Remap stats and constraints against output in logical plan for LogicalRDD

  • [SPARK-41396] [SC-118786][SQL][PROTOBUF] OneOf field support and recursion checks

  • [SPARK-41528] [SC-118769][CONNECT][12.X] Merge namespace of Spark Connect and PySpark API

  • [SPARK-41568] [SC-118715][SQL] Assign name to _LEGACY_ERROR_TEMP_1236

  • [SPARK-41440] [SC-118788][CONNECT][PYTHON] Implement DataFrame.randomSplit

  • [SPARK-41583] [SC-118718][SC-118642][CONNECT][PROTOBUF] Add Spark Connect and protobuf into setup.py with specifying dependencies

  • [SPARK-27561] [SC-101081][12.x][SQL] Support implicit lateral column alias resolution on Project

  • [SPARK-41535] [SC-118645][SQL] Set null correctly for calendar interval fields in InterpretedUnsafeProjection and InterpretedMutableProjection

  • [SPARK-40687] [SC-118439][SQL] Support data masking built-in function ‘mask’

  • [SPARK-41520] [SC-118440][SQL] Split AND_OR TreePattern to separate AND and OR TreePatterns

  • [SPARK-41349] [SC-118668][CONNECT][PYTHON] Implement DataFrame.hint

  • [SPARK-41546] [SC-118541][CONNECT][PYTHON] pyspark_types_to_proto_types should support StructType.

  • [SPARK-41334] [SC-118549][CONNECT][PYTHON] Move SortOrder proto from relations to expressions

  • [SPARK-41387] [SC-118450][SS] Assert current end offset from Kafka data source for Trigger.AvailableNow

  • [SPARK-41508] [SC-118445][CORE][SQL] Rename _LEGACY_ERROR_TEMP_1180 to UNEXPECTED_INPUT_TYPE and remove _LEGACY_ERROR_TEMP_1179

  • [SPARK-41319] [SC-118441][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function when with UnresolvedFunction

  • [SPARK-41541] [SC-118460][SQL] Fix call to wrong child method in SQLShuffleWriteMetricsReporter.decRecordsWritten()

  • [SPARK-41453] [SC-118458][CONNECT][PYTHON] Implement DataFrame.subtract

  • [SPARK-41248] [SC-118436][SC-118303][SQL] Add “spark.sql.json.enablePartialResults” to enable/disable JSON partial results

  • [SPARK-41437] Revert “[SC-117601][SQL] Do not optimize the input query twice for v1 write fallback”

  • [SPARK-41472] [SC-118352][CONNECT][PYTHON] Implement the rest of string/binary functions

  • [SPARK-41526] [SC-118355][CONNECT][PYTHON] Implement Column.isin

  • [SPARK-32170] [SC-118384] [CORE] Improve the speculation through the stage task metrics.

  • [SPARK-41524] [SC-118399][SS] Differentiate SQLConf and extraOptions in StateStoreConf for its usage in RocksDBConf

  • [SPARK-41465] [SC-118381][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1235

  • [SPARK-41511] [SC-118365][SQL] LongToUnsafeRowMap support ignoresDuplicatedKey

  • [SPARK-41409] [SC-118302][CORE][SQL] Rename _LEGACY_ERROR_TEMP_1043 to WRONG_NUM_ARGS.WITHOUT_SUGGESTION

  • [SPARK-41438] [SC-118344][CONNECT][PYTHON] Implement DataFrame.colRegex

  • [SPARK-41437] [SC-117601][SQL] Do not optimize the input query twice for v1 write fallback

  • [SPARK-41314] [SC-117172][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1094

  • [SPARK-41443] [SC-118004][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1061

  • [SPARK-41506] [SC-118241][CONNECT][PYTHON] Refactor LiteralExpression to support DataType

  • [SPARK-41448] [SC-118046] Make consistent MR job IDs in FileBatchWriter and FileFormatWriter

  • [SPARK-41456] [SC-117970][SQL] Improve the performance of try_cast

  • [SPARK-41495] [SC-118125][CONNECT][PYTHON] Implement collection functions: P~Z

  • [SPARK-41478] [SC-118167][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1234

  • [SPARK-41406] [SC-118161][SQL] Refactor error message for NUM_COLUMNS_MISMATCH to make it more generic

  • [SPARK-41404] [SC-118016][SQL] Refactor ColumnVectorUtils#toBatch to make ColumnarBatchSuite#testRandomRows test more primitive dataType

  • [SPARK-41468] [SC-118044][SQL] Fix PlanExpression handling in EquivalentExpressions

  • [SPARK-40775] [SC-118045][SQL] Fix duplicate description entries for V2 file scans

  • [SPARK-41492] [SC-118042][CONNECT][PYTHON] Implement MISC functions

  • [SPARK-41459] [SC-118005][SQL] fix thrift server operation log output is empty

  • [SPARK-41395] [SC-117899][SQL] InterpretedMutableProjection should use setDecimal to set null values for decimals in an unsafe row

  • [SPARK-41376] [SC-117840][CORE][3.3] Correct the Netty preferDirectBufs check logic on executor start

  • [SPARK-41484] [SC-118159][SC-118036][CONNECT][PYTHON][12.x] Implement collection functions: E~M

  • [SPARK-41389] [SC-117426][CORE][SQL] Reuse WRONG_NUM_ARGS instead of _LEGACY_ERROR_TEMP_1044

  • [SPARK-41462] [SC-117920][SQL] Date and timestamp type can up cast to TimestampNTZ

  • [SPARK-41435] [SC-117810][SQL] Change to call invalidFunctionArgumentsError for curdate() when expressions is not empty

  • [SPARK-41187] [SC-118030][CORE] LiveExecutor MemoryLeak in AppStatusListener when ExecutorLost happen

  • [SPARK-41360] [SC-118083][CORE] Avoid BlockManager re-registration if the executor has been lost

  • [SPARK-41378] [SC-117686][SQL] Support Column Stats in DS v2

  • [SPARK-41402] [SC-117910][SQL][CONNECT][12.X] Override prettyName of StringDecode

  • [SPARK-41414] [SC-118041][CONNECT][PYTHON][12.x] Implement date/timestamp functions

  • [SPARK-41329] [SC-117975][CONNECT] Resolve circular imports in Spark Connect

  • [SPARK-41477] [SC-118025][CONNECT][PYTHON] Correctly infer the datatype of literal integers

  • [SPARK-41446] [SC-118024][CONNECT][PYTHON][12.x] Make createDataFrame support schema and more input dataset types

  • [SPARK-41475] [SC-117997][CONNECT] Fix lint-scala command error and typo

  • [SPARK-38277] [SC-117799][SS] Clear write batch after RocksDB state store’s commit

  • [SPARK-41375] [SC-117801][SS] Avoid empty latest KafkaSourceOffset

  • [SPARK-41412] [SC-118015][CONNECT] Implement Column.cast

  • [SPARK-41439] [SC-117893][CONNECT][PYTHON] Implement DataFrame.melt and DataFrame.unpivot

  • [SPARK-41399] [SC-118007][SC-117474][CONNECT] Refactor column related tests to test_connect_column

  • [SPARK-41351] [SC-117957][SC-117412][CONNECT][12.x] Column should support != operator

  • [SPARK-40697] [SC-117806][SC-112787][SQL] Add read-side char padding to cover external data files

  • [SPARK-41349] [SC-117594][CONNECT][12.X] Implement DataFrame.hint

  • [SPARK-41338] [SC-117170][SQL] Resolve outer references and normal columns in the same analyzer batch

  • [SPARK-41436] [SC-117805][CONNECT][PYTHON] Implement collection functions: A~C

  • [SPARK-41445] [SC-117802][CONNECT] Implement DataFrameReader.parquet

  • [SPARK-41452] [SC-117865][SQL] to_char should return null when format is null

  • [SPARK-41444] [SC-117796][CONNECT] Support read.json()

  • [SPARK-41398] [SC-117508][SQL] Relax constraints on Storage-Partitioned Join when partition keys after runtime filtering do not match

  • [SPARK-41228] [SC-117169][SQL] Rename & Improve error message for COLUMN_NOT_IN_GROUP_BY_CLAUSE.

  • [SPARK-41381] [SC-117593][CONNECT][PYTHON] Implement count_distinct and sum_distinct functions

  • [SPARK-41433] [SC-117596][CONNECT] Make Max Arrow BatchSize configurable

  • [SPARK-41397] [SC-117590][CONNECT][PYTHON] Implement part of string/binary functions

  • [SPARK-41382] [SC-117588][CONNECT][PYTHON] Implement product function

  • [SPARK-41403] [SC-117595][CONNECT][PYTHON] Implement DataFrame.describe

  • [SPARK-41366] [SC-117580][CONNECT] DF.groupby.agg() should be compatible

  • [SPARK-41369] [SC-117584][CONNECT] Add connect common to servers’ shaded jar

  • [SPARK-41411] [SC-117562][SS] Multi-Stateful Operator watermark support bug fix

  • [SPARK-41176] [SC-116630][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1042

  • [SPARK-41380] [SC-117476][CONNECT][PYTHON][12.X] Implement aggregation functions

  • [SPARK-41363] [SC-117470][CONNECT][PYTHON][12.X] Implement normal functions

  • [SPARK-41305] [SC-117411][CONNECT] Improve Documentation for Command proto

  • [SPARK-41372] [SC-117427][CONNECT][PYTHON] Implement DataFrame TempView

  • [SPARK-41379] [SC-117420][SS][PYTHON] Provide cloned spark session in DataFrame in user function for foreachBatch sink in PySpark

  • [SPARK-41373] [SC-117405][SQL][ERROR] Rename CAST_WITH_FUN_SUGGESTION to CAST_WITH_FUNC_SUGGESTION

  • [SPARK-41358] [SC-117417][SQL] Refactor ColumnVectorUtils#populate method to use PhysicalDataType instead of DataType

  • [SPARK-41355] [SC-117423][SQL] Workaround hive table name validation issue

  • [SPARK-41390] [SC-117429][SQL] Update the script used to generate register function in UDFRegistration

  • [SPARK-41206] [SC-117233][SC-116381][SQL] Rename the error class _LEGACY_ERROR_TEMP_1233 to COLUMN_ALREADY_EXISTS

  • [SPARK-41357] [SC-117310][CONNECT][PYTHON][12.X] Implement math functions

  • [SPARK-40970] [SC-117308][CONNECT][PYTHON] Support List[Column] for Join’s on argument

  • [SPARK-41345] [SC-117178][CONNECT] Add Hint to Connect Proto

  • [SPARK-41226] [SC-117194][SQL][12.x] Refactor Spark types by introducing physical types

  • [SPARK-41317] [SC-116902][CONNECT][PYTHON][12.X] Add basic support for DataFrameWriter

  • [SPARK-41347] [SC-117173][CONNECT] Add Cast to Expression proto

  • [SPARK-41323] [SC-117128][SQL] Support current_schema

  • [SPARK-41339] [SC-117171][SQL] Close and recreate RocksDB write batch instead of just clearing

  • [SPARK-41227] [SC-117165][CONNECT][PYTHON] Implement DataFrame cross join

  • [SPARK-41346] [SC-117176][CONNECT][PYTHON] Implement asc and desc functions

  • [SPARK-41343] [SC-117166][CONNECT] Move FunctionName parsing to server side

  • [SPARK-41321] [SC-117163][CONNECT] Support target field for UnresolvedStar

  • [SPARK-41237] [SC-117167][SQL] Reuse the error class UNSUPPORTED_DATATYPE for _LEGACY_ERROR_TEMP_0030

  • [SPARK-41309] [SC-116916][SQL] Reuse INVALID_SCHEMA.NON_STRING_LITERAL instead of _LEGACY_ERROR_TEMP_1093

  • [SPARK-41276] [SC-117136][SQL][ML][MLLIB][PROTOBUF][PYTHON][R][SS][AVRO] Optimize constructor use of StructType

  • [SPARK-41335] [SC-117135][CONNECT][PYTHON] Support IsNull and IsNotNull in Column

  • [SPARK-41332] [SC-117131][CONNECT][PYTHON] Fix nullOrdering in SortOrder

  • [SPARK-41325] [SC-117132][CONNECT][12.X] Fix missing avg() for GroupBy on DF

  • [SPARK-41327] [SC-117137][CORE] Fix SparkStatusTracker.getExecutorInfos by switch On/OffHeapStorageMemory info

  • [SPARK-41315] [SC-117129][CONNECT][PYTHON] Implement DataFrame.replace and DataFrame.na.replace

  • [SPARK-41328] [SC-117125][CONNECT][PYTHON] Add logical and string API to Column

  • [SPARK-41331] [SC-117127][CONNECT][PYTHON] Add orderBy and drop_duplicates

  • [SPARK-40987] [SC-117124][CORE] BlockManager#removeBlockInternal should ensure the lock is unlocked gracefully

  • [SPARK-41268] [SC-117102][SC-116970][CONNECT][PYTHON] Refactor “Column” for API Compatibility

  • [SPARK-41312] [SC-116881][CONNECT][PYTHON][12.X] Implement DataFrame.withColumnRenamed

  • [SPARK-41221] [SC-116607][SQL] Add the error class INVALID_FORMAT

  • [SPARK-41272] [SC-116742][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2019

  • [SPARK-41180] [SC-116760][SQL] Reuse INVALID_SCHEMA instead of _LEGACY_ERROR_TEMP_1227

  • [SPARK-41260] [SC-116880][PYTHON][SS][12.X] Cast NumPy instances to Python primitive types in GroupState update

  • [SPARK-41174] [SC-116609][CORE][SQL] Propagate an error class to users for invalid format of to_binary()

  • [SPARK-41264] [SC-116971][CONNECT][PYTHON] Make Literal support more datatypes

  • [SPARK-41326] [SC-116972] [CONNECT] Fix deduplicate is missing input

  • [SPARK-41316] [SC-116900][SQL] Enable tail-recursion wherever possible

  • [SPARK-41297] [SC-116931] [CONNECT] [PYTHON] Support String Expressions in filter.

  • [SPARK-41256] [SC-116932][SC-116883][CONNECT] Implement DataFrame.withColumn(s)

  • [SPARK-41182] [SC-116632][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1102

  • [SPARK-41181] [SC-116680][SQL] Migrate the map options errors onto error classes

  • [SPARK-40940] [SC-115993][12.x] Remove Multi-stateful operator checkers for streaming queries.

  • [SPARK-41310] [SC-116885][CONNECT][PYTHON] Implement DataFrame.toDF

  • [SPARK-41179] [SC-116631][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1092

  • [SPARK-41003] [SC-116741][SQL] BHJ LeftAnti does not update numOutputRows when codegen is disabled

  • [SPARK-41148] [SC-116878][CONNECT][PYTHON] Implement DataFrame.dropna and DataFrame.na.drop

  • [SPARK-41217] [SC-116380][SQL] Add the error class FAILED_FUNCTION_CALL

  • [SPARK-41308] [SC-116875][CONNECT][PYTHON] Improve DataFrame.count()

  • [SPARK-41301] [SC-116786] [CONNECT] Homogenize Behavior for SparkSession.range()

  • [SPARK-41306] [SC-116860][CONNECT] Improve Connect Expression proto documentation

  • [SPARK-41280] [SC-116733][CONNECT] Implement DataFrame.tail

  • [SPARK-41300] [SC-116751] [CONNECT] Unset schema is interpreted as Schema

  • [SPARK-41255] [SC-116730][SC-116695] [CONNECT] Rename RemoteSparkSession

  • [SPARK-41250] [SC-116788][SC-116633][CONNECT][PYTHON] DataFrame.toPandas should not return optional pandas dataframe

  • [SPARK-41291] [SC-116738][CONNECT][PYTHON] DataFrame.explain should print and return None

  • [SPARK-41278] [SC-116732][CONNECT] Clean up unused QualifiedAttribute in Expression.proto

  • [SPARK-41097] [SC-116653][CORE][SQL][SS][PROTOBUF] Remove redundant collection conversion base on Scala 2.13 code

  • [SPARK-41261] [SC-116718][PYTHON][SS] Fix issue for applyInPandasWithState when the columns of grouping keys are not placed in order from earliest

  • [SPARK-40872] [SC-116717][3.3] Fallback to original shuffle block when a push-merged shuffle chunk is zero-size

  • [SPARK-41114] [SC-116628][CONNECT] Support local data for LocalRelation

  • [SPARK-41216] [SC-116678][CONNECT][PYTHON] Implement DataFrame.{isLocal, isStreaming, printSchema, inputFiles}

  • [SPARK-41238] [SC-116670][CONNECT][PYTHON] Support more built-in datatypes

  • [SPARK-41230] [SC-116674][CONNECT][PYTHON] Remove str from Aggregate expression type

  • [SPARK-41224] [SC-116652][SPARK-41165][SPARK-41184][CONNECT] Optimized Arrow-based collect implementation to stream from server to client

  • [SPARK-41222] [SC-116625][CONNECT][PYTHON] Unify the typing definitions

  • [SPARK-41225] [SC-116623] [CONNECT] [PYTHON] Disable unsupported functions.

  • [SPARK-41201] [SC-116526][CONNECT][PYTHON] Implement DataFrame.SelectExpr in Python client

  • [SPARK-41203] [SC-116258] [CONNECT] Support DataFrame.transform in Python client.

  • [SPARK-41213] [SC-116375][CONNECT][PYTHON] Implement DataFrame.__repr__ and DataFrame.dtypes

  • [SPARK-41169] [SC-116378][CONNECT][PYTHON] Implement DataFrame.drop

  • [SPARK-41172] [SC-116245][SQL] Migrate the ambiguous ref error to an error class

  • [SPARK-41122] [SC-116141][CONNECT] Explain API can support different modes

  • [SPARK-41209] [SC-116584][SC-116376][PYTHON] Improve PySpark type inference in _merge_type method

  • [SPARK-41196] [SC-116555][SC-116179] [CONNECT] Homogenize the protobuf version across the Spark connect server to use the same major version.

  • [SPARK-35531] [SC-116409][SQL] Update hive table stats without unnecessary convert

  • [SPARK-41154] [SC-116289][SQL] Incorrect relation caching for queries with time travel spec

  • [SPARK-41212] [SC-116554][SC-116389][CONNECT][PYTHON] Implement DataFrame.isEmpty

  • [SPARK-41135] [SC-116400][SQL] Rename UNSUPPORTED_EMPTY_LOCATION to INVALID_EMPTY_LOCATION

  • [SPARK-41183] [SC-116265][SQL] Add an extension API to do plan normalization for caching

  • [SPARK-41054] [SC-116447][UI][CORE] Support RocksDB as KVStore in live UI

  • [SPARK-38550] [SC-115223] Revert “[SQL][CORE] Use a disk-based store to save more debug information for live UI”

  • [SPARK-41173] [SC-116185][SQL] Move require() out from the constructors of string expressions

  • [SPARK-41188] [SC-116242][CORE][ML] Set executorEnv OMP_NUM_THREADS to be spark.task.cpus by default for spark executor JVM processes

  • [SPARK-41130] [SC-116155][SQL] Rename OUT_OF_DECIMAL_TYPE_RANGE to NUMERIC_OUT_OF_SUPPORTED_RANGE

  • [SPARK-41175] [SC-116238][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1078

  • [SPARK-41106] [SC-116073][SQL] Reduce collection conversion when create AttributeMap

  • [SPARK-41139] [SC-115983][SQL] Improve error class: PYTHON_UDF_IN_ON_CLAUSE

  • [SPARK-40657] [SC-115997][PROTOBUF] Require shading for Java class jar, improve error handling

  • [SPARK-40999] [SC-116168] Hint propagation to subqueries

  • [SPARK-41017] [SC-116054][SQL] Support column pruning with multiple nondeterministic Filters

  • [SPARK-40834] [SC-114773][SQL] Use SparkListenerSQLExecutionEnd to track final SQL status in UI

  • [SPARK-41118] [SC-116027][SQL] to_number/try_to_number should return null when format is null

  • [SPARK-39799] [SC-115984][SQL] DataSourceV2: View catalog interface

  • [SPARK-40665] [SC-116210][SC-112300][CONNECT] Avoid embedding Spark Connect in the Apache Spark binary release

  • [SPARK-41048] [SC-116043][SQL] Improve output partitioning and ordering with AQE cache

  • [SPARK-41198] [SC-116256][SS] Fix metrics in streaming query having CTE and DSv1 streaming source

  • [SPARK-41199] [SC-116244][SS] Fix metrics issue when DSv1 streaming source and DSv2 streaming source are co-used

  • [SPARK-40957] [SC-116261][SC-114706] Add in memory cache in HDFSMetadataLog

  • [SPARK-40940] Revert “[SC-115993] Remove Multi-stateful operator checkers for streaming queries.”

  • [SPARK-41090] [SC-116040][SQL] Throw Exception for db_name.view_name when creating temp view by Dataset API

  • [SPARK-41133] [SC-116085][SQL] Integrate UNSCALED_VALUE_TOO_LARGE_FOR_PRECISION into NUMERIC_VALUE_OUT_OF_RANGE

  • [SPARK-40557] [SC-116182][SC-111442][CONNECT] Code Dump 9 Commits

  • [SPARK-40448] [SC-114447][SC-111314][CONNECT] Spark Connect build as Driver Plugin with Shaded Dependencies

  • [SPARK-41096] [SC-115812][SQL] Support reading parquet FIXED_LEN_BYTE_ARRAY type

  • [SPARK-41140] [SC-115879][SQL] Rename the error class _LEGACY_ERROR_TEMP_2440 to INVALID_WHERE_CONDITION

  • [SPARK-40918] [SC-114438][SQL] Mismatch between FileSourceScanExec and Orc and ParquetFileFormat on producing columnar output

  • [SPARK-41155] [SC-115991][SQL] Add error message to SchemaColumnConvertNotSupportedException

  • [SPARK-40940] [SC-115993] Remove Multi-stateful operator checkers for streaming queries.

  • [SPARK-41098] [SC-115790][SQL] Rename GROUP_BY_POS_REFERS_AGG_EXPR to GROUP_BY_POS_AGGREGATE

  • [SPARK-40755] [SC-115912][SQL] Migrate type check failures of number formatting onto error classes

  • [SPARK-41059] [SC-115658][SQL] Rename _LEGACY_ERROR_TEMP_2420 to NESTED_AGGREGATE_FUNCTION

  • [SPARK-41044] [SC-115662][SQL] Convert DATATYPE_MISMATCH.UNSPECIFIED_FRAME to INTERNAL_ERROR

  • [SPARK-40973] [SC-115132][SQL] Rename _LEGACY_ERROR_TEMP_0055 to UNCLOSED_BRACKETED_COMMENT

System environment

  • Operating System: Ubuntu 20.04.5 LTS

  • Java: Zulu 8.64.0.19-CA-linux64

  • Scala: 2.12.14

  • Python: 3.9.5

  • R: 4.2.2

  • Delta Lake: 2.2.0

Installed Python libraries

| Library | Version | Library | Version | Library | Version |
|---|---|---|---|---|---|
| argon2-cffi | 21.3.0 | argon2-cffi-bindings | 21.2.0 | asttokens | 2.0.5 |
| attrs | 21.4.0 | backcall | 0.2.0 | backports.entry-points-selectable | 1.2.0 |
| beautifulsoup4 | 4.11.1 | black | 22.3.0 | bleach | 4.1.0 |
| boto3 | 1.21.32 | botocore | 1.24.32 | certifi | 2021.10.8 |
| cffi | 1.15.0 | chardet | 4.0.0 | charset-normalizer | 2.0.4 |
| click | 8.0.4 | cryptography | 3.4.8 | cycler | 0.11.0 |
| Cython | 0.29.28 | dbus-python | 1.2.16 | debugpy | 1.5.1 |
| decorator | 5.1.1 | defusedxml | 0.7.1 | distlib | 0.3.6 |
| docstring-to-markdown | 0.11 | entrypoints | 0.4 | executing | 0.8.3 |
| facets-overview | 1.0.0 | fastjsonschema | 2.16.2 | filelock | 3.8.2 |
| fonttools | 4.25.0 | idna | 3.3 | ipykernel | 6.15.3 |
| ipython | 8.5.0 | ipython-genutils | 0.2.0 | ipywidgets | 7.7.2 |
| jedi | 0.18.1 | Jinja2 | 2.11.3 | jmespath | 0.10.0 |
| joblib | 1.1.0 | jsonschema | 4.4.0 | jupyter-client | 6.1.12 |
| jupyter_core | 4.11.2 | jupyterlab-pygments | 0.1.2 | jupyterlab-widgets | 1.0.0 |
| kiwisolver | 1.3.2 | MarkupSafe | 2.0.1 | matplotlib | 3.5.1 |
| matplotlib-inline | 0.1.2 | mccabe | 0.7.0 | mistune | 0.8.4 |
| mypy-extensions | 0.4.3 | nbclient | 0.5.13 | nbconvert | 6.4.4 |
| nbformat | 5.3.0 | nest-asyncio | 1.5.5 | nodeenv | 1.7.0 |
| notebook | 6.4.8 | numpy | 1.21.5 | packaging | 21.3 |
| pandas | 1.4.2 | pandocfilters | 1.5.0 | parso | 0.8.3 |
| pathspec | 0.9.0 | patsy | 0.5.2 | pexpect | 4.8.0 |
| pickleshare | 0.7.5 | Pillow | 9.0.1 | pip | 21.2.4 |
| platformdirs | 2.6.0 | plotly | 5.6.0 | pluggy | 1.0.0 |
| prometheus-client | 0.13.1 | prompt-toolkit | 3.0.20 | protobuf | 3.19.4 |
| psutil | 5.8.0 | psycopg2 | 2.9.3 | ptyprocess | 0.7.0 |
| pure-eval | 0.2.2 | pyarrow | 7.0.0 | pycparser | 2.21 |
| pyflakes | 2.5.0 | Pygments | 2.11.2 | PyGObject | 3.36.0 |
| pyodbc | 4.0.32 | pyparsing | 3.0.4 | pyright | 1.1.283 |
| pyrsistent | 0.18.0 | python-dateutil | 2.8.2 | python-lsp-jsonrpc | 1.0.0 |
| python-lsp-server | 1.6.0 | pytz | 2021.3 | pyzmq | 22.3.0 |
| requests | 2.27.1 | requests-unixsocket | 0.2.0 | rope | 0.22.0 |
| s3transfer | 0.5.0 | scikit-learn | 1.0.2 | scipy | 1.7.3 |
| seaborn | 0.11.2 | Send2Trash | 1.8.0 | setuptools | 61.2.0 |
| six | 1.16.0 | soupsieve | 2.3.1 | ssh-import-id | 5.10 |
| stack-data | 0.2.0 | statsmodels | 0.13.2 | tenacity | 8.0.1 |
| terminado | 0.13.1 | testpath | 0.5.0 | threadpoolctl | 2.2.0 |
| tokenize-rt | 4.2.1 | tomli | 1.2.2 | tornado | 6.1 |

traitlets

5.1.1

typing_extensions

4.1.1

ujson

5.1.0

unattended-upgrades

0.1

urllib3

1.26.9

virtualenv

20.8.0

wcwidth

0.2.5

webencodings

0.5.1

whatthepatch

1.0.3

wheel

0.37.0

widgetsnbextension

3.6.1

yapf

0.31.0
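To confirm that an environment still matches the Python library versions listed above (for example, after a notebook installs extra packages), the installed versions can be compared against a few pins. The following is a minimal sketch, not a Databricks API: the `EXPECTED` dictionary and the `find_mismatches` helper are hypothetical names introduced for illustration, and the only library call used is the standard `importlib.metadata.version`.

```python
from importlib import metadata

# A few pins copied from the list above; extend as needed.
EXPECTED = {
    "numpy": "1.21.5",
    "pandas": "1.4.2",
    "pyarrow": "7.0.0",
}


def find_mismatches(expected, get_version=metadata.version):
    """Return {name: (expected, installed)} for every package that differs.

    `installed` is None when the package is not installed. `get_version`
    is injectable so the function can be exercised without a real
    environment.
    """
    mismatches = {}
    for name, want in expected.items():
        try:
            have = get_version(name)
        except metadata.PackageNotFoundError:
            have = None
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    for name, (want, have) in sorted(find_mismatches(EXPECTED).items()):
        print(f"{name}: release notes list {want}, environment has {have}")
```

An empty result means every pinned package matches; a package reported with `None` is not installed at all.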

Installed R libraries

R libraries are installed from the Microsoft CRAN snapshot on 2022-11-11.

Library | Version
arrow | 10.0.0
askpass | 1.1
assertthat | 0.2.1
backports | 1.4.1
base | 4.2.2
base64enc | 0.1-3
bit | 4.0.4
bit64 | 4.0.5
blob | 1.2.3
boot | 1.3-28
brew | 1.0-8
brio | 1.1.3
broom | 1.0.1
bslib | 0.4.1
cachem | 1.0.6
callr | 3.7.3
caret | 6.0-93
cellranger | 1.1.0
chron | 2.3-58
class | 7.3-20
cli | 3.4.1
clipr | 0.8.0
clock | 0.6.1
cluster | 2.1.4
codetools | 0.2-18
colorspace | 2.0-3
commonmark | 1.8.1
compiler | 4.2.2
config | 0.3.1
cpp11 | 0.4.3
crayon | 1.5.2
credentials | 1.3.2
curl | 4.3.3
data.table | 1.14.4
datasets | 4.2.2
DBI | 1.1.3
dbplyr | 2.2.1
desc | 1.4.2
devtools | 2.4.5
diffobj | 0.3.5
digest | 0.6.30
downlit | 0.4.2
dplyr | 1.0.10
dtplyr | 1.2.2
e1071 | 1.7-12
ellipsis | 0.3.2
evaluate | 0.18
fansi | 1.0.3
farver | 2.1.1
fastmap | 1.1.0
fontawesome | 0.4.0
forcats | 0.5.2
foreach | 1.5.2
foreign | 0.8-82
forge | 0.2.0
fs | 1.5.2
future | 1.29.0
future.apply | 1.10.0
gargle | 1.2.1
generics | 0.1.3
gert | 1.9.1
ggplot2 | 3.4.0
gh | 1.3.1
gitcreds | 0.1.2
glmnet | 4.1-4
globals | 0.16.1
glue | 1.6.2
googledrive | 2.0.0
googlesheets4 | 1.0.1
gower | 1.0.0
graphics | 4.2.2
grDevices | 4.2.2
grid | 4.2.2
gridExtra | 2.3
gsubfn | 0.7
gtable | 0.3.1
hardhat | 1.2.0
haven | 2.5.1
highr | 0.9
hms | 1.1.2
htmltools | 0.5.3
htmlwidgets | 1.5.4
httpuv | 1.6.6
httr | 1.4.4
ids | 1.0.1
ini | 0.3.1
ipred | 0.9-13
isoband | 0.2.6
iterators | 1.0.14
jquerylib | 0.1.4
jsonlite | 1.8.3
KernSmooth | 2.23-20
knitr | 1.40
labeling | 0.4.2
later | 1.3.0
lattice | 0.20-45
lava | 1.7.0
lifecycle | 1.0.3
listenv | 0.8.0
lubridate | 1.9.0
magrittr | 2.0.3
markdown | 1.3
MASS | 7.3-58
Matrix | 1.5-1
memoise | 2.0.1
methods | 4.2.2
mgcv | 1.8-41
mime | 0.12
miniUI | 0.1.1.1
ModelMetrics | 1.2.2.2
modelr | 0.1.9
munsell | 0.5.0
nlme | 3.1-160
nnet | 7.3-18
numDeriv | 2016.8-1.1
openssl | 2.0.4
parallel | 4.2.2
parallelly | 1.32.1
pillar | 1.8.1
pkgbuild | 1.3.1
pkgconfig | 2.0.3
pkgdown | 2.0.6
pkgload | 1.3.1
plogr | 0.2.0
plyr | 1.8.7
praise | 1.0.0
prettyunits | 1.1.1
pROC | 1.18.0
processx | 3.8.0
prodlim | 2019.11.13
profvis | 0.3.7
progress | 1.2.2
progressr | 0.11.0
promises | 1.2.0.1
proto | 1.0.0
proxy | 0.4-27
ps | 1.7.2
purrr | 0.3.5
r2d3 | 0.2.6
R6 | 2.5.1
ragg | 1.2.4
randomForest | 4.7-1.1
rappdirs | 0.3.3
rcmdcheck | 1.4.0
RColorBrewer | 1.1-3
Rcpp | 1.0.9
RcppEigen | 0.3.3.9.3
readr | 2.1.3
readxl | 1.4.1
recipes | 1.0.3
rematch | 1.0.1
rematch2 | 2.1.2
remotes | 2.4.2
reprex | 2.0.2
reshape2 | 1.4.4
rlang | 1.0.6
rmarkdown | 2.18
RODBC | 1.3-19
roxygen2 | 7.2.1
rpart | 4.1.19
rprojroot | 2.0.3
Rserve | 1.8-11
RSQLite | 2.2.18
rstudioapi | 0.14
rversions | 2.1.2
rvest | 1.0.3
sass | 0.4.2
scales | 1.2.1
selectr | 0.4-2
sessioninfo | 1.2.2
shape | 1.4.6
shiny | 1.7.3
sourcetools | 0.1.7
sparklyr | 1.7.8
SparkR | 3.3.1
spatial | 7.3-11
splines | 4.2.2
sqldf | 0.4-11
SQUAREM | 2021.1
stats | 4.2.2
stats4 | 4.2.2
stringi | 1.7.8
stringr | 1.4.1
survival | 3.4-0
sys | 3.4.1
systemfonts | 1.0.4
tcltk | 4.2.2
testthat | 3.1.5
textshaping | 0.3.6
tibble | 3.1.8
tidyr | 1.2.1
tidyselect | 1.2.0
tidyverse | 1.3.2
timechange | 0.1.1
timeDate | 4021.106
tinytex | 0.42
tools | 4.2.2
tzdb | 0.3.0
urlchecker | 1.0.1
usethis | 2.1.6
utf8 | 1.2.2
utils | 4.2.2
uuid | 1.1-0
vctrs | 0.5.0
viridisLite | 0.4.1
vroom | 1.6.0
waldo | 0.4.0
whisker | 0.4
withr | 2.5.0
xfun | 0.34
xml2 | 1.3.3
xopen | 1.0.0
xtable | 1.8-4
yaml | 2.3.6
zip | 2.2.2

Installed Java and Scala libraries (Scala 2.12 cluster version)

Group ID | Artifact ID | Version
antlr | antlr | 2.7.7
com.amazonaws | amazon-kinesis-client | 1.12.0
com.amazonaws | aws-java-sdk-autoscaling | 1.12.189
com.amazonaws | aws-java-sdk-cloudformation | 1.12.189
com.amazonaws | aws-java-sdk-cloudfront | 1.12.189
com.amazonaws | aws-java-sdk-cloudhsm | 1.12.189
com.amazonaws | aws-java-sdk-cloudsearch | 1.12.189
com.amazonaws | aws-java-sdk-cloudtrail | 1.12.189
com.amazonaws | aws-java-sdk-cloudwatch | 1.12.189
com.amazonaws | aws-java-sdk-cloudwatchmetrics | 1.12.189
com.amazonaws | aws-java-sdk-codedeploy | 1.12.189
com.amazonaws | aws-java-sdk-cognitoidentity | 1.12.189
com.amazonaws | aws-java-sdk-cognitosync | 1.12.189
com.amazonaws | aws-java-sdk-config | 1.12.189
com.amazonaws | aws-java-sdk-core | 1.12.189
com.amazonaws | aws-java-sdk-datapipeline | 1.12.189
com.amazonaws | aws-java-sdk-directconnect | 1.12.189
com.amazonaws | aws-java-sdk-directory | 1.12.189
com.amazonaws | aws-java-sdk-dynamodb | 1.12.189
com.amazonaws | aws-java-sdk-ec2 | 1.12.189
com.amazonaws | aws-java-sdk-ecs | 1.12.189
com.amazonaws | aws-java-sdk-efs | 1.12.189
com.amazonaws | aws-java-sdk-elasticache | 1.12.189
com.amazonaws | aws-java-sdk-elasticbeanstalk | 1.12.189
com.amazonaws | aws-java-sdk-elasticloadbalancing | 1.12.189
com.amazonaws | aws-java-sdk-elastictranscoder | 1.12.189
com.amazonaws | aws-java-sdk-emr | 1.12.189
com.amazonaws | aws-java-sdk-glacier | 1.12.189
com.amazonaws | aws-java-sdk-glue | 1.12.189
com.amazonaws | aws-java-sdk-iam | 1.12.189
com.amazonaws | aws-java-sdk-importexport | 1.12.189
com.amazonaws | aws-java-sdk-kinesis | 1.12.189
com.amazonaws | aws-java-sdk-kms | 1.12.189
com.amazonaws | aws-java-sdk-lambda | 1.12.189
com.amazonaws | aws-java-sdk-logs | 1.12.189
com.amazonaws | aws-java-sdk-machinelearning | 1.12.189
com.amazonaws | aws-java-sdk-opsworks | 1.12.189
com.amazonaws | aws-java-sdk-rds | 1.12.189
com.amazonaws | aws-java-sdk-redshift | 1.12.189
com.amazonaws | aws-java-sdk-route53 | 1.12.189
com.amazonaws | aws-java-sdk-s3 | 1.12.189
com.amazonaws | aws-java-sdk-ses | 1.12.189
com.amazonaws | aws-java-sdk-simpledb | 1.12.189
com.amazonaws | aws-java-sdk-simpleworkflow | 1.12.189
com.amazonaws | aws-java-sdk-sns | 1.12.189
com.amazonaws | aws-java-sdk-sqs | 1.12.189
com.amazonaws | aws-java-sdk-ssm | 1.12.189
com.amazonaws | aws-java-sdk-storagegateway | 1.12.189
com.amazonaws | aws-java-sdk-sts | 1.12.189
com.amazonaws | aws-java-sdk-support | 1.12.189
com.amazonaws | aws-java-sdk-swf-libraries | 1.11.22
com.amazonaws | aws-java-sdk-workspaces | 1.12.189
com.amazonaws | jmespath-java | 1.12.189
com.chuusai | shapeless_2.12 | 2.3.3
com.clearspring.analytics | stream | 2.9.6
com.databricks | Rserve | 1.8-3
com.databricks | jets3t | 0.7.1-0
com.databricks.scalapb | compilerplugin_2.12 | 0.4.15-10
com.databricks.scalapb | scalapb-runtime_2.12 | 0.4.15-10
com.esotericsoftware | kryo-shaded | 4.0.2
com.esotericsoftware | minlog | 1.3.0
com.fasterxml | classmate | 1.3.4
com.fasterxml.jackson.core | jackson-annotations | 2.13.4
com.fasterxml.jackson.core | jackson-core | 2.13.4
com.fasterxml.jackson.core | jackson-databind | 2.13.4.2
com.fasterxml.jackson.dataformat | jackson-dataformat-cbor | 2.13.4
com.fasterxml.jackson.datatype | jackson-datatype-joda | 2.13.4
com.fasterxml.jackson.datatype | jackson-datatype-jsr310 | 2.13.4
com.fasterxml.jackson.module | jackson-module-paranamer | 2.13.4
com.fasterxml.jackson.module | jackson-module-scala_2.12 | 2.13.4
com.github.ben-manes.caffeine | caffeine | 2.3.4
com.github.fommil | jniloader | 1.1
com.github.fommil.netlib | core | 1.1.2
com.github.fommil.netlib | native_ref-java | 1.1
com.github.fommil.netlib | native_ref-java-natives | 1.1
com.github.fommil.netlib | native_system-java | 1.1
com.github.fommil.netlib | native_system-java-natives | 1.1
com.github.fommil.netlib | netlib-native_ref-linux-x86_64-natives | 1.1
com.github.fommil.netlib | netlib-native_system-linux-x86_64-natives | 1.1
com.github.luben | zstd-jni | 1.5.2-1
com.github.wendykierp | JTransforms | 3.1
com.google.code.findbugs | jsr305 | 3.0.0
com.google.code.gson | gson | 2.8.6
com.google.crypto.tink | tink | 1.6.1
com.google.flatbuffers | flatbuffers-java | 1.12.0
com.google.guava | guava | 15.0
com.google.protobuf | protobuf-java | 2.6.1
com.h2database | h2 | 2.0.204
com.helger | profiler | 1.1.1
com.jcraft | jsch | 0.1.50
com.jolbox | bonecp | 0.8.0.RELEASE
com.lihaoyi | sourcecode_2.12 | 0.1.9
com.microsoft.azure | azure-data-lake-store-sdk | 2.3.9
com.ning | compress-lzf | 1.1
com.sun.mail | javax.mail | 1.5.2
com.tdunning | json | 1.8
com.thoughtworks.paranamer | paranamer | 2.8
com.trueaccord.lenses | lenses_2.12 | 0.4.12
com.twitter | chill-java | 0.10.0
com.twitter | chill_2.12 | 0.10.0
com.twitter | util-app_2.12 | 7.1.0
com.twitter | util-core_2.12 | 7.1.0
com.twitter | util-function_2.12 | 7.1.0
com.twitter | util-jvm_2.12 | 7.1.0
com.twitter | util-lint_2.12 | 7.1.0
com.twitter | util-registry_2.12 | 7.1.0
com.twitter | util-stats_2.12 | 7.1.0
com.typesafe | config | 1.2.1
com.typesafe.scala-logging | scala-logging_2.12 | 3.7.2
com.uber | h3 | 3.7.0
com.univocity | univocity-parsers | 2.9.1
com.zaxxer | HikariCP | 4.0.3
commons-cli | commons-cli | 1.5.0
commons-codec | commons-codec | 1.15
commons-collections | commons-collections | 3.2.2
commons-dbcp | commons-dbcp | 1.4
commons-fileupload | commons-fileupload | 1.3.3
commons-httpclient | commons-httpclient | 3.1
commons-io | commons-io | 2.11.0
commons-lang | commons-lang | 2.6
commons-logging | commons-logging | 1.1.3
commons-pool | commons-pool | 1.5.4
dev.ludovic.netlib | arpack | 2.2.1
dev.ludovic.netlib | blas | 2.2.1
dev.ludovic.netlib | lapack | 2.2.1
info.ganglia.gmetric4j | gmetric4j | 1.0.10
io.airlift | aircompressor | 0.21
io.delta | delta-sharing-spark_2.12 | 0.6.2
io.dropwizard.metrics | metrics-core | 4.1.1
io.dropwizard.metrics | metrics-graphite | 4.1.1
io.dropwizard.metrics | metrics-healthchecks | 4.1.1
io.dropwizard.metrics | metrics-jetty9 | 4.1.1
io.dropwizard.metrics | metrics-jmx | 4.1.1
io.dropwizard.metrics | metrics-json | 4.1.1
io.dropwizard.metrics | metrics-jvm | 4.1.1
io.dropwizard.metrics | metrics-servlets | 4.1.1
io.netty | netty-all | 4.1.74.Final
io.netty | netty-buffer | 4.1.74.Final
io.netty | netty-codec | 4.1.74.Final
io.netty | netty-common | 4.1.74.Final
io.netty | netty-handler | 4.1.74.Final
io.netty | netty-resolver | 4.1.74.Final
io.netty | netty-tcnative-classes | 2.0.48.Final
io.netty | netty-transport | 4.1.74.Final
io.netty | netty-transport-classes-epoll | 4.1.74.Final
io.netty | netty-transport-classes-kqueue | 4.1.74.Final
io.netty | netty-transport-native-epoll-linux-aarch_64 | 4.1.74.Final
io.netty | netty-transport-native-epoll-linux-x86_64 | 4.1.74.Final
io.netty | netty-transport-native-kqueue-osx-aarch_64 | 4.1.74.Final
io.netty | netty-transport-native-kqueue-osx-x86_64 | 4.1.74.Final
io.netty | netty-transport-native-unix-common | 4.1.74.Final
io.prometheus | simpleclient | 0.7.0
io.prometheus | simpleclient_common | 0.7.0
io.prometheus | simpleclient_dropwizard | 0.7.0
io.prometheus | simpleclient_pushgateway | 0.7.0
io.prometheus | simpleclient_servlet | 0.7.0
io.prometheus.jmx | collector | 0.12.0
jakarta.annotation | jakarta.annotation-api | 1.3.5
jakarta.servlet | jakarta.servlet-api | 4.0.3
jakarta.validation | jakarta.validation-api | 2.0.2
jakarta.ws.rs | jakarta.ws.rs-api | 2.1.6
javax.activation | activation | 1.1.1
javax.el | javax.el-api | 2.2.4
javax.jdo | jdo-api | 3.0.1
javax.transaction | jta | 1.1
javax.transaction | transaction-api | 1.1
javax.xml.bind | jaxb-api | 2.2.11
javolution | javolution | 5.5.1
jline | jline | 2.14.6
joda-time | joda-time | 2.10.13
net.java.dev.jna | jna | 5.8.0
net.razorvine | pickle | 1.2
net.sf.jpam | jpam | 1.1
net.sf.opencsv | opencsv | 2.3
net.sf.supercsv | super-csv | 2.2.0
net.snowflake | snowflake-ingest-sdk | 0.9.6
net.snowflake | snowflake-jdbc | 3.13.22
net.sourceforge.f2j | arpack_combined_all | 0.1
org.acplt.remotetea | remotetea-oncrpc | 1.1.2
org.antlr | ST4 | 4.0.4
org.antlr | antlr-runtime | 3.5.2
org.antlr | antlr4-runtime | 4.8
org.antlr | stringtemplate | 3.2.1
org.apache.ant | ant | 1.9.2
org.apache.ant | ant-jsch | 1.9.2
org.apache.ant | ant-launcher | 1.9.2
org.apache.arrow | arrow-format | 7.0.0
org.apache.arrow | arrow-memory-core | 7.0.0
org.apache.arrow | arrow-memory-netty | 7.0.0
org.apache.arrow | arrow-vector | 7.0.0
org.apache.avro | avro | 1.11.0
org.apache.avro | avro-ipc | 1.11.0
org.apache.avro | avro-mapred | 1.11.0
org.apache.commons | commons-collections4 | 4.4
org.apache.commons | commons-compress | 1.21
org.apache.commons | commons-crypto | 1.1.0
org.apache.commons | commons-lang3 | 3.12.0
org.apache.commons | commons-math3 | 3.6.1
org.apache.commons | commons-text | 1.10.0
org.apache.curator | curator-client | 2.13.0
org.apache.curator | curator-framework | 2.13.0
org.apache.curator | curator-recipes | 2.13.0
org.apache.derby | derby | 10.14.2.0
org.apache.hadoop | hadoop-client-api | 3.3.4-databricks
org.apache.hadoop | hadoop-client-runtime | 3.3.4
org.apache.hive | hive-beeline | 2.3.9
org.apache.hive | hive-cli | 2.3.9
org.apache.hive | hive-jdbc | 2.3.9
org.apache.hive | hive-llap-client | 2.3.9
org.apache.hive | hive-llap-common | 2.3.9
org.apache.hive | hive-serde | 2.3.9
org.apache.hive | hive-shims | 2.3.9
org.apache.hive | hive-storage-api | 2.8.1
org.apache.hive.shims | hive-shims-0.23 | 2.3.9
org.apache.hive.shims | hive-shims-common | 2.3.9
org.apache.hive.shims | hive-shims-scheduler | 2.3.9
org.apache.httpcomponents | httpclient | 4.5.13
org.apache.httpcomponents | httpcore | 4.4.14
org.apache.ivy | ivy | 2.5.0
org.apache.logging.log4j | log4j-1.2-api | 2.18.0
org.apache.logging.log4j | log4j-api | 2.18.0
org.apache.logging.log4j | log4j-core | 2.18.0
org.apache.logging.log4j | log4j-slf4j-impl | 2.18.0
org.apache.mesos | mesos-shaded-protobuf | 1.4.0
org.apache.orc | orc-core | 1.7.6
org.apache.orc | orc-mapreduce | 1.7.6
org.apache.orc | orc-shims | 1.7.6
org.apache.parquet | parquet-column | 1.12.3-databricks-0002
org.apache.parquet | parquet-common | 1.12.3-databricks-0002
org.apache.parquet | parquet-encoding | 1.12.3-databricks-0002
org.apache.parquet | parquet-format-structures | 1.12.3-databricks-0002
org.apache.parquet | parquet-hadoop | 1.12.3-databricks-0002
org.apache.parquet | parquet-jackson | 1.12.3-databricks-0002
org.apache.thrift | libfb303 | 0.9.3
org.apache.thrift | libthrift | 0.12.0
org.apache.xbean | xbean-asm9-shaded | 4.20
org.apache.yetus | audience-annotations | 0.13.0
org.apache.zookeeper | zookeeper | 3.6.2
org.apache.zookeeper | zookeeper-jute | 3.6.2
org.checkerframework | checker-qual | 3.5.0
org.codehaus.jackson | jackson-core-asl | 1.9.13
org.codehaus.jackson | jackson-mapper-asl | 1.9.13
org.codehaus.janino | commons-compiler | 3.0.16
org.codehaus.janino | janino | 3.0.16
org.datanucleus | datanucleus-api-jdo | 4.2.4
org.datanucleus | datanucleus-core | 4.1.17
org.datanucleus | datanucleus-rdbms | 4.1.19
org.datanucleus | javax.jdo | 3.2.0-m3
org.eclipse.jetty | jetty-client | 9.4.46.v20220331
org.eclipse.jetty | jetty-continuation | 9.4.46.v20220331
org.eclipse.jetty | jetty-http | 9.4.46.v20220331
org.eclipse.jetty | jetty-io | 9.4.46.v20220331
org.eclipse.jetty | jetty-jndi | 9.4.46.v20220331
org.eclipse.jetty | jetty-plus | 9.4.46.v20220331
org.eclipse.jetty | jetty-proxy | 9.4.46.v20220331
org.eclipse.jetty | jetty-security | 9.4.46.v20220331
org.eclipse.jetty | jetty-server | 9.4.46.v20220331
org.eclipse.jetty | jetty-servlet | 9.4.46.v20220331
org.eclipse.jetty | jetty-servlets | 9.4.46.v20220331
org.eclipse.jetty | jetty-util | 9.4.46.v20220331
org.eclipse.jetty | jetty-util-ajax | 9.4.46.v20220331
org.eclipse.jetty | jetty-webapp | 9.4.46.v20220331
org.eclipse.jetty | jetty-xml | 9.4.46.v20220331
org.eclipse.jetty.websocket | websocket-api | 9.4.46.v20220331
org.eclipse.jetty.websocket | websocket-client | 9.4.46.v20220331
org.eclipse.jetty.websocket | websocket-common | 9.4.46.v20220331
org.eclipse.jetty.websocket | websocket-server | 9.4.46.v20220331
org.eclipse.jetty.websocket | websocket-servlet | 9.4.46.v20220331
org.fusesource.leveldbjni | leveldbjni-all | 1.8
org.glassfish.hk2 | hk2-api | 2.6.1
org.glassfish.hk2 | hk2-locator | 2.6.1
org.glassfish.hk2 | hk2-utils | 2.6.1
org.glassfish.hk2 | osgi-resource-locator | 1.0.3
org.glassfish.hk2.external | aopalliance-repackaged | 2.6.1
org.glassfish.hk2.external | jakarta.inject | 2.6.1
org.glassfish.jersey.containers | jersey-container-servlet | 2.36
org.glassfish.jersey.containers | jersey-container-servlet-core | 2.36
org.glassfish.jersey.core | jersey-client | 2.36
org.glassfish.jersey.core | jersey-common | 2.36
org.glassfish.jersey.core | jersey-server | 2.36
org.glassfish.jersey.inject | jersey-hk2 | 2.36
org.hibernate.validator | hibernate-validator | 6.1.0.Final
org.javassist | javassist | 3.25.0-GA
org.jboss.logging | jboss-logging | 3.3.2.Final
org.jdbi | jdbi | 2.63.1
org.jetbrains | annotations | 17.0.0
org.joda | joda-convert | 1.7
org.jodd | jodd-core | 3.5.2
org.json4s | json4s-ast_2.12 | 3.7.0-M11
org.json4s | json4s-core_2.12 | 3.7.0-M11
org.json4s | json4s-jackson_2.12 | 3.7.0-M11
org.json4s | json4s-scalap_2.12 | 3.7.0-M11
org.lz4 | lz4-java | 1.8.0
org.mariadb.jdbc | mariadb-java-client | 2.7.4
org.mlflow | mlflow-spark | 1.27.0
org.objenesis | objenesis | 2.5.1
org.postgresql | postgresql | 42.3.3
org.roaringbitmap | RoaringBitmap | 0.9.25
org.roaringbitmap | shims | 0.9.25
org.rocksdb | rocksdbjni | 6.24.2
org.rosuda.REngine | REngine | 2.1.0
org.scala-lang | scala-compiler_2.12 | 2.12.14
org.scala-lang | scala-library_2.12 | 2.12.14
org.scala-lang | scala-reflect_2.12 | 2.12.14
org.scala-lang.modules | scala-collection-compat_2.12 | 2.4.3
org.scala-lang.modules | scala-parser-combinators_2.12 | 1.1.2
org.scala-lang.modules | scala-xml_2.12 | 1.2.0
org.scala-sbt | test-interface | 1.0
org.scalacheck | scalacheck_2.12 | 1.14.2
org.scalactic | scalactic_2.12 | 3.0.8
org.scalanlp | breeze-macros_2.12 | 1.2
org.scalanlp | breeze_2.12 | 1.2
org.scalatest | scalatest_2.12 | 3.0.8
org.slf4j | jcl-over-slf4j | 1.7.36
org.slf4j | jul-to-slf4j | 1.7.36
org.slf4j | slf4j-api | 1.7.36
org.spark-project.spark | unused | 1.0.0
org.threeten | threeten-extra | 1.5.0
org.tukaani | xz | 1.9
org.typelevel | algebra_2.12 | 2.0.1
org.typelevel | cats-kernel_2.12 | 2.1.1
org.typelevel | macro-compat_2.12 | 1.1.1
org.typelevel | spire-macros_2.12 | 0.17.0
org.typelevel | spire-platform_2.12 | 0.17.0
org.typelevel | spire-util_2.12 | 0.17.0
org.typelevel | spire_2.12 | 0.17.0
org.wildfly.openssl | wildfly-openssl | 1.0.7.Final
org.xerial | sqlite-jdbc | 3.8.11.2
org.xerial.snappy | snappy-java | 1.1.8.4
org.yaml | snakeyaml | 1.24
oro | oro | 2.0.8
pl.edu.icm | JLargeArrays | 1.5
software.amazon.ion | ion-java | 1.0.2
stax | stax-api | 1.0.1
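Each entry in the Java and Scala list above gives a group ID, an artifact ID, and a version, which together form a Maven coordinate written as `group:artifact:version`. The sketch below shows one way to build those strings, for example to pin a build to the versions this runtime ships. The `maven_coordinate` helper is a hypothetical name introduced for illustration and is not part of any Databricks or Maven tooling.

```python
def maven_coordinate(group_id, artifact_id, version):
    """Join one list entry into the usual group:artifact:version form."""
    parts = (group_id, artifact_id, version)
    # Reject empty fields and stray separators so a malformed entry fails loudly.
    if any(not part or ":" in part for part in parts):
        raise ValueError(f"invalid coordinate parts: {parts!r}")
    return ":".join(parts)


# Two entries copied from the list above.
rows = [
    ("org.apache.avro", "avro", "1.11.0"),
    ("com.google.guava", "guava", "15.0"),
]
coords = [maven_coordinate(*row) for row in rows]

for coord in coords:
    print(coord)
```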