Databricks Runtime Maintenance Updates

This page lists maintenance updates issued for supported Databricks Runtime releases.

To add a maintenance update to an existing cluster, restart the cluster.
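
The restart can also be scripted. The following is a minimal sketch, assuming the workspace exposes the Clusters REST API restart endpoint (`POST /api/2.0/clusters/restart`) and that you authenticate with a personal access token. The host, token, and cluster ID shown are placeholders, and `build_restart_request` is a hypothetical helper name. The sketch only builds the request; sending it is left as a commented-out line.

```python
import json
import urllib.request


def build_restart_request(host: str, token: str, cluster_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to the cluster restart endpoint."""
    url = f"{host.rstrip('/')}/api/2.0/clusters/restart"
    body = json.dumps({"cluster_id": cluster_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values (assumptions): substitute your own workspace URL,
    # access token, and cluster ID.
    req = build_restart_request(
        "https://example.cloud.databricks.com",
        "dapi-EXAMPLE-TOKEN",
        "1234-567890-abcde123",
    )
    print(req.get_method(), req.full_url)
    # To actually restart the cluster, send the request:
    # urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes it easy to inspect the URL and payload before anything touches a live cluster.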

  • Databricks Runtime 5.2

    • Mar 26, 2019

      • Avoid embedding platform-dependent offsets literally in whole-stage generated code
      • [SPARK-26665][CORE] Fix a bug that can cause BlockTransferService.fetchBlockSync to hang forever.
      • [SPARK-27134][SQL] array_distinct function does not work correctly with columns containing array of array.
      • [SPARK-24669][SQL] Invalidate tables in case of DROP DATABASE CASCADE.
      • [SPARK-26572][SQL] fix aggregate codegen result evaluation.
      • Fixed a bug affecting certain PythonUDFs.
    • Feb 26, 2019

      • [SPARK-26864][SQL] Query may return incorrect result when python udf is used as a left-semi join condition.
      • [SPARK-26887][PYTHON] Create datetime.date directly instead of creating datetime64 as intermediate data.
      • Fixed a bug affecting JDBC/ODBC server.
      • Fixed a bug affecting PySpark.
      • Exclude the hidden files when building HadoopRDD.
      • Fixed a bug in Delta that caused serialization issues.
    • Feb 12, 2019

      • Fixed an issue affecting the use of Delta with ADLS Gen2 mount points.
      • Fixed an issue where the Spark low-level network protocol could break when sending large RPC error messages with encryption enabled (in a HIPAA-compliant deployment, or when spark.network.crypto.enabled is set to true).
    • Jan 30, 2019

      • Fixed a StackOverflowError when putting a skew join hint on a cached relation.
      • Fixed an inconsistency between a SQL cache’s cached RDD and its physical plan, which caused incorrect results.
      • [SPARK-26706][SQL] Fix illegalNumericPrecedence for ByteType.
      • [SPARK-26709][SQL] OptimizeMetadataOnlyQuery does not handle empty records correctly.
      • CSV/JSON data sources should avoid globbing paths when inferring schema.
      • Fixed constraint inference on Window operator.
      • Fixed an issue affecting the installation of egg libraries on clusters with table ACL enabled.
  • Databricks Runtime 5.1

    • Mar 26, 2019

      • Avoid embedding platform-dependent offsets literally in whole-stage generated code
      • Fixed a bug affecting certain PythonUDFs.
    • Feb 26, 2019

      • [SPARK-26864][SQL] Query may return incorrect result when python udf is used as a left-semi join condition.
      • Fixed a bug affecting JDBC/ODBC server.
      • Exclude the hidden files when building HadoopRDD.
    • Feb 12, 2019

      • Fixed an issue affecting the installation of egg libraries on clusters with table ACL enabled.
      • Fixed an inconsistency between a SQL cache’s cached RDD and its physical plan, which caused incorrect results.
      • [SPARK-26706][SQL] Fix illegalNumericPrecedence for ByteType.
      • [SPARK-26709][SQL] OptimizeMetadataOnlyQuery does not handle empty records correctly.
      • Fixed constraint inference on Window operator.
      • Fixed an issue where the Spark low-level network protocol could break when sending large RPC error messages with encryption enabled (in a HIPAA-compliant deployment, or when spark.network.crypto.enabled is set to true).
    • Jan 30, 2019

      • Fixed an issue that could cause df.rdd.count() with UDT to return an incorrect answer in certain cases.
      • Fixed an issue affecting installing wheelhouses.
      • [SPARK-26267]Retry when detecting incorrect offsets from Kafka.
      • Fixed a bug that affects multiple file stream sources in a streaming query.
      • Fixed a StackOverflowError when putting a skew join hint on a cached relation.
      • Fixed an inconsistency between a SQL cache’s cached RDD and its physical plan, which caused incorrect results.
    • Jan 8, 2019

      • Fixed an issue that caused the error org.apache.spark.sql.expressions.Window.rangeBetween(long,long) is not whitelisted.
      • [SPARK-26352]join reordering should not change the order of output attributes.
      • [SPARK-26366]ReplaceExceptWithFilter should consider NULL as False.
      • Stability improvement for Delta Lake.
      • Delta Lake is enabled.
      • Databricks IO Cache is enabled for the IO Cache Accelerated instance type.
  • Databricks Runtime 5.0

    • Mar 26, 2019

      • Avoid embedding platform-dependent offsets literally in whole-stage generated code
      • Fixed a bug affecting certain PythonUDFs.
    • Mar 12, 2019

      • [SPARK-26864][SQL] Query may return incorrect result when python udf is used as a left-semi join condition.
    • Feb 26, 2019

      • Fixed a bug affecting JDBC/ODBC server.
      • Exclude the hidden files when building HadoopRDD.
    • Feb 12, 2019

      • Fixed an inconsistency between a SQL cache’s cached RDD and its physical plan, which caused incorrect results.
      • [SPARK-26706][SQL] Fix illegalNumericPrecedence for ByteType.
      • [SPARK-26709][SQL] OptimizeMetadataOnlyQuery does not handle empty records correctly.
      • Fixed constraint inference on Window operator.
      • Fixed an issue where the Spark low-level network protocol could break when sending large RPC error messages with encryption enabled (in a HIPAA-compliant deployment, or when spark.network.crypto.enabled is set to true).
    • Jan 30, 2019

      • Fixed an issue that could cause df.rdd.count() with UDT to return an incorrect answer in certain cases.
      • [SPARK-26267]Retry when detecting incorrect offsets from Kafka.
      • Fixed a bug that affects multiple file stream sources in a streaming query.
      • Fixed a StackOverflowError when putting a skew join hint on a cached relation.
      • Fixed an inconsistency between a SQL cache’s cached RDD and its physical plan, which caused incorrect results.
    • Jan 8, 2019

      • Fixed an issue that caused the error org.apache.spark.sql.expressions.Window.rangeBetween(long,long) is not whitelisted.
      • [SPARK-26352]join reordering should not change the order of output attributes.
      • [SPARK-26366]ReplaceExceptWithFilter should consider NULL as False.
      • Stability improvement for Delta Lake.
      • Delta Lake is enabled.
      • Databricks IO Cache is enabled for the IO Cache Accelerated instance type.
    • Dec 18, 2018

      • [SPARK-26293]Cast exception when having Python UDF in subquery
      • Fixed an issue affecting certain queries using Join and Limit.
      • Redacted credentials from RDD names in Spark UI
    • Dec 6, 2018

      • Fixed an issue that caused incorrect query results when using orderBy followed immediately by groupBy with the group-by key as the leading part of the sort-by key.
      • Upgraded Snowflake Connector for Spark from 2.4.9.2-spark_2.4_pre_release to 2.4.10.
      • Only ignore corrupt files after one or more retries when spark.sql.files.ignoreCorruptFiles or spark.sql.files.ignoreMissingFiles flag is enabled.
      • Fixed an issue affecting certain self union queries.
      • Fixed a bug with the Thrift server where sessions were sometimes leaked when cancelled.
      • [SPARK-26307]Fixed CTAS when inserting into a partitioned table using Hive SerDe.
      • [SPARK-26147]Python UDFs in join condition fail even when using columns from only one side of join
      • [SPARK-26211]Fix InSet for binary, and struct and array with null.
      • [SPARK-26181]the hasMinMaxStats method of ColumnStatsMap is not correct.
      • Fixed an issue affecting installing Python Wheels in environments without Internet access.
    • Nov 20, 2018

      • Fixed an issue that caused a notebook to become unusable after cancelling a streaming query.
      • Fixed an issue affecting certain queries using window functions.
      • Fixed an issue affecting a stream from Delta with multiple schema changes.
      • Fixed an issue affecting certain aggregation queries with Left Semi/Anti joins.
      • Fixed an issue affecting reading timestamp columns from Redshift.
  • Databricks Runtime 4.3 (deprecated)

    • Apr 9, 2019

      • [SPARK-26665][CORE] Fix a bug that can cause BlockTransferService.fetchBlockSync to hang forever.
      • [SPARK-24669][SQL] Invalidate tables in case of DROP DATABASE CASCADE.
    • Mar 12, 2019

      • Fixed a bug affecting code generation.
      • Fixed a bug affecting Delta.
    • Feb 26, 2019

      • Fixed a bug affecting JDBC/ODBC server.
    • Feb 12, 2019

      • [SPARK-26709][SQL] OptimizeMetadataOnlyQuery does not handle empty records correctly.
      • Exclude the hidden files when building HadoopRDD.
      • Fixed Parquet Filter Conversion for IN predicate when its value is empty.
      • Fixed an issue where the Spark low-level network protocol could break when sending large RPC error messages with encryption enabled (in a HIPAA-compliant deployment, or when spark.network.crypto.enabled is set to true).
    • Jan 30, 2019

      • Fixed an issue that could cause df.rdd.count() with UDT to return an incorrect answer in certain cases.
      • Fixed an inconsistency between a SQL cache’s cached RDD and its physical plan, which caused incorrect results.
    • Jan 8, 2019

      • Fixed an issue that caused the error org.apache.spark.sql.expressions.Window.rangeBetween(long,long) is not whitelisted.
      • Redacted credentials from RDD names in Spark UI
      • [SPARK-26352]join reordering should not change the order of output attributes.
      • [SPARK-26366]ReplaceExceptWithFilter should consider NULL as False.
      • Delta Lake is enabled.
      • Databricks IO Cache is enabled for the IO Cache Accelerated instance type.
    • Dec 18, 2018

      • [SPARK-25002]Avro: revise the output record namespace.
      • Fixed an issue affecting certain queries using Join and Limit.
      • [SPARK-26307]Fixed CTAS when inserting into a partitioned table using Hive SerDe.
      • Only ignore corrupt files after one or more retries when spark.sql.files.ignoreCorruptFiles or spark.sql.files.ignoreMissingFiles flag is enabled.
      • [SPARK-26181]the hasMinMaxStats method of ColumnStatsMap is not correct.
      • Fixed an issue affecting installing Python Wheels in environments without Internet access.
      • Fixed a performance issue in query analyzer.
      • Fixed an issue in PySpark that caused DataFrame actions to fail with a “connection refused” error.
      • Fixed an issue affecting certain self union queries.
    • Nov 20, 2018

      • [SPARK-17916][SPARK-25241]Fix empty string being parsed as null when nullValue is set.
      • [SPARK-25387]Fix for NPE caused by bad CSV input.
      • Fixed an issue affecting certain aggregation queries with Left Semi/Anti joins.
      • Fixed an issue affecting reading timestamp columns from Redshift.
    • Nov 6, 2018

      • [SPARK-25741]Long URLs are not rendered properly in web UI.
      • [SPARK-25714]Fix Null Handling in the Optimizer rule BooleanSimplification.
      • Fixed an issue affecting temporary objects cleanup in SQL Data Warehouse connector.
      • [SPARK-25816]Fix attribute resolution in nested extractors.
    • Oct 9, 2018

      • Fixed a bug affecting the output of running SHOW CREATE TABLE on Delta Lake tables.
      • Fixed a bug affecting Union operation.
    • Sep 25, 2018

      • [SPARK-25368][SQL] Incorrect constraint inference returns wrong result.
      • [SPARK-25402][SQL] Null handling in BooleanSimplification.
      • Fixed NotSerializableException in Avro data source.
    • Sep 11, 2018

      • [SPARK-25214][SS] Fix the issue that Kafka v2 source may return duplicated records when failOnDataLoss=false.
      • [SPARK-24987][SS] Fix Kafka consumer leak when no new offsets for TopicPartition.
      • Filter reduction should handle null value correctly.
      • Improved stability of execution engine.
    • Aug 28, 2018

      • Fixed a bug in Delta Lake Delete command that would incorrectly delete the rows where the condition evaluates to null.
      • [SPARK-25142]Add error messages when Python worker could not open socket in _load_from_socket.
    • Aug 23, 2018

      • [SPARK-23935]mapEntry throws org.codehaus.commons.compiler.CompileException.
      • Fixed nullable map issue in Parquet reader.
      • [SPARK-25051][SQL] FixNullability should not stop on AnalysisBarrier.
      • [SPARK-25081]Fixed a bug where ShuffleExternalSorter may access a released memory page when spilling fails to allocate memory.
      • Fixed an interaction between Databricks Delta and PySpark which could cause transient read failures.
      • [SPARK-25084]“distribute by” on multiple columns (wrap in brackets) may lead to codegen issue.
      • [SPARK-25096]Loosen nullability if the cast is force-nullable.
      • Lowered the default number of threads used by the Delta Lake Optimize command, reducing memory overhead and committing data faster.
      • [SPARK-25114]Fix RecordBinaryComparator when subtraction between two words is divisible by Integer.MAX_VALUE.
      • Fixed secret manager redaction when a command partially succeeds.
  • Databricks Runtime 4.2 (deprecated)

    • Feb 26, 2019

      • Fixed a bug affecting JDBC/ODBC server.
    • Feb 12, 2019

      • [SPARK-26709][SQL] OptimizeMetadataOnlyQuery does not handle empty records correctly.
      • Exclude the hidden files when building HadoopRDD.
      • Fixed Parquet Filter Conversion for IN predicate when its value is empty.
      • Fixed an issue where the Spark low-level network protocol could break when sending large RPC error messages with encryption enabled (in a HIPAA-compliant deployment, or when spark.network.crypto.enabled is set to true).
    • Jan 30, 2019

      • Fixed an issue that could cause df.rdd.count() with UDT to return an incorrect answer in certain cases.
    • Jan 8, 2019

      • Fixed an issue that caused the error org.apache.spark.sql.expressions.Window.rangeBetween(long,long) is not whitelisted.
      • Redacted credentials from RDD names in Spark UI
      • [SPARK-26352]join reordering should not change the order of output attributes.
      • [SPARK-26366]ReplaceExceptWithFilter should consider NULL as False.
      • Delta Lake is enabled.
      • Databricks IO Cache is enabled for the IO Cache Accelerated instance type.
    • Dec 18, 2018

      • [SPARK-25002]Avro: revise the output record namespace.
      • Fixed an issue affecting certain queries using Join and Limit.
      • [SPARK-26307]Fixed CTAS when inserting into a partitioned table using Hive SerDe.
      • Only ignore corrupt files after one or more retries when spark.sql.files.ignoreCorruptFiles or spark.sql.files.ignoreMissingFiles flag is enabled.
      • [SPARK-26181]the hasMinMaxStats method of ColumnStatsMap is not correct.
      • Fixed an issue affecting installing Python Wheels in environments without Internet access.
      • Fixed a performance issue in query analyzer.
      • Fixed an issue in PySpark that caused DataFrame actions to fail with a “connection refused” error.
      • Fixed an issue affecting certain self union queries.
    • Nov 20, 2018

      • [SPARK-17916][SPARK-25241]Fix empty string being parsed as null when nullValue is set.
      • Fixed an issue affecting certain aggregation queries with Left Semi/Anti joins.
      • Fixed an issue affecting reading timestamp columns from Redshift.
    • Nov 6, 2018

      • [SPARK-25741]Long URLs are not rendered properly in web UI.
      • [SPARK-25714]Fix Null Handling in the Optimizer rule BooleanSimplification.
    • Oct 9, 2018

      • Fixed a bug affecting the output of running SHOW CREATE TABLE on Delta Lake tables.
      • Fixed a bug affecting Union operation.
    • Sep 25, 2018

      • [SPARK-25368][SQL] Incorrect constraint inference returns wrong result.
      • [SPARK-25402][SQL] Null handling in BooleanSimplification.
      • Fixed NotSerializableException in Avro data source.
    • Sep 11, 2018

      • [SPARK-25214][SS] Fix the issue that Kafka v2 source may return duplicated records when failOnDataLoss=false.
      • [SPARK-24987][SS] Fix Kafka consumer leak when no new offsets for TopicPartition.
      • Filter reduction should handle null value correctly.
    • Aug 28, 2018

      • Fixed a bug in Delta Lake Delete command that would incorrectly delete the rows where the condition evaluates to null.
    • Aug 23, 2018

      • Fixed NoClassDefError for Delta Snapshot
      • [SPARK-23935]mapEntry throws org.codehaus.commons.compiler.CompileException.
      • [SPARK-24957][SQL] Average with decimal followed by aggregation returns wrong result. AVERAGE might return incorrect results because the CAST added in the Average operator is bypassed if the result of Divide is already the type to which it is cast.
      • [SPARK-25081]Fixed a bug where ShuffleExternalSorter may access a released memory page when spilling fails to allocate memory.
      • Fixed an interaction between Databricks Delta and PySpark which could cause transient read failures.
      • [SPARK-25114]Fix RecordBinaryComparator when subtraction between two words is divisible by Integer.MAX_VALUE.
      • [SPARK-25084]“distribute by” on multiple columns (wrap in brackets) may lead to codegen issue.
      • [SPARK-24934][SQL] Explicitly whitelist supported types in upper/lower bounds for in-memory partition pruning. When complex data types are used in query filters against cached data, Spark always returns an empty result set. The in-memory stats-based pruning generates incorrect results, because null is set for upper/lower bounds for complex types. The fix is to not use in-memory stats-based pruning for complex types.
      • Fixed secret manager redaction when a command partially succeeds.
      • Fixed nullable map issue in Parquet reader.
    • Aug 2, 2018

      • Added writeStream.table API in Python.
      • Fixed an issue affecting Delta checkpointing.
      • [SPARK-24867][SQL] Add AnalysisBarrier to DataFrameWriter. SQL cache is not being used when using DataFrameWriter to write a DataFrame with UDF. This is a regression caused by the changes we made in AnalysisBarrier, since not all the Analyzer rules are idempotent.
      • Fixed an issue that could cause mergeInto command to produce incorrect results.
      • Improved stability on accessing Azure Data Lake Storage Gen1.
      • [SPARK-24809]Serializing LongHashedRelation in executor may result in data error.
      • [SPARK-24878][SQL] Fix reverse function for array type of primitive type containing null.
    • July 11, 2018

      • Fixed a bug in query execution that would cause aggregations on decimal columns with different precisions to return incorrect results in some cases.
      • Fixed a NullPointerException bug that was thrown during advanced aggregation operations like grouping sets.
  • Databricks Runtime 4.1 ML (Beta) (deprecated)

    • July 31, 2018

      • Added Azure SQL DW connector to ML Runtime 4.1
      • Fixed a bug that could cause incorrect query results when the name of a partition column used in a predicate differs from the case of that column in the schema of the table.
      • Fixed a bug affecting Spark SQL execution engine.
      • Fixed a bug affecting code generation.
      • Fixed a bug (java.lang.NoClassDefFoundError) affecting Delta Lake.
      • Improved error handling in Delta Lake.
      • Fixed a bug that caused incorrect data skipping statistics to be collected for string columns 32 characters or greater.
  • Databricks Runtime 4.1 (deprecated)

    • Jan 8, 2019

      • [SPARK-26366]ReplaceExceptWithFilter should consider NULL as False.
      • Delta Lake is enabled.
    • Dec 18, 2018

      • [SPARK-25002]Avro: revise the output record namespace.
      • Fixed an issue affecting certain queries using Join and Limit.
      • [SPARK-26307]Fixed CTAS when inserting into a partitioned table using Hive SerDe.
      • Only ignore corrupt files after one or more retries when spark.sql.files.ignoreCorruptFiles or spark.sql.files.ignoreMissingFiles flag is enabled.
      • Fixed an issue affecting installing Python Wheels in environments without Internet access.
      • Fixed an issue in PySpark that caused DataFrame actions to fail with a “connection refused” error.
      • Fixed an issue affecting certain self union queries.
    • Nov 20, 2018

      • [SPARK-17916][SPARK-25241]Fix empty string being parsed as null when nullValue is set.
      • Fixed an issue affecting certain aggregation queries with Left Semi/Anti joins.
    • Nov 6, 2018

      • [SPARK-25741]Long URLs are not rendered properly in web UI.
      • [SPARK-25714]Fix Null Handling in the Optimizer rule BooleanSimplification.
    • Oct 9, 2018

      • Fixed a bug affecting the output of running SHOW CREATE TABLE on Delta Lake tables.
      • Fixed a bug affecting Union operation.
    • Sep 25, 2018

      • [SPARK-25368][SQL] Incorrect constraint inference returns wrong result.
      • [SPARK-25402][SQL] Null handling in BooleanSimplification.
      • Fixed NotSerializableException in Avro data source.
    • Sep 11, 2018

      • [SPARK-25214][SS] Fix the issue that Kafka v2 source may return duplicated records when failOnDataLoss=false.
      • [SPARK-24987][SS] Fix Kafka consumer leak when no new offsets for TopicPartition.
      • Filter reduction should handle null value correctly.
    • Aug 28, 2018

      • Fixed a bug in Delta Lake Delete command that would incorrectly delete the rows where the condition evaluates to null.
      • [SPARK-25084]“distribute by” on multiple columns (wrap in brackets) may lead to codegen issue.
      • [SPARK-25114]Fix RecordBinaryComparator when subtraction between two words is divisible by Integer.MAX_VALUE.
    • Aug 23, 2018

      • Fixed NoClassDefError for Delta Snapshot.
      • [SPARK-24957][SQL] Average with decimal followed by aggregation returns wrong result. AVERAGE might return incorrect results because the CAST added in the Average operator is bypassed if the result of Divide is already the type to which it is cast.
      • Fixed nullable map issue in Parquet reader.
      • [SPARK-24934][SQL] Explicitly whitelist supported types in upper/lower bounds for in-memory partition pruning. When complex data types are used in query filters against cached data, Spark always returns an empty result set. The in-memory stats-based pruning generates incorrect results, because null is set for upper/lower bounds for complex types. The fix is to not use in-memory stats-based pruning for complex types.
      • [SPARK-25081]Fixed a bug where ShuffleExternalSorter may access a released memory page when spilling fails to allocate memory.
      • Fixed an interaction between Databricks Delta and PySpark which could cause transient read failures.
      • Fixed secret manager redaction when a command partially succeeds.
    • Aug 2, 2018

      • [SPARK-24613][SQL] Cache with UDF could not be matched with subsequent dependent caches. Wraps the logical plan with an AnalysisBarrier for execution plan compilation in CacheManager, in order to avoid the plan being analyzed again. This is also a regression in Spark 2.3.
      • Fixed a SQL Data Warehouse connector issue affecting timezone conversion for writing DateType data.
      • Fixed an issue affecting Delta checkpointing.
      • Fixed an issue that could cause mergeInto command to produce incorrect results.
      • [SPARK-24867][SQL] Add AnalysisBarrier to DataFrameWriter. SQL cache is not being used when using DataFrameWriter to write a DataFrame with UDF. This is a regression caused by the changes we made in AnalysisBarrier, since not all the Analyzer rules are idempotent.
      • [SPARK-24809]Serializing LongHashedRelation in executor may result in data error.
    • July 11, 2018

      • Fixed a bug in query execution that would cause aggregations on decimal columns with different precisions to return incorrect results in some cases.
      • Fixed a NullPointerException bug that was thrown during advanced aggregation operations like grouping sets.
    • June 28, 2018

      • Fixed a bug that could cause incorrect query results when the name of a partition column used in a predicate differs from the case of that column in the schema of the table.
    • May 29, 2018

      • Fixed a bug affecting Spark SQL execution engine.
      • Fixed a bug affecting code generation.
      • Fixed a bug (java.lang.NoClassDefFoundError) affecting Delta Lake.
      • Improved error handling in Delta Lake.
    • May 15, 2018

      • Fixed a bug that caused incorrect data skipping statistics to be collected for string columns 32 characters or greater.
  • Databricks Runtime 4.0 (deprecated)

    • Nov 6, 2018

      • [SPARK-25714]Fix Null Handling in the Optimizer rule BooleanSimplification.
    • Oct 9, 2018

      • Fixed a bug affecting Union operation.
    • Sep 25, 2018

      • [SPARK-25368][SQL] Incorrect constraint inference returns wrong result.
      • [SPARK-25402][SQL] Null handling in BooleanSimplification.
      • Fixed NotSerializableException in Avro data source.
    • Sep 11, 2018

      • Filter reduction should handle null value correctly.
    • Aug 28, 2018

      • Fixed a bug in Delta Lake Delete command that would incorrectly delete the rows where the condition evaluates to null.
    • Aug 23, 2018

      • Fixed nullable map issue in Parquet reader.
      • Fixed secret manager redaction when a command partially succeeds.
      • Fixed an interaction between Databricks Delta and PySpark which could cause transient read failures.
      • [SPARK-25081]Fixed a bug where ShuffleExternalSorter may access a released memory page when spilling fails to allocate memory.
      • [SPARK-25114]Fix RecordBinaryComparator when subtraction between two words is divisible by Integer.MAX_VALUE.
    • Aug 2, 2018

      • [SPARK-24452]Avoid possible overflow in int add or multiply.
      • [SPARK-24588]Streaming join should require HashClusteredPartitioning from children.
      • Fixed an issue that could cause mergeInto command to produce incorrect results.
      • [SPARK-24867][SQL] Add AnalysisBarrier to DataFrameWriter. SQL cache is not being used when using DataFrameWriter to write a DataFrame with UDF. This is a regression caused by the changes we made in AnalysisBarrier, since not all the Analyzer rules are idempotent.
      • [SPARK-24809]Serializing LongHashedRelation in executor may result in data error.
    • June 28, 2018

      • Fixed a bug that could cause incorrect query results when the name of a partition column used in a predicate differs from the case of that column in the schema of the table.
    • May 31, 2018

      • Fixed a bug affecting Spark SQL execution engine.
      • Improved error handling in Delta Lake.
    • May 17, 2018

      • Bug fixes for Databricks secret management.
      • Improved stability on reading data stored in Azure Data Lake Store.
      • Fixed a bug affecting RDD caching.
      • Fixed a bug affecting Null-safe Equal in Spark SQL.
    • Apr 24, 2018

      • Upgraded Azure Data Lake Store SDK from 2.0.11 to 2.2.8 to improve the stability of access to Azure Data Lake Store.
      • Fixed a bug affecting the insertion of overwrites to partitioned Hive tables when spark.databricks.io.hive.fastwriter.enabled is false.
      • Fixed an issue that caused task serialization to fail.
      • Improved Delta Lake stability.
    • Mar 14, 2018

      • Prevent unnecessary metadata updates when writing into Delta Lake.
      • Fixed an issue caused by a race condition that could, in rare circumstances, lead to loss of some output files.
  • Databricks Runtime 3.5-LTS

    • Apr 9, 2019

      • [SPARK-26665][CORE] Fix a bug that can cause BlockTransferService.fetchBlockSync to hang forever.
    • Feb 12, 2019

      • Fixed an issue where the Spark low-level network protocol could break when sending large RPC error messages with encryption enabled (in a HIPAA-compliant deployment, or when spark.network.crypto.enabled is set to true).
    • Jan 30, 2019

      • Fixed an issue that could cause df.rdd.count() with UDT to return an incorrect answer in certain cases.
    • Dec 18, 2018

      • Only ignore corrupt files after one or more retries when spark.sql.files.ignoreCorruptFiles or spark.sql.files.ignoreMissingFiles flag is enabled.
      • Fixed an issue affecting certain self union queries.
    • Nov 20, 2018

    • Nov 6, 2018

      • [SPARK-25714]Fix Null Handling in the Optimizer rule BooleanSimplification.
    • Oct 9, 2018

      • Fixed a bug affecting Union operation.
    • Sep 25, 2018

      • [SPARK-25402][SQL] Null handling in BooleanSimplification.
      • Fixed NotSerializableException in Avro data source.
    • Sep 11, 2018

      • Filter reduction should handle null value correctly.
    • Aug 28, 2018

      • Fixed a bug in Delta Lake Delete command that would incorrectly delete the rows where the condition evaluates to null.
      • [SPARK-25114]Fix RecordBinaryComparator when subtraction between two words is divisible by Integer.MAX_VALUE.
    • Aug 23, 2018

      • [SPARK-24809]Serializing LongHashedRelation in executor may result in data error.
      • Fixed nullable map issue in Parquet reader.
      • [SPARK-25081]Fixed a bug where ShuffleExternalSorter may access a released memory page when spilling fails to allocate memory.
      • Fixed an interaction between Databricks Delta and PySpark which could cause transient read failures.
    • June 28, 2018

      • Fixed a bug that could cause incorrect query results when the name of a partition column used in a predicate differs from the case of that column in the schema of the table.
    • May 31, 2018

      • Fixed a bug affecting Spark SQL execution engine.
      • Improved error handling in Delta Lake.
    • May 17, 2018

      • Improved stability on reading data stored in Azure Data Lake Store.
      • Fixed a bug affecting RDD caching.
      • Fixed a bug affecting Null-safe Equal in Spark SQL.
      • Fixed a bug affecting certain aggregations in streaming queries.
    • Apr 24, 2018

      • Upgraded Azure Data Lake Store SDK from 2.0.11 to 2.2.8 to improve the stability of access to Azure Data Lake Store.
      • Fixed a bug affecting the insertion of overwrites to partitioned Hive tables when spark.databricks.io.hive.fastwriter.enabled is false.
      • Fixed an issue that caused task serialization to fail.
    • Mar 09, 2018

      • Fixed an issue caused by a race condition that could, in rare circumstances, lead to loss of some output files.
    • Mar 01, 2018

      • Improved the efficiency of handling streams that can take a long time to stop.
      • Fixed an issue affecting Python autocomplete.
      • Applied Ubuntu security patches.
      • Fixed an issue affecting certain queries using Python UDFs and window functions.
      • Fixed an issue affecting the use of UDFs on a cluster with table access control enabled.
    • Jan 29, 2018

      • Fixed an issue affecting the manipulation of tables stored in Azure Blob Storage.
      • Fixed aggregation after dropDuplicates on empty DataFrame.
  • Databricks Runtime 3.4 (deprecated)

    • May 31, 2018

      • Fixed a bug affecting Spark SQL execution engine.
      • Improved error handling in Delta Lake.
    • May 17, 2018

      • Improved stability on reading data stored in Azure Data Lake Store.
      • Fixed a bug affecting RDD caching.
      • Fixed a bug affecting Null-safe Equal in Spark SQL.
    • Apr 24, 2018

      • Fixed a bug affecting the insertion of overwrites to partitioned Hive tables when spark.databricks.io.hive.fastwriter.enabled is false.
    • Mar 09, 2018

      • Fixed an issue caused by a race condition that could, in rare circumstances, lead to loss of some output files.
    • Dec 13, 2017

      • Fixed an issue affecting UDFs in Scala.
      • Fixed an issue affecting the use of Data Skipping Index on data source tables stored in non-DBFS paths.
    • Dec 07, 2017

      • Improved shuffle stability.
  • Databricks Runtime 3.3 (deprecated)

    • May 31, 2018

      • Fixed a bug affecting Spark SQL execution engine.
    • Apr 24, 2018

      • Fixed a bug affecting the insertion of overwrites to partitioned Hive tables when spark.databricks.io.hive.fastwriter.enabled is false.
    • Mar 12, 2018

      • Fixed an issue caused by a race condition that could, in rare circumstances, lead to loss of some output files.
    • Jan 29, 2018

      • Fixed an issue affecting UDFs in Scala.
    • Oct 11, 2017

      • Improved shuffle stability.
  • Databricks Runtime 3.2 (deprecated)

    • Mar 30, 2018

      • Fixed an issue caused by a race condition that could, in rare circumstances, lead to loss of some output files.
    • Sep 13, 2017

      • Fixed an issue affecting the use of spark_submit_task with Databricks jobs.
    • Sep 06, 2017

      • Fixed an issue affecting the performance of certain window functions.
  • 2.1.1-db6 Cluster Image (deprecated)

    • May 31, 2018

      • Fixed a bug affecting Spark SQL execution engine.
    • Mar 30, 2018

      • Fixed an issue caused by a race condition that could, in rare circumstances, lead to loss of some output files.
  • 2.1.1-db4 Cluster Image (deprecated)

    • May 31, 2018

      • Fixed a bug affecting Spark SQL execution engine.
    • Mar 30, 2018

      • Fixed an issue caused by a race condition that could, in rare circumstances, lead to loss of some output files.