2.0.2-db1 Cluster Image

Databricks released this image in November 2016.

Important

This release has been deprecated. For more information about the Databricks Runtime deprecation policy and schedule, see Databricks Runtime Versioning and Support Lifecycle.

The following release notes provide information about the Spark 2.0.2-db1 cluster image powered by Apache Spark.

Apache Spark

The 2.0.2-db1 cluster image includes Apache Spark 2.0.2, which contains stability fixes, Apache Kafka support for Structured Streaming, and improved metrics for Structured Streaming. For more information, see the Apache Spark 2.0.2 release notes. The 2.0.2-db1 cluster image also includes the following additional bug fixes and improvements:

Changes and Improvements

Note

Starting with this version, users can enable Spark Session Isolation when creating clusters. With Spark Session Isolation, notebooks attached to the same cluster run in separate sessions, each with its own runtime configuration and current database setting.

  • Operating system upgraded to Ubuntu 16.04.1 LTS from Ubuntu 15.10.
  • Java upgraded to 1.8.0_111 from 1.8.0_66-internal.
  • Python upgraded to 2.7.12 from 2.7.10.
  • Most pre-installed Python libraries upgraded. See Pre-installed Python Libraries for the list of libraries and their versions.
  • Fixed issues with Unicode characters in Python notebooks.
  • Improved performance (better pipelining) when scanning files in S3.
  • Improved performance of queries that use percentile_approx.
  • Users can set spark.databricks.session.share to false to enable Spark Session Isolation. See the Spark Session Isolation section for details.
  • Databricks redirects executor Log4j logs to stderr. Users can view stderr on the executor page and on every worker page of the cluster. Worker pages do not show the Log4j link, and executors do not write logs to Log4j files.
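
The difference between shared and isolated sessions can be pictured with a toy model. This is only a conceptual sketch in plain Python, not Spark or Databricks code: the `Session` and `Cluster` classes and the `share_session` flag are hypothetical names invented for illustration, and it assumes that sharing is the default, as the wording of the spark.databricks.session.share setting suggests.

```python
class Session(object):
    """Toy stand-in for a session: runtime configuration plus current database."""

    def __init__(self):
        self.conf = {}
        self.current_database = "default"


class Cluster(object):
    """Toy cluster that hands a session to each attached notebook (hypothetical)."""

    def __init__(self, share_session=True):
        # Stands in for spark.databricks.session.share (False = isolation).
        self.share_session = share_session
        self._shared = Session()
        self._per_notebook = {}

    def attach(self, notebook):
        if self.share_session:
            return self._shared  # every notebook gets the same session
        if notebook not in self._per_notebook:
            self._per_notebook[notebook] = Session()  # one session per notebook
        return self._per_notebook[notebook]


# Shared sessions: a change made in one notebook is visible in the other.
shared = Cluster(share_session=True)
a, b = shared.attach("notebook_a"), shared.attach("notebook_b")
a.conf["spark.sql.shuffle.partitions"] = "8"
a.current_database = "sales"
shared_view = (b.conf.get("spark.sql.shuffle.partitions"), b.current_database)

# Isolated sessions: the second notebook keeps its own settings.
isolated = Cluster(share_session=False)
c, d = isolated.attach("notebook_a"), isolated.attach("notebook_b")
c.conf["spark.sql.shuffle.partitions"] = "8"
c.current_database = "sales"
isolated_view = (d.conf.get("spark.sql.shuffle.partitions"), d.current_database)

print(shared_view)    # ('8', 'sales')
print(isolated_view)  # (None, 'default')
```

For comparison, open-source Apache Spark exposes per-session separation through `SparkSession.newSession()`, which returns a session with its own SQL configuration while sharing the underlying SparkContext. Whether this image builds on that call is not stated in these notes.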

System Environment

  • Operating System: Ubuntu 16.04.1 LTS
  • Java: 1.8.0_111
  • Scala: 2.10.6 (Scala 2.10 cluster version) or 2.11.8 (Scala 2.11 cluster version)
  • Python: 2.7.12
  • R: 3.2.3 (2015-12-10)
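
A notebook cell could confirm that the interpreter matches the Python version listed above. The following is a minimal sketch: the helper names `parse_version` and `matches_documented_python` are hypothetical, and only the standard `sys` module is used.

```python
import sys

# Python version copied from the System Environment list.
DOCUMENTED_PYTHON = "2.7.12"


def parse_version(text):
    """Turn a dotted version string such as '2.7.12' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def matches_documented_python(version_info=None):
    """Return True when major.minor.micro equals the documented version."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:3]) == parse_version(DOCUMENTED_PYTHON)


if __name__ == "__main__":
    print("running Python %d.%d.%d" % tuple(sys.version_info[:3]))
    print("matches documented version: %s" % matches_documented_python())
```

The check compares only the three numeric components, so release-level suffixes reported by `sys.version_info` are ignored.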

Pre-installed Python Libraries

Library           Version    Library           Version    Library           Version
ansi2html         1.1.1      argparse          1.2.1      boto              2.42.0
boto3             1.4.1      botocore          1.4.70     brewer2mpl        1.4.1
certifi           2016.2.28  cffi              1.7.0      chardet           2.3.0
colorama          0.3.7      configobj         5.0.6      cryptography      1.5
cycler            0.10.0     Cython            0.24.1     decorator         4.0.10
docutils          0.12       enum34            1.1.6      et-xmlfile        1.0.1
freetype-py       1.0.2      funcsigs          1.0.2      fusepy            2.0.4
futures           3.0.5      ggplot            0.6.8      html5lib          0.999
idna              2.1        ipaddress         1.0.16     ipython           2.2.0
ipython-genutils  0.1.0      jdcal             1.2        Jinja2            2.8
jmespath          0.9.0      llvmlite          0.13.0     lxml              3.6.4
MarkupSafe        0.23       matplotlib        1.5.3      mpld3             0.2
msgpack-python    0.4.7      ndg-httpsclient   0.3.3      numba             0.28.1
numpy             1.11.1     openpyxl          2.3.2      pandas            0.18.1
pathlib2          2.1.0      patsy             0.4.1      pexpect           4.0.1
pickleshare       0.7.4      Pillow            3.3.1      pip               8.1.2
pkg_resources     0.0.0      ply               3.9        prompt-toolkit    1.0.7
psycopg2          2.6.2      ptyprocess        0.5.1      py4j              0.10.3
pyasn1            0.1.9      pycparser         2.14       Pygments          2.1.3
PyGObject         3.20.0     pyOpenSSL         16.0.0     pyparsing         2.1.4
pypng             0.0.18     Python            2.7.12     python-dateutil   2.5.3
python-geohash    0.8.5      pytz              2016.6.1   requests          2.11.1
s3transfer        0.1.9      scikit-learn      0.17.1     scipy             0.18.1
scour             0.32       seaborn           0.7.1      setuptools        28.6.0
simplejson        3.8.2      simples3          1.0        singledispatch    3.4.0.3
six               1.10.0     statsmodels       0.6.1      traitlets         4.3.0
urllib3           1.19.1     virtualenv        15.0.1     wcwidth           0.1.7
wheel             0.30.0a0   wsgiref           0.1.2
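
Each data row of the table above holds up to three library/version pairs, so it can be split mechanically and compared with what is installed on a cluster. The following is a rough sketch: `parse_library_row` and `find_mismatches` are hypothetical helpers, and the `installed` dictionary is made-up illustration data rather than output from a real cluster (in practice it might be built from `pip freeze`).

```python
def parse_library_row(row):
    """Split one 'name version name version ...' row into (name, version) pairs."""
    tokens = row.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected name/version pairs, got: %r" % row)
    return list(zip(tokens[0::2], tokens[1::2]))


def find_mismatches(documented, installed):
    """Return {name: (documented, installed_or_None)} for each differing library."""
    return {
        name: (version, installed.get(name))
        for name, version in documented.items()
        if installed.get(name) != version
    }


# Two rows copied from the table above.
rows = [
    "ansi2html 1.1.1 argparse 1.2.1 boto 2.42.0",
    "wheel 0.30.0a0 wsgiref 0.1.2",
]
documented = dict(pair for row in rows for pair in parse_library_row(row))

# Hypothetical installed versions, invented purely for illustration.
installed = {
    "ansi2html": "1.1.1",
    "argparse": "1.2.1",
    "boto": "2.49.0",
    "wheel": "0.30.0a0",
}

mismatches = find_mismatches(documented, installed)
print(mismatches)
```

With this sample data, `mismatches` reports `boto` (version differs) and `wsgiref` (missing from the illustrative `installed` mapping).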

Pre-installed R Libraries

Library           Version    Library           Version    Library           Version
abind             1.4-3      assertthat        0.1        base              3.2.3
BH                1.60.0-2   bitops            1.0-6      boot              1.3-17
brew              1.0-6      car               2.1-3      caret             6.0-71
chron             2.3-47     class             7.3-14     cluster           2.0.5
codetools         0.2-14     colorspace        1.2-4      compiler          3.2.3
crayon            1.3.1      curl              2.2        data.table        1.9.6
datasets          3.2.3      DBI               0.5-1      devtools          1.12.0
dichromat         2.0-0      digest            0.6.9      doMC              1.3.4
dplyr             0.5.0      foreach           1.4.3      foreign           0.8-66
gbm               2.1.1      ggplot2           2.1.0      git2r             0.15.0
glmnet            2.0-5      graphics          3.2.3      grDevices         3.2.3
grid              3.2.3      gsubfn            0.6-6      gtable            0.1.2
h2o               3.10.0.8   httr              1.2.1      hwriter           1.3.2
hwriterPlus       1.0-3      iterators         1.0.8      jsonlite          1.1
KernSmooth        2.23-15    labeling          0.3        lattice           0.20-34
lazyeval          0.2.0      littler           0.3.0      lme4              1.1-12
lubridate         1.6.0      magrittr          1.5        mapproj           1.2-4
maps              3.0.2      MASS              7.3-45     Matrix            1.2-7.1
MatrixModels      0.4-1      memoise           1.0.0      methods           3.2.3
mgcv              1.8-11     mime              0.5        minqa             1.2.4
multicore         0.2        munsell           0.4.2      mvtnorm           1.0-5
nlme              3.1-124    nloptr            1.0.4      nnet              7.3-12
openssl           0.9.4      parallel          3.2.3      pbkrtest          0.4-6
pkgKitten         0.1.3      plyr              1.8.4      praise            1.0.0
pROC              1.8        proto             0.3-10     quantreg          5.29
R.methodsS3       1.7.1      R.oo              1.20.0     R.utils           2.4.0
R6                2.2.0      randomForest      4.6-12     RColorBrewer      1.1-2
Rcpp              0.12.7     RcppEigen         0.3.2.9.0  RCurl             1.95-4.8
reshape2          1.4.2      RODBC             1.3-12     roxygen2          5.0.1
rpart             4.1-10     Rserve            1.7-3      RSQLite           1.0.0
rstudioapi        0.6        scales            0.3.0      sp                1.0-15
SparkR            2.0.2      SparseM           1.72       spatial           7.3-11
splines           3.2.3      sqldf             0.4-10     statmod           1.4.26
stats             3.2.3      stats4            3.2.3      stringi           1.0-1
stringr           1.0.0      survival          2.38-3     tcltk             3.2.3
TeachingDemos     2.10       testthat          1.0.2      tibble            1.2
tools             3.2.3      utils             3.2.3      whisker           0.3-2
withr             1.0.2