2.0.2-db1 cluster image

Databricks released this image in November 2016.

Important

This release has been deprecated. For more information about the Databricks Runtime deprecation policy and schedule, see Databricks runtime support lifecycle.

The following release notes provide information about the Spark 2.0.2-db1 cluster image powered by Apache Spark.

Apache Spark

The 2.0.2-db1 cluster image includes Apache Spark 2.0.2, which contains stability fixes, Apache Kafka support for Structured Streaming, and improved metrics for Structured Streaming. For more information, see the Apache Spark 2.0.2 release notes. The 2.0.2-db1 cluster image also includes the following additional bug fixes and improvements:

Changes and Improvements

Note

Starting with this version, users can enable Spark Session Isolation when creating clusters. With Spark Session Isolation, different notebooks attached to a cluster run in different sessions, each with its own runtime configurations and current database setting.

  • Operating system upgraded to Ubuntu 16.04.1 LTS from Ubuntu 15.10.

  • Java upgraded to 1.8.0_111 from 1.8.0_66-internal.

  • Python upgraded to 2.7.12 from 2.7.10.

  • Most pre-installed Python libraries upgraded. For the list of Python libraries and their versions, see Pre-installed Python Libraries.

  • Fixed issues around Unicode characters in Python notebooks.

  • Performance improvement (better pipelining) on scanning files in S3.

  • Performance improvement on queries using percentile_approx.

  • Users can set spark.databricks.session.share to false to enable Spark Session Isolation. With Spark Session Isolation, different notebooks attached to a cluster are in different sessions with isolated runtime configurations and current database setting. For details, see Spark Session Isolation.

  • Databricks redirects executor Log4j logs to stderr. Users can access stderr on the executor page and every worker page of the cluster. Worker pages will not show the log4j link and executors will not write logs to log4j files.

[Screenshots: executor logs to stderr (two images)]
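The effect of Spark Session Isolation described in the list above can be pictured with a small toy model. The sketch below is plain Python and does not call any Spark or Databricks API: the `Session` and `Cluster` classes, the `demo` helper, and the `sales` database name are all hypothetical, and the assumption that sessions are shared unless spark.databricks.session.share is set to false is inferred from the wording of these notes.

```python
class Session(object):
    """Toy stand-in for a Spark session: runtime conf plus current database."""

    def __init__(self):
        self.conf = {}
        self.current_database = "default"


class Cluster(object):
    """Toy cluster that hands out sessions to attached notebooks."""

    def __init__(self, spark_conf=None):
        spark_conf = spark_conf or {}
        # Assumption: sessions are shared unless the flag is set to "false".
        flag = spark_conf.get("spark.databricks.session.share", "true")
        self.share_session = str(flag).lower() != "false"
        self._shared = Session()

    def attach_notebook(self):
        # Shared mode: every notebook gets the same session object.
        # Isolated mode: each notebook gets a fresh session of its own.
        return self._shared if self.share_session else Session()


def demo(spark_conf):
    """Change settings in notebook A, then report what notebook B sees."""
    cluster = Cluster(spark_conf)
    nb_a = cluster.attach_notebook()
    nb_b = cluster.attach_notebook()
    nb_a.conf["spark.sql.shuffle.partitions"] = "8"
    nb_a.current_database = "sales"  # hypothetical database name
    return nb_b.conf.get("spark.sql.shuffle.partitions"), nb_b.current_database
```

In this model, `demo({"spark.databricks.session.share": "false"})` returns `(None, "default")`, because notebook B keeps its own configuration and current database, while `demo({})` returns `("8", "sales")`, because both notebooks share one session.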

System Environment

  • Operating System: Ubuntu 16.04.1 LTS

  • Java: 1.8.0_111

  • Scala: 2.10.6 (Scala 2.10 cluster version)/2.11.8 (Scala 2.11 cluster version)

  • Python: 2.7.12

  • R: R version 3.2.3 (2015-12-10)

Pre-installed Python Libraries

Library | Version | Library | Version | Library | Version
ansi2html | 1.1.1 | argparse | 1.2.1 | boto | 2.42.0
boto3 | 1.4.1 | botocore | 1.4.70 | brewer2mpl | 1.4.1
certifi | 2016.2.28 | cffi | 1.7.0 | chardet | 2.3.0
colorama | 0.3.7 | configobj | 5.0.6 | cryptography | 1.5
cycler | 0.10.0 | Cython | 0.24.1 | decorator | 4.0.10
docutils | 0.12 | enum34 | 1.1.6 | et-xmlfile | 1.0.1
freetype-py | 1.0.2 | funcsigs | 1.0.2 | fusepy | 2.0.4
futures | 3.0.5 | ggplot | 0.6.8 | html5lib | 0.999
idna | 2.1 | ipaddress | 1.0.16 | ipython | 2.2.0
ipython-genutils | 0.1.0 | jdcal | 1.2 | Jinja2 | 2.8
jmespath | 0.9.0 | llvmlite | 0.13.0 | lxml | 3.6.4
MarkupSafe | 0.23 | matplotlib | 1.5.3 | mpld3 | 0.2
msgpack-python | 0.4.7 | ndg-httpsclient | 0.3.3 | numba | 0.28.1
numpy | 1.11.1 | openpyxl | 2.3.2 | pandas | 0.18.1
pathlib2 | 2.1.0 | patsy | 0.4.1 | pexpect | 4.0.1
pickleshare | 0.7.4 | Pillow | 3.3.1 | pip | 8.1.2
pkg_resources | 0.0.0 | ply | 3.9 | prompt-toolkit | 1.0.7
psycopg2 | 2.6.2 | ptyprocess | 0.5.1 | py4j | 0.10.3
pyasn1 | 0.1.9 | pycparser | 2.14 | Pygments | 2.1.3
PyGObject | 3.20.0 | pyOpenSSL | 16.0.0 | pyparsing | 2.1.4
pypng | 0.0.18 | Python | 2.7.12 | python-dateutil | 2.5.3
python-geohash | 0.8.5 | pytz | 2016.6.1 | requests | 2.11.1
s3transfer | 0.1.9 | scikit-learn | 0.17.1 | scipy | 0.18.1
scour | 0.32 | seaborn | 0.7.1 | setuptools | 28.6.0
simplejson | 3.8.2 | simples3 | 1.0 | singledispatch | 3.4.0.3
six | 1.10.0 | statsmodels | 0.6.1 | traitlets | 4.3.0
urllib3 | 1.19.1 | virtualenv | 15.0.1 | wcwidth | 0.1.7
wheel | 0.30.0a0 | wsgiref | 0.1.2

Pre-installed R Libraries

Library | Version | Library | Version | Library | Version
abind | 1.4-3 | assertthat | 0.1 | base | 3.2.3
BH | 1.60.0-2 | bitops | 1.0-6 | boot | 1.3-17
brew | 1.0-6 | car | 2.1-3 | caret | 6.0-71
chron | 2.3-47 | class | 7.3-14 | cluster | 2.0.5
codetools | 0.2-14 | colorspace | 1.2-4 | compiler | 3.2.3
crayon | 1.3.1 | curl | 2.2 | data.table | 1.9.6
datasets | 3.2.3 | DBI | 0.5-1 | devtools | 1.12.0
dichromat | 2.0-0 | digest | 0.6.9 | doMC | 1.3.4
dplyr | 0.5.0 | foreach | 1.4.3 | foreign | 0.8-66
gbm | 2.1.1 | ggplot2 | 2.1.0 | git2r | 0.15.0
glmnet | 2.0-5 | graphics | 3.2.3 | grDevices | 3.2.3
grid | 3.2.3 | gsubfn | 0.6-6 | gtable | 0.1.2
h2o | 3.10.0.8 | httr | 1.2.1 | hwriter | 1.3.2
hwriterPlus | 1.0-3 | iterators | 1.0.8 | jsonlite | 1.1
KernSmooth | 2.23-15 | labeling | 0.3 | lattice | 0.20-34
lazyeval | 0.2.0 | littler | 0.3.0 | lme4 | 1.1-12
lubridate | 1.6.0 | magrittr | 1.5 | mapproj | 1.2-4
maps | 3.0.2 | MASS | 7.3-45 | Matrix | 1.2-7.1
MatrixModels | 0.4-1 | memoise | 1.0.0 | methods | 3.2.3
mgcv | 1.8-11 | mime | 0.5 | minqa | 1.2.4
multicore | 0.2 | munsell | 0.4.2 | mvtnorm | 1.0-5
nlme | 3.1-124 | nloptr | 1.0.4 | nnet | 7.3-12
openssl | 0.9.4 | parallel | 3.2.3 | pbkrtest | 0.4-6
pkgKitten | 0.1.3 | plyr | 1.8.4 | praise | 1.0.0
pROC | 1.8 | proto | 0.3-10 | quantreg | 5.29
R.methodsS3 | 1.7.1 | R.oo | 1.20.0 | R.utils | 2.4.0
R6 | 2.2.0 | randomForest | 4.6-12 | RColorBrewer | 1.1-2
Rcpp | 0.12.7 | RcppEigen | 0.3.2.9.0 | RCurl | 1.95-4.8
reshape2 | 1.4.2 | RODBC | 1.3-12 | roxygen2 | 5.0.1
rpart | 4.1-10 | Rserve | 1.7-3 | RSQLite | 1.0.0
rstudioapi | 0.6 | scales | 0.3.0 | sp | 1.0-15
SparkR | 2.0.2 | SparseM | 1.72 | spatial | 7.3-11
splines | 3.2.3 | sqldf | 0.4-10 | statmod | 1.4.26
stats | 3.2.3 | stats4 | 3.2.3 | stringi | 1.0-1
stringr | 1.0.0 | survival | 2.38-3 | tcltk | 3.2.3
TeachingDemos | 2.10 | testthat | 1.0.2 | tibble | 1.2
tools | 3.2.3 | utils | 3.2.3 | whisker | 0.3-2
withr | 1.0.2