Databricks supports a number of visualizations out of the box. All notebooks, regardless of their language, support Databricks visualization using the display function.

Additionally, all Databricks programming language notebooks (Python, Scala, R) support interactive HTML graphics using JavaScript libraries such as D3. You can pass any HTML, CSS, or JavaScript code to the displayHTML function to render its results. See HTML, D3, and SVG in Notebooks for more information.
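As a rough illustration of passing markup to displayHTML, the following Python sketch builds a small inline SVG bar chart as a string and renders it. The helper name bar_chart_svg and the three sample values are made up for this example; only displayHTML comes from Databricks. Outside a notebook, displayHTML is undefined, so the sketch falls back to printing the markup.

```python
from html import escape


def bar_chart_svg(values, bar_width=40, gap=10, height=120):
    """Build an SVG bar chart (as a string) from (label, value) pairs.

    Hypothetical helper for illustration; it is not part of Databricks.
    """
    peak = max(v for _, v in values)
    parts = []
    for i, (label, v) in enumerate(values):
        bar_h = round(v / peak * (height - 20))  # reserve 20px for labels
        x = i * (bar_width + gap)
        parts.append(
            f'<rect x="{x}" y="{height - 20 - bar_h}" '
            f'width="{bar_width}" height="{bar_h}" fill="steelblue"/>'
        )
        parts.append(
            f'<text x="{x + bar_width / 2}" y="{height - 5}" '
            f'text-anchor="middle" font-size="12">{escape(label)}</text>'
        )
    width = len(values) * (bar_width + gap)
    body = "".join(parts)
    return f'<svg width="{width}" height="{height}">{body}</svg>'


# Illustrative values only, not taken from any dataset.
html = bar_chart_svg([("a", 3), ("b", 7), ("c", 5)])

try:
    displayHTML(html)  # available in Databricks notebooks
except NameError:
    print(html)  # fallback when run outside a notebook
```

Building the markup as a plain string keeps the example independent of any JavaScript library. The same string could just as well include a script tag that loads D3.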

display function

The easiest way to create a visualization in Databricks is to call display(<dataframe-name>). For example, if you have a DataFrame diamonds_color that groups a diamonds dataset by diamond color and computes the average price, and you call display(diamonds_color), a table of diamond color versus average price displays.


Click the bar chart icon to display a chart of the same information.



If you see OK with no rendering after calling the display function, most likely the DataFrame or collection you passed in is empty.
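One way to rule out an empty DataFrame before calling display is to fetch a single row first. The sketch below assumes a PySpark DataFrame; the helper name is_empty is hypothetical, and it does not cover plain collections.

```python
def is_empty(df):
    """Return True when a Spark DataFrame has no rows.

    take(1) fetches at most one row, so this avoids a full count().
    Hypothetical helper for illustration.
    """
    return len(df.take(1)) == 0


# Intended use in a notebook cell (display is Databricks-only):
# if is_empty(diamonds_color):
#     print("Nothing to display: the DataFrame is empty")
# else:
#     display(diamonds_color)
```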

You can click the down arrow next to the bar chart icon to choose another chart type, and click Plot Options... to configure the chart.


If you register a DataFrame as a table, you can also query it with SQL and visualize the result; see Visualizations in SQL.

Visualizations in Python

To create a visualization in Python, call display(<dataframe-name>).

dataPath = "/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv"
# Read the diamonds dataset and create a DataFrame
diamonds = spark.read.format("csv").option("header", "true").option("inferSchema", "true").load(dataPath)

# Group by color and compute the average price
diamonds_color = diamonds.groupBy("color").avg("price")
display(diamonds_color)

You can also display matplotlib and ggplot figures in Databricks. For a demonstration, see Matplotlib and ggplot in Python Notebooks.

Visualizations in R

In addition to the Databricks visualizations, R notebooks can use any R visualization package. The R notebook will capture the resulting plot as a .png and display it inline.

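Here’s an example using the default plotting library:

fit <- lm(Petal.Length ~., data = iris) # fit a linear model on the built-in iris data
layout(matrix(c(1,2,3,4),2,2)) # optional 4 graphs/page
plot(fit) # draw the diagnostic plots for the model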

Using ggplot:

library(ggplot2) # also provides the diamonds dataset
ggplot(diamonds, aes(carat, price, color = color, group = 1)) + geom_point(alpha = 0.3) + stat_smooth()

Using Lattice:

library(lattice)
xyplot(price ~ carat | cut, diamonds, scales = list(log = TRUE), type = c("p", "g", "smooth"), ylab = "Log price")

You can also install and use other plotting libraries.

install.packages("DandEFA", repos = "")
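library(DandEFA)
data(timss2011)
timss2011 <- na.omit(timss2011)
dandpal <- rev(rainbow(100, start = 0, end = 0.2))
facl <- factload(timss2011,nfac=5,method="prax",cormeth="spearman")
dandelion(facl,bound=0,mcex=c(1,1.2),palet=dandpal)
facl <- factload(timss2011,nfac=8,method="mle",cormeth="pearson")
dandelion(facl,bound=0,mcex=c(1,1.2),palet=dandpal)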

Visualizations in Scala

The easiest way to perform plotting in Scala is to use the built-in Databricks visualization modules and the display method. For example:

case class MyCaseClass(key: String, group: String, value: Int)

// Build a small DataFrame from a local collection
val dataframe = sc.parallelize(Array(MyCaseClass("f", "consonants", 1),
       MyCaseClass("g", "consonants", 2),
       MyCaseClass("h", "consonants", 3),
       MyCaseClass("i", "vowels", 4),
       MyCaseClass("j", "consonants", 5))
).toDF()

display(dataframe)


Visualizations in SQL

When you run a SQL query, Databricks automatically extracts some of the data and displays it as a table.

For example, after creating a DataFrame in Scala, you could register it as a temporary table named diamonds_table.

Then query that table with SQL:

select color, price from diamonds_table

Databricks automatically displays the color and price columns in a table. From there you can select different styles of visualization.
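The register-then-query flow can also be driven from a Python cell. The sketch below is a Python variant rather than the Scala approach the text describes: it assumes a PySpark DataFrame named diamonds and the notebook-provided spark session. The function name color_price_query is hypothetical; createOrReplaceTempView and spark.sql are standard PySpark calls.

```python
def color_price_query(spark, diamonds, table_name="diamonds_table"):
    """Register `diamonds` as a temporary view and return the query result.

    Hypothetical helper for illustration; `spark` is a SparkSession and
    `diamonds` is a PySpark DataFrame.
    """
    diamonds.createOrReplaceTempView(table_name)
    return spark.sql(f"select color, price from {table_name}")


# In a Databricks Python cell (spark and display come from the notebook):
# display(color_price_query(spark, diamonds))
```

Returning the result DataFrame instead of calling display inside the helper leaves the choice of rendering to the caller.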