Histograms in Databricks notebooks

A histogram plots how often values occur in a dataset. It helps you see whether the values are clustered around a small number of ranges or are more spread out. A histogram is displayed as a bar chart in which each bar (also called a bin) covers a range of values, and you control the number of bins.
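As a rough illustration of what a histogram computes, the following Python sketch groups values into equal-width bins and counts how many fall into each one. The function name histogram_counts and the sample values are hypothetical, and the equal-width rule is an assumption made for the example; it is not a description of how the visualization itself computes bin boundaries.

```python
def histogram_counts(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins.

    Illustrative helper only (hypothetical name); assumes equal-width bins
    spanning the minimum to the maximum of the data.
    """
    if num_bins < 1:
        raise ValueError("num_bins must be at least 1")
    if not values:
        return [], [0] * num_bins

    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when all values are identical,
    # so the division below never divides by zero.
    width = (hi - lo) / num_bins or 1.0
    edges = [lo + i * width for i in range(num_bins + 1)]

    counts = [0] * num_bins
    for v in values:
        # The maximum value is placed in the last bin instead of a new one.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return edges, counts


# Made-up sample values, grouped into 5 bins.
sample = [55.0, 57.0, 57.0, 58.0, 59.0, 61.0, 62.0, 65.0]
edges, counts = histogram_counts(sample, 5)
print(edges)   # bin boundaries
print(counts)  # number of values in each bin
```

Increasing num_bins gives narrower bins and a more detailed but noisier picture; decreasing it smooths the shape at the cost of detail.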

The following histogram displays the values of the "table" column in the diamonds dataset, using 10 bins:

[Figure: Histogram example]

This article describes the configuration options for histograms.

General

  • X Column: Select the results column from the dataset to display.

  • Number of Bins: The number of bins (bars) into which the data is grouped.

X Axis

  • Scale: Select Automatic, Linear, or Logarithmic.

  • Name: Specify a display name for the X-axis column if different from the column name.

  • Show Labels: Whether to show X-axis labels.

  • Hide Axis: Whether to hide the X-axis labels and line.
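To show what the Linear and Logarithmic choices for Scale mean in general, the sketch below maps values to axis positions under each scale. The function axis_position is hypothetical and only illustrates the mathematical idea; how the visualization applies the scale internally, and how Automatic chooses between them, is not covered here.

```python
import math


def axis_position(value, scale="linear"):
    """Return an unscaled axis position for value (hypothetical helper)."""
    if scale == "linear":
        # Equal distances on the axis represent equal differences.
        return float(value)
    if scale == "logarithmic":
        # Equal distances on the axis represent equal ratios.
        if value <= 0:
            raise ValueError("a logarithmic scale requires positive values")
        return math.log10(value)
    raise ValueError(f"unknown scale: {scale}")


ticks = [1, 10, 100, 1000]
linear_positions = [axis_position(t) for t in ticks]
log_positions = [axis_position(t, "logarithmic") for t in ticks]
print(linear_positions)  # spacing grows tenfold at each step
print(log_positions)     # evenly spaced
```

A logarithmic scale is usually the better fit when the values span several orders of magnitude, because a linear scale would squeeze most of the bars into a small part of the axis.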

Y Axis

  • Name: Specify a display name for the Y-axis column if different from the column name.

  • Min Value, Max Value: Set minimum and maximum values for the Y-axis.

Colors

Optionally override the default color.

Data Labels

Optionally override the default formatting of data labels.