Range join optimization

A range join occurs when two relations join using a point in interval or interval overlap condition. Using the range join optimization in Databricks Runtime can greatly improve query performance.

In Databricks SQL, Databricks automatically optimizes range joins without any manual configuration. You can also manually tune range joins using join hints or session configuration for all compute types.

Point in interval range join

A point in interval range join is a join whose condition contains predicates specifying that a value from one relation falls between two values from the other relation. For example:

SQL
-- using BETWEEN expressions
SELECT *
FROM points JOIN ranges ON points.p BETWEEN ranges.start and ranges.end;

-- using inequality expressions
SELECT *
FROM points JOIN ranges ON points.p >= ranges.start AND points.p < ranges.end;

-- with fixed length interval
SELECT *
FROM points JOIN ranges ON points.p >= ranges.start AND points.p < ranges.start + 100;

-- join two sets of point values within a fixed distance from each other
SELECT *
FROM points1 p1 JOIN points2 p2 ON p1.p >= p2.p - 10 AND p1.p <= p2.p + 10;

-- a range condition together with other join conditions
SELECT *
FROM points, ranges
WHERE points.symbol = ranges.symbol
  AND points.p >= ranges.start
  AND points.p < ranges.end;

Interval overlap range join

An interval overlap range join is a join in which the condition contains predicates specifying an overlap of intervals between two values from each relation. For example:

SQL
-- overlap of [r1.start, r1.end] with [r2.start, r2.end]
SELECT *
FROM r1 JOIN r2 ON r1.start < r2.end AND r2.start < r1.end;

-- overlap of fixed length intervals
SELECT *
FROM r1 JOIN r2 ON r1.start < r2.start + 100 AND r2.start < r1.start + 100;

-- a range condition together with other join conditions
SELECT *
FROM r1 JOIN r2 ON r1.symbol = r2.symbol
  AND r1.start <= r2.end
  AND r1.end >= r2.start;

Range join optimization

The range join optimization is performed for joins that:

Have a condition that can be interpreted as a point in interval or interval overlap range join.
All values involved in the range join condition are of a numeric type (integral, floating point, decimal), DATE, or TIMESTAMP.
All values involved in the range join condition are of the same type. In the case of the decimal type, the values also must be of the same scale and precision.
It is an INNER JOIN, or in case of point in interval range join, a LEFT OUTER JOIN with point value on the left side, or RIGHT OUTER JOIN with point value on the right side.
Have a bin size, either automatically derived or manually specified.

Joins with numeric equality and range conditions

When a join condition includes both an equality condition on a numeric column and a range condition, the optimizer may apply binning to the numeric equality column because it meets the type requirements for range join optimization. This can result in the equality column being assigned to bins or excluded from the optimization, reducing performance.

To ensure the range join optimization applies only to the intended range condition, cast the numeric equality columns to STRING. This excludes them from consideration as range condition columns.

SQL
SELECT /*+ RANGE_JOIN(reference, 3306084) */
    reference.*, position.*
FROM position
INNER JOIN reference
    ON CAST(position.parent_index AS STRING) = CAST(reference.parent_index AS STRING)
    AND position.child_index BETWEEN reference.min_child_index AND reference.max_child_index;

The same pattern applies to other numeric columns used as equality keys, such as DATE, integer identifiers, or clustered partition columns.

Bin size

The bin size is a numeric tuning parameter that splits the values domain of the range condition into multiple bins of equal size. For example, with a bin size of 10, the optimization splits the domain into bins that are intervals of length 10. If you have a point in range condition of p BETWEEN start AND end, and start is 8 and end is 22, this value interval overlaps with three bins of length 10 – the first bin from 0 to 10, the second bin from 10 to 20, and the third bin from 20 to 30. Only the points that fall within the same three bins need to be considered as possible join matches for that interval. For example, if p is 32, it can be ruled out as falling between start of 8 and end of 22, because it falls in the bin from 30 to 40.

note

For DATE values, the value of the bin size is interpreted as days. For example, a bin size value of 7 represents a week.
For TIMESTAMP values, the value of the bin size is interpreted as seconds. If a sub-second value is required, fractional values can be used. For example, a bin size value of 60 represents a minute, and a bin size value of 0.1 represents 100 milliseconds.

You can specify the bin size by using a range join hint in the query or by setting a session configuration parameter. In Databricks SQL, the bin size is automatically derived when automatic range join optimization is enabled.

Automatic range join optimization

In Databricks SQL, Databricks automatically detects qualifying range joins and derives the optimal bin size by sampling the intervals table. This removes the need to manually specify a bin size through hints or session configuration.

Automatic range join optimization is enabled by default in Databricks SQL. To disable it, set the following configuration:

SQL
SET spark.databricks.optimizer.autoRangeJoin.enabled = false;

If you specify a bin size through a range join hint or session configuration, that value overrides the automatically derived bin size.

Enable range join using a range join hint

To enable the range join optimization in a SQL query, use a range join hint to specify the bin size. The hint must contain the relation name of one of the joined relations and the numeric bin size parameter. The relation name can be a table, a view, or a subquery.

SQL
SELECT /*+ RANGE_JOIN(points, 10) */ *
FROM points JOIN ranges ON points.p >= ranges.start AND points.p < ranges.end;

SELECT /*+ RANGE_JOIN(r1, 0.1) */ *
FROM (SELECT * FROM ranges WHERE ranges.amount < 100) r1, ranges r2
WHERE r1.start < r2.start + 100 AND r2.start < r1.start + 100;

SELECT /*+ RANGE_JOIN(c, 500) */ *
FROM a
  JOIN b ON (a.b_key = b.id)
  JOIN c ON (a.ts BETWEEN c.start_time AND c.end_time)

note

In the third example, you must place the hint on c. This is because joins are left associative, so the query is interpreted as (a JOIN b) JOIN c, and the hint on a applies to the join of a with b and not the join with c.

Python
#create minute table
minutes = spark.createDataFrame(
    [(0, 60), (60, 120)],
    "minute_start: int, minute_end: int"
)

#create events table
events = spark.createDataFrame(
    [(12, 33), (0, 120), (33, 72), (65, 178)],
    "event_start: int, event_end: int"
)

#Range_Join with "hint" on the from table
(events.hint("range_join", 60)
  .join(minutes,
    on=[events.event_start < minutes.minute_end,
    minutes.minute_start < events.event_end])
  .orderBy(events.event_start,
    events.event_end,
    minutes.minute_start)
  .show()
)

#Range_Join with "hint" on the join table
(events.join(minutes.hint("range_join", 60),
  on=[events.event_start < minutes.minute_end,
    minutes.minute_start < events.event_end])
  .orderBy(events.event_start,
    events.event_end,
    minutes.minute_start)
  .show()
)

You can also place a range join hint on one of the joined DataFrames. In that case, the hint contains just the numeric bin size parameter.

Scala
val df1 = spark.table("ranges").as("left")
val df2 = spark.table("ranges").as("right")

val joined = df1.hint("range_join", 10)
  .join(df2, $"left.type" === $"right.type" &&
     $"left.end" > $"right.start" &&
     $"left.start" < $"right.end")

val joined2 = df1
  .join(df2.hint("range_join", 0.5), $"left.type" === $"right.type" &&
     $"left.end" > $"right.start" &&
     $"left.start" < $"right.end")

Enable range join using session configuration

If you don't want to modify the query, specify the bin size as a configuration parameter.

SQL
SET spark.databricks.optimizer.rangeJoin.binSize=5

This configuration parameter applies to any join with a range condition. However, a different bin size set through a range join hint always overrides the one set through the parameter.

Choose the bin size

The effectiveness of the range join optimization depends on choosing the appropriate bin size.

A small bin size results in a larger number of bins, which helps in filtering the potential matches. However, it becomes inefficient if the bin size is significantly smaller than the encountered value intervals, and the value intervals overlap multiple bin intervals. For example, with a condition p BETWEEN start AND end, where start is 1,000,000 and end is 1,999,999, and a bin size of 10, the value interval overlaps with 100,000 bins.

If the length of the interval is fairly uniform and known, we recommend that you set the bin size to the typical expected length of the value interval. However, if the length of the interval is varying and skewed, a balance must be found to set a bin size that filters the short intervals efficiently, while preventing the long intervals from overlapping too many bins. Assuming a table ranges, with intervals that are between columns start and end, you can determine different percentiles of the skewed interval length value with the following query:

SQL
SELECT
  map_from_arrays(
    ARRAY(0.5, 0.9, 0.99, 0.999, 0.9999),
    APPROX_PERCENTILE(
      end::DOUBLE - start::DOUBLE,
      ARRAY(0.5, 0.9, 0.99, 0.999, 0.9999)
    )
  ) AS bin_sizes
FROM
  ranges;

Casting each column to DOUBLE before subtracting ensures the query works whether the columns are numeric, DATE, or TIMESTAMP values.

A recommended setting of bin size would be the maximum of the value at the 90th percentile, or the value at the 99th percentile divided by 10, or the value at the 99.9th percentile divided by 100, etc. The rationale is:

If the value at the 90th percentile is the bin size, only 10% of the value interval lengths are longer than the bin interval, so span more than 2 adjacent bin intervals.
If the value at the 99th percentile is the bin size, only 1% of the value interval lengths span more than 11 adjacent bin intervals.
If the value at the 99.9th percentile is the bin size, only 0.1% of the value interval lengths span more than 101 adjacent bin intervals.
The same can be repeated for the values at the 99.99th, the 99.999th percentile, etc. if needed.

The described method limits the amount of skewed long value intervals that overlap multiple bin intervals. The bin size value obtained this way is only a starting point for fine-tuning; actual results might depend on the specific workload.

Point in interval range join​

Interval overlap range join​

Range join optimization​

Joins with numeric equality and range conditions​

Bin size​

Automatic range join optimization​

Enable range join using a range join hint​

Enable range join using session configuration​

Choose the bin size​

Point in interval range join

Interval overlap range join

Range join optimization

Joins with numeric equality and range conditions

Bin size

Automatic range join optimization

Enable range join using a range join hint

Enable range join using session configuration

Choose the bin size