Warehouse events system table reference

Preview

This feature is in Public Preview.

In this article, you learn how to use the warehouse events system table to monitor and manage the SQL warehouses in your workspaces. This table records a row for every time a warehouse starts, stops, runs, and scales up and down. You can use the sample queries in this article with alerts to keep you informed of changes to your warehouses.

The warehouse events system table is located at system.compute.warehouse_events.

Logged warehouse event types

This system table logs the following types of events:

  • SCALED_UP: A new cluster was added to the warehouse.

  • SCALED_DOWN: A cluster was removed from the warehouse.

  • STOPPING: The warehouse is in the process of stopping.

  • RUNNING: The warehouse is actively running.

  • STARTING: The warehouse is in the process of starting up.

  • STOPPED: The warehouse has completely stopped running.

Warehouse events schema

The warehouse_events system table uses the following schema:

Column name

Data type

Description

Example

account_id

string

The ID of the Databricks account.

7af234db-66d7-4db3-bbf0-956098224879

workspace_id

string

The ID of the workspace where the warehouse is deployed.

123456789012345

warehouse_id

string

The ID of SQL warehouse the event is related to.

123456789012345

event_type

string

The type of warehouse event. Possible values are SCALED_UP, SCALED_DOWN, STOPPING, RUNNING, STARTING, and STOPPED.

SCALED_UP

cluster_count

integer

The number of clusters that are actively running.

2

event_time

timestamp

Timestamp of when the event took place.

2023-07-20T19:13:09.504Z

Sample queries

The following sample queries are templates. Plug in whatever values make sense for your organization. You can also add alerts to these queries to help you stay informed about changes to your warehouses. See Create an alert.

Use the following sample queries to gain insight into warehouse behavior:

Which warehouses are actively running and for how long?

This query identifies which warehouses are currently active along with their running time in hours.

USE CATALOG `system`;

SELECT
we.warehouse_id,
we.event_time,
TIMESTAMPDIFF(MINUTE, we.event_time, CURRENT_TIMESTAMP()) / 60.0 AS running_hours,
we.cluster_count
FROM
compute.warehouse_events we
WHERE
we.event_type = 'RUNNING'
AND NOT EXISTS (
SELECT 1
FROM compute.warehouse_events we2
WHERE we2.warehouse_id = we.warehouse_id
AND we2.event_time > we.event_time
)

Alert opportunity: As a workspace admin you might want to be alerted if a warehouse is running longer than expected. For example, you can use the query results to set an alert condition to trigger when the running hours exceed a certain threshold.

Identify warehouses that are upscaled longer than expected

This query identifies which warehouses are currently active along with their running time in hours.

use catalog `system`;

SELECT
   we.warehouse_id,
   we.event_time,
   TIMESTAMPDIFF(MINUTE, we.event_time, CURRENT_TIMESTAMP()) / 60.0 AS upscaled_hours,
   we.cluster_count
FROM
   compute.warehouse_events we
WHERE
   we.event_type = 'SCALED_UP'
   AND we.cluster_count >= 2
   AND NOT EXISTS (
       SELECT 1
       FROM compute.warehouse_events we2
       WHERE we2.warehouse_id = we.warehouse_id
       AND (
           (we2.event_type = 'SCALED_DOWN') OR
           (we2.event_type = 'SCALED_UP' AND we2.cluster_count < 2)
       )
       AND we2.event_time > we.event_time
   )

Alert opportunity: Alerting on this condition can help you monitor resources and cost. You could set an alert for when the upscaled hours exceed a certain limit.

Warehouses that start for the first time

This query informs you about new warehouses that are starting for the first time.

use catalog `system`;

SELECT
   we.warehouse_id,
   we.event_time,
   we.cluster_count
FROM
   compute.warehouse_events we
WHERE
   (we.event_type = 'STARTING' OR we.event_type = 'RUNNING')
   AND NOT EXISTS (
       SELECT 1
       FROM compute.warehouse_events we2
       WHERE we2.warehouse_id = we.warehouse_id
       AND we2.event_time < we.event_time
   )

Alert opportunity: Alerting on new warehouses can help your organization track resource allocation. For example, you could set an alert that’s triggered every time a new warehouse starts.

Investigate billing charges

If you want to understand specifically what a warehouse was doing to generate billing charges, this query can tell you the exact dates and times the warehouse scaled up or down, or started and stopped.

use catalog `system`;

SELECT
    we.warehouse_id AS warehouse_id,
    we.event_type AS event,
    we.event_time AS event_time,
    we.cluster_count AS cluster_count
FROM
    compute.warehouse_events AS we
WHERE
    we.event_type IN (
        'STARTING', 'RUNNING', 'STOPPING', 'STOPPED',
        'SCALING_UP', 'SCALED_UP', 'SCALING_DOWN', 'SCALED_DOWN'
    )
    AND MONTH(we.event_time) = 7
    AND YEAR(we.event_time) = YEAR(CURRENT_DATE())
    AND we.warehouse_id = '19c9d68652189278'
ORDER BY
    event_time DESC

Which warehouses haven’t been used in the last 30 days?

This query helps you identify unused resources, providing an opportunity for cost optimization.

use catalog `system`;

SELECT
    we.warehouse_id,
    we.event_time,
    we.event_type,
    we.cluster_count
FROM
    compute.warehouse_events AS we
WHERE
    we.warehouse_id IN (
        SELECT DISTINCT
            warehouse_id
        FROM
            compute.warehouse_events
        WHERE
            MONTH(event_time) = 6
            AND YEAR(event_time) = YEAR(CURRENT_DATE())
    )
    AND we.warehouse_id NOT IN (
        SELECT DISTINCT
            warehouse_id
        FROM
            compute.warehouse_events
        WHERE
            MONTH(event_time) = 7
            AND YEAR(event_time) = YEAR(CURRENT_DATE())
    )
ORDER BY
    event_time DESC

Alert opportunity: Receiving an alert on unused resources could help your organization optimize costs. For example, you could set an alert that’s triggered when the query detects an unused warehouse.

Warehouses with the most uptime in a month

This query shows which warehouses have been used the most during a specific month. This query uses July as an example.

use catalog `system`;

SELECT
   warehouse_id,
   SUM(TIMESTAMPDIFF(MINUTE, start_time, end_time)) / 60.0 AS uptime_hours
FROM (
   SELECT
      starting.warehouse_id,
      starting.event_time AS start_time,
      (
         SELECT
            MIN(stopping.event_time)
         FROM
            compute.warehouse_events AS stopping
         WHERE
            stopping.warehouse_id = starting.warehouse_id
            AND stopping.event_type = 'STOPPED'
            AND stopping.event_time > starting.event_time
      ) AS end_time
   FROM
      compute.warehouse_events AS starting
   WHERE
      starting.event_type = 'STARTING'
      AND MONTH(starting.event_time) = 7
      AND YEAR(starting.event_time) = YEAR(CURRENT_DATE())
) AS warehouse_uptime
WHERE
   end_time IS NOT NULL
GROUP BY
   warehouse_id
ORDER BY
   uptime_hours DESC

Alert opportunity: You might want to keep track of high-utilization warehouses. For example, you could set an alert that’s triggered when the uptime hours for a warehouse exceed a specific threshold.

Warehouses that spent the most time upscaled during a month

This query informs you about warehouses that have spent significant time in the upscaled state during a month. This query uses July as an example.

use catalog `system`;

SELECT
   warehouse_id,
   SUM(TIMESTAMPDIFF(MINUTE, upscaled_time, downscaled_time)) / 60.0 AS upscaled_hours
FROM (
   SELECT
      upscaled.warehouse_id,
      upscaled.event_time AS upscaled_time,
      (
         SELECT
            MIN(downscaled.event_time)
         FROM
            compute.warehouse_events AS downscaled
         WHERE
            downscaled.warehouse_id = upscaled.warehouse_id
            AND (downscaled.event_type = 'SCALED_DOWN' OR downscaled.event_type = 'STOPPED')
            AND downscaled.event_time > upscaled.event_time
      ) AS downscaled_time
   FROM
      compute.warehouse_events AS upscaled
   WHERE
      upscaled.event_type = 'SCALED_UP'
      AND upscaled.cluster_count >= 2
      AND MONTH(upscaled.event_time) = 7
      AND YEAR(upscaled.event_time) = YEAR(CURRENT_DATE())
) AS warehouse_upscaled
WHERE
   downscaled_time IS NOT NULL
GROUP BY
   warehouse_id
ORDER BY
   upscaled_hours DESC limit 0;

Alert opportunity: You might want to keep track of high-utilization warehouses. For example, you could set an alert that’s triggered when the uptime hours for a warehouse exceed a specific threshold.

Warehouse running uptime by date

This query informs of the total number of minutes that warehouses were running, organized by date. This query can be used as the base for forecasting.

use catalog `system`;

SELECT
   warehouse_id,
   SUM(TIMESTAMPDIFF(MINUTE, upscaled_time, downscaled_time)) / 60.0 AS upscaled_hours
FROM (
   SELECT
      upscaled.warehouse_id,
      upscaled.event_time AS upscaled_time,
      (
         SELECT
            MIN(downscaled.event_time)
         FROM
            compute.warehouse_events AS downscaled
         WHERE
            downscaled.warehouse_id = upscaled.warehouse_id
            AND (downscaled.event_type = 'SCALED_DOWN' OR downscaled.event_type = 'STOPPED')
            AND downscaled.event_time > upscaled.event_time
      ) AS downscaled_time
   FROM
      compute.warehouse_events AS upscaled
   WHERE
      upscaled.event_type = 'SCALED_UP'
      AND upscaled.cluster_count >= 2
      AND MONTH(upscaled.event_time) = 7
      AND YEAR(upscaled.event_time) = YEAR(CURRENT_DATE())
) AS warehouse_upscaled
WHERE
   downscaled_time IS NOT NULL
GROUP BY
   warehouse_id
ORDER BY
   upscaled_hours DESC limit 0;