ANALYZE TABLE (Databricks SQL)

The ANALYZE TABLE statement collects statistics about one specific table or all the tables in one specified database, that are to be used by the query optimizer to find a better query execution plan.

This statement only applies to Delta Lake tables.

Syntax

ANALYZE TABLE table_name [ PARTITION clause ]
    COMPUTE STATISTICS [ NOSCAN | FOR COLUMNS col1 [, ...] | FOR ALL COLUMNS ]

ANALYZE TABLES [ { FROM | IN } database_name ] COMPUTE STATISTICS [ NOSCAN ]

Parameters

  • table_name

    Identifies the table to be analyzed. The name must not include a temporal specification.

  • PARTITION clause

    Optionally limits the command to a subset of partitions.

  • [ NOSCAN | FOR COLUMNS col [, …] | FOR ALL COLUMNS ]

    If no analyze option is specified, ANALYZE TABLE collects the table’s number of rows and size in bytes.

    • NOSCAN

      Collect only the table’s size in bytes ( which does not require scanning the entire table ).

    • FOR COLUMNS col [, …] | FOR ALL COLUMNS

      Collect column statistics for each column specified, or alternatively for every column, as well as table statistics.

  • { FROM | IN } [database_name](sql-ref-names.md#database-name

    Specifies the name of the database to be analyzed. Without a database name, ANALYZE TABLES collects all tables in the current database that the current user has permission to analyze.

  • NOSCAN

    Collects only the table’s size in bytes (which does not require scanning the entire table).

  • FOR COLUMNS col [ , … ] | FOR ALL COLUMNS

    Collects column statistics for each column specified, or alternatively for every column, as well as table statistics.

If no analyze option is specified, both number of rows and size in bytes are collected.

Examples

> CREATE TABLE students (name STRING, student_id INT) PARTITIONED BY (student_id);
> INSERT INTO students PARTITION (student_id = 111111) VALUES ('Mark');
> INSERT INTO students PARTITION (student_id = 222222) VALUES ('John');

> ANALYZE TABLE students COMPUTE STATISTICS NOSCAN;

> DESC EXTENDED students;
             col_name            data_type comment
 -------------------- -------------------- -------
                 name               string    null
           student_id                  int    null
                  ...                  ...     ...
           Statistics            864 bytes
                  ...                  ...     ...

> ANALYZE TABLE students COMPUTE STATISTICS;

> DESC EXTENDED students;
             col_name            data_type comment
 -------------------- -------------------- -------
                 name               string    null
           student_id                  int    null
                  ...                  ...     ...
           Statistics    864 bytes, 2 rows
                  ...                  ...     ...

> ANALYZE TABLE students PARTITION (student_id = 111111) COMPUTE STATISTICS;

> DESC EXTENDED students PARTITION (student_id = 111111);
             col_name            data_type comment
 -------------------- -------------------- -------
                 name               string    null
           student_id                  int    null
                  ...                  ...     ...
 Partition Statistics    432 bytes, 1 rows
                  ...                  ...     ...
         OutputFormat org.apache.hadoop...

> ANALYZE TABLE students COMPUTE STATISTICS FOR COLUMNS name;

> DESC EXTENDED students name;
      info_name info_value
 -------------- ----------
       col_name       name
      data_type     string
        comment       NULL
            min       NULL
            max       NULL
      num_nulls          0
 distinct_count          2
    avg_col_len          4
    max_col_len          4
      histogram       NULL

> ANALYZE TABLES IN school_db COMPUTE STATISTICS NOSCAN;
> DESC EXTENDED teachers;
             col_name            data_type comment
 -------------------- -------------------- -------
                 name               string    null
           teacher_id                  int    null
                  ...                  ...     ...
           Statistics           1382 bytes
                  ...                  ...     ...

> DESC EXTENDED students;
             col_name            data_type comment
 -------------------- -------------------- -------
                 name               string    null
           student_id                  int    null
                  ...                  ...     ...
           Statistics            864 bytes
                  ...                  ...     ...

> ANALYZE TABLES COMPUTE STATISTICS;
> DESC EXTENDED teachers;
             col_name            data_type comment
 -------------------- -------------------- -------
                 name               string    null
           teacher_id                  int    null
                  ...                  ...     ...
           Statistics   1382 bytes, 2 rows
                  ...                  ...     ...

> DESC EXTENDED students;
             col_name            data_type comment
 -------------------- -------------------- -------
                 name               string    null
           student_id                  int    null
                  ...                  ...     ...
           Statistics    864 bytes, 2 rows
                  ...                  ...     ...