SELECT clause (Databricks SQL)

Composes a result set from one or more tables. The SELECT clause can be part of a query which also includes common table expressions (CTE), set operations, and various other clauses.

Syntax

SELECT [ hints ] [ ALL | DISTINCT ] { named_expression | star_clause } [, ...]
  FROM from_item [, ...]
  [ LATERAL VIEW clause ]
  [ PIVOT clause ]
  [ WHERE clause ]
  [ GROUP BY clause ]
  [ HAVING clause]

from_item
{ table_name [ table_alias ] |
  JOIN clause |
  table_valued_function [ table_alias ] |
  VALUES clause |
  [ LATERAL ] ( query ) [ table_alias ] }

named_expression
   expression [ column_alias ]

star_clause
   [ { table_name | view_name } . ] *

Parameters

  • hints

    Hints help the Databricks SQL optimizer make better planning decisions. Databricks SQL supports hints that influence selection of join strategies and repartitioning of the data.

  • ALL

    Select all matching rows from the relation. Enabled by default.

  • DISTINCT

    Select all matching rows from the relation after removing duplicates in results.

  • named_expression

    An expression with an optional assigned name.

    • expression

      A combination of one or more values, operators, and SQL functions that evaluates to a value.

    • column_alias

      An optional column identifier naming the expression result. If no column_alias is provided Databricks SQL derives one.

  • star_clause

    A short hand to name all the referencable column in the FROM clause. The list of columns is ordered by the order of from_items and the order of columns within each from_item.

    • table_name

      If present limits the columns to be named to those in the specified referencable table.

    • view_name

      If specified limits the columns to be expanded to those in the specified referencable view.

  • from_item

    A source of input for the SELECT. One of the following:

  • table_alias

    Optionally specifies a label for the from_item. If the table_alias includes column_identifiers their number must match the number of columns in the from_item.

  • PIVOT

    Used for data perspective; you can get the aggregated values based on specific column value.

  • LATERAL VIEW

    Used in conjunction with generator functions such as EXPLODE, which generates a virtual table containing one or more rows. LATERAL VIEW applies the rows to each original output row.

  • WHERE

    Filters the result of the FROM clause based on the supplied predicates.

  • GROUP BY

    The expressions that are used to group the rows. This is used in conjunction with aggregate functions (MIN, MAX, COUNT, SUM, AVG) to group rows based on the grouping expressions and aggregate values in each group. When a FILTER clause is attached to an aggregate function, only the matching rows are passed to that function.

  • HAVING

    The predicates by which the rows produced by GROUP BY are filtered. The HAVING clause is used to filter rows after the grouping is performed. If you specify HAVING without GROUP BY, it indicates a GROUP BY without grouping expressions (global aggregate).

Select on Delta table

In addition to the standard SELECT options, Delta tables support the time travel options described in this section. For details, see Query an older snapshot of a table (time travel).

AS OF syntax

SELECT * FROM table_identifier TIMESTAMP AS OF timestamp_expression
SELECT * FROM table_identifier VERSION AS OF version
  • timestamp_expression can be any one of:
    • '2018-10-18T22:15:12.013Z', that is, a string that can be cast to a timestamp
    • cast('2018-10-18 13:36:32 CEST' as timestamp)
    • '2018-10-18', that is, a date string
    • current_timestamp() - interval 12 hours
    • date_sub(current_date(), 1)
    • Any other expression that is or can be cast to a timestamp
  • version is a long value that can be obtained from the output of DESCRIBE HISTORY table_spec.

Neither timestamp_expression nor version can be subqueries.

Example

SELECT * FROM events TIMESTAMP AS OF '2018-10-18T22:15:12.013Z'
SELECT * FROM delta.`/mnt/delta/events` VERSION AS OF 123

@ syntax

Use the @ syntax to specify the timestamp or version. The timestamp must be in yyyyMMddHHmmssSSS format. You can specify a version after @ by prepending a v to the version. For example, to query version 123 for the table events, specify events@v123.

Example

SELECT * FROM events@20190101000000000
SELECT * FROM events@v123