External data sources

Preview

This feature is in Public Preview. Contact your Databricks representative to request access.

An external data source is a computation resource that lets you run SQL commands on a set of data objects external to the Databricks environment.

SQL commands on an external data source run through a single account, so you cannot identify which user ran a query. You cannot see the status of an external data source in Databricks SQL Analytics. You must configure access to objects within the data source when you configure the external data source.

External data sources support SQL commands native to the external data source.
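As a rough illustration of what "native" means here, the sketch below writes the same "first 10 rows" request in three dialects from the supported list. It assumes a hypothetical table named `orders`, and the helper `native_query` is invented for this example; it is not part of any Databricks API. The point is only that a query must be written in the dialect of the source that receives it.

```python
# Illustrative only: the same "first 10 rows" request in three native dialects.
# The table name `orders` is hypothetical.
NATIVE_TOP_10 = {
    "PostgreSQL": "SELECT * FROM orders LIMIT 10",
    "Microsoft SQL Server": "SELECT TOP 10 * FROM orders",
    # FETCH FIRST requires Oracle Database 12c or later.
    "Oracle": "SELECT * FROM orders FETCH FIRST 10 ROWS ONLY",
}


def native_query(source_type: str) -> str:
    """Return the example query written in the dialect of the given source type."""
    try:
        return NATIVE_TOP_10[source_type]
    except KeyError:
        raise ValueError(f"no example for source type: {source_type}") from None


if __name__ == "__main__":
    for source_type in NATIVE_TOP_10:
        print(f"{source_type}: {native_query(source_type)}")
```

A query written for one source type will generally not run unchanged against another, so check the SQL reference of the specific external system.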

You must be a Databricks admin to manage external data sources.

The other type of data source is a SQL endpoint.

Note

You do not incur DBU charges to process SQL Analytics queries on external data sources. Databricks reserves the right to cap or throttle this free usage, and may begin charging for such usage in the future.

Create an external data source

  1. Click the User Settings icon at the bottom of the sidebar and select Settings.

  2. Click the External Data Sources tab.

  3. Click + New Data Source.

  4. Select a data source type and follow the configuration instructions.

  5. Click Create.

Supported external data source types

SQL Analytics supports the following external data source types:

  • Amazon Athena
  • Amazon DynamoDB
  • Amazon Redshift
  • Axibase Time Series Database
  • Cassandra
  • ClickHouse
  • CockroachDB
  • DB2 by IBM
  • Druid
  • ElasticSearch
  • Google Analytics
  • Google BigQuery
  • Google Sheets
  • Graphite
  • Hive
  • Impala
  • InfluxDB
  • JIRA
  • JSON
  • Apache Kylin
  • MemSQL
  • Microsoft Azure Synapse Analytics
  • Microsoft Azure SQL Database
  • Microsoft SQL Server
  • MongoDB
  • MySQL
  • Oracle
  • PostgreSQL
  • Presto
  • Prometheus
  • Qubole
  • Rockset
  • Salesforce
  • ScyllaDB
  • Snowflake
  • TreasureData
  • Vertica
  • Yandex AppMetrica
  • Yandex Metrica