This article describes how Databricks SQL administrators configure a new workspace for access to data objects.
- If you are using Databricks managed tables, you do not need to configure access to cloud storage.
- Databricks SQL endpoints all share the same cloud storage access credentials.
To configure data access for Databricks SQL, follow the steps in this section:
Databricks recommends setting up a new instance profile with access to all S3 buckets that should be accessed from Databricks SQL.
A Databricks administrator performs one of the following steps in the AWS Console:
- (Optional) Create an instance profile to access an S3 bucket. If you want to reuse an existing instance profile, you can skip this step.
- If you are reusing an instance profile, copy the ARN from IAM Service > Roles in the roles summary of the role you want to reuse.
A Databricks administrator performs the following steps in the AWS Console:
- Create a bucket policy for each target S3 bucket. Repeat this step for every bucket you want to access from Databricks SQL.
- Note the IAM role used to create the Databricks deployment.
- Add the S3 IAM role to the EC2 policy.
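The policy documents involved in these steps can be sketched as follows. This is a minimal illustration, not the exact policy Databricks requires: the role ARN, bucket names, and the list of S3 actions are placeholder assumptions, and the helper names `bucket_policy` and `pass_role_statement` are hypothetical. Adjust the actions to the access level you actually need before pasting the generated JSON into the AWS Console.

```python
import json


def bucket_policy(bucket: str, role_arn: str) -> dict:
    """Build a bucket policy that grants one IAM role access to one S3 bucket."""
    principal = {"AWS": role_arn}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Effect": "Allow",
                "Principal": principal,
                "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object-level actions apply to the keys inside the bucket.
                # Assumed action list; drop the write actions for read-only access.
                "Effect": "Allow",
                "Principal": principal,
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


def pass_role_statement(role_arn: str) -> dict:
    """Statement to add to the deployment role's EC2 policy so it can pass the S3 role."""
    return {"Effect": "Allow", "Action": "iam:PassRole", "Resource": role_arn}


# Placeholder values (assumptions); substitute your own role ARN and buckets.
S3_ROLE_ARN = "arn:aws:iam::123456789012:role/example-s3-access-role"
BUCKETS = ["example-sales-data", "example-marketing-data"]

# One policy per bucket, mirroring "repeat this step for every bucket".
policies = {bucket: bucket_policy(bucket, S3_ROLE_ARN) for bucket in BUCKETS}

print(json.dumps(policies["example-sales-data"], indent=2))
print(json.dumps(pass_role_statement(S3_ROLE_ARN), indent=2))
```

Generating one policy per bucket from a single role ARN keeps the policies consistent when several buckets are shared through the same instance profile.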
A Databricks administrator performs this step in the Data Science & Engineering workspace admin console:
A Databricks administrator specifies data access configuration in the Databricks SQL admin console.
Click Settings at the bottom of the sidebar and select SQL Admin Console.
Click the SQL Endpoint Settings tab.
A Databricks administrator or data object owner performs this step in the Databricks SQL query editor. They grant privileges to users or groups by issuing GRANT (Databricks SQL) statements.
For each group of users, assign permissions to objects. It is common to do this at the database level. This could be as simple as an administrator or owner issuing the following command in Databricks SQL:
GRANT USAGE, SELECT, READ_METADATA ON DATABASE sales TO `analysts`
This command gives read access to the analysts group on the sales database.
Privileges are inherited, so granting read permission on the database allows read access to all the tables and views stored in the database, including any future objects added to the database. For a detailed explanation of the privileges that can be granted to users and groups, see the Databricks SQL documentation on data object privileges.
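When several groups need read access to several databases, the same GRANT statement can be generated for each pair. The sketch below only builds the statement text. The group and database names other than `analysts` and `sales` are made up for illustration, and `grant_statement` is a hypothetical helper, not a Databricks API.

```python
def grant_statement(privileges, database, principal):
    """Return a database-level GRANT statement for one group or user."""
    return f"GRANT {', '.join(privileges)} ON DATABASE {database} TO `{principal}`"


# The read-access privilege set used in the example above.
READ_PRIVILEGES = ["USAGE", "SELECT", "READ_METADATA"]

# Hypothetical mapping of group name to the databases it should be able to read.
group_databases = {
    "analysts": ["sales"],
    "finance-team": ["sales", "billing"],
}

statements = [
    grant_statement(READ_PRIVILEGES, database, group)
    for group, databases in group_databases.items()
    for database in databases
]

for statement in statements:
    print(statement)
```

Each printed line can be pasted into the Databricks SQL query editor and run by an administrator or by the owner of the database.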
A Databricks administrator performs this step in a notebook in a Data Science & Engineering workspace.
Administrators set owners using ALTER TABLE (Databricks SQL) statements. The simplest option is to set the owner to a group of admins. Alternatively, to enable a delegated security model, you can select different owners for each database, giving each the ability to manage permissions on the objects in the database.
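As a rough sketch of the delegated model, the following builds one set of ownership statements per database, each with its own owning group. The database, table, and group names are placeholders, `owner_statements` is a hypothetical helper, and the `ALTER DATABASE ... OWNER TO` form is an assumption that goes beyond the ALTER TABLE statement named above. In a Databricks notebook, each statement could then be run with the built-in `spark.sql`.

```python
def owner_statements(database, tables, owner):
    """Return statements that make `owner` the owner of a database and its tables."""
    # Assumed companion statement for the database itself.
    statements = [f"ALTER DATABASE {database} OWNER TO `{owner}`"]
    statements += [
        f"ALTER TABLE {database}.{table} OWNER TO `{owner}`" for table in tables
    ]
    return statements


# Hypothetical delegation: each database is owned by a different group.
delegation = {
    "sales": ("sales-admins", ["orders", "customers"]),
    "billing": ("billing-admins", ["invoices"]),
}

all_statements = []
for database, (owner, tables) in delegation.items():
    all_statements.extend(owner_statements(database, tables, owner))

for statement in all_statements:
    print(statement)
    # In a notebook, run the statement instead of printing it:
    # spark.sql(statement)
```

For the simpler model, pass the same admin group as the owner for every database.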