Create a classic workspace
This article describes how to create and manage workspaces using the account console. Alternatively, you can create a workspace using the Account API or Terraform.
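For reference, a workspace-creation call through the Account API can be sketched as below. This is a minimal sketch, not a definitive implementation: the endpoint path and field names (`workspace_name`, `location`, `cloud_resource_container`) reflect the Account API's workspace schema as commonly documented, so verify them against the current API reference, and authentication is omitted.

```python
import json

# Host for the Databricks on Google Cloud account console API (assumption;
# confirm against the Account API reference for your account).
ACCOUNTS_HOST = "https://accounts.gcp.databricks.com"

def build_workspace_request(account_id, workspace_name, region, project_id):
    """Build the URL and JSON body for a workspace-creation request."""
    url = f"{ACCOUNTS_HOST}/api/2.0/accounts/{account_id}/workspaces"
    body = {
        "workspace_name": workspace_name,
        "location": region,  # e.g. "us-central1"
        "cloud_resource_container": {"gcp": {"project_id": project_id}},
    }
    return url, json.dumps(body)

# Send the request with any HTTP client, for example:
#   requests.post(url, data=body, auth=...)  # account-admin credentials
```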
Before you begin
- Be sure that you understand all configuration settings before you create a new workspace. Workspace configurations cannot be modified after you create the workspace.
- You must have the required Google permissions on your account, which can be a Google Account or a service account. See Required permissions for workspace creation.
- Be sure you have sufficient Google Cloud resource quotas for the workspace. Request a quota increase if needed.
Prepare a network configuration (optional)
If you want to deploy your workspace in a customer-managed VPC, register a network configuration before creating the workspace:
- Review all customer-managed VPC requirements.
- Create your VPC.
- Register your network configuration, which represents your VPC and its subnets.
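The registration step above can also be done programmatically. The sketch below builds a request body for the Account API networks endpoint; the `gcp_network_info` field names are assumptions based on that schema, so confirm them against the current API reference before use, and note that authentication is omitted.

```python
import json

# Host for the Databricks on Google Cloud account console API (assumption).
ACCOUNTS_HOST = "https://accounts.gcp.databricks.com"

def build_network_config(account_id, network_name, project_id,
                         vpc_id, subnet_id, subnet_region):
    """Build the URL and JSON body to register a customer-managed VPC."""
    url = f"{ACCOUNTS_HOST}/api/2.0/accounts/{account_id}/networks"
    body = {
        "network_name": network_name,
        "gcp_network_info": {
            "network_project_id": project_id,
            "vpc_id": vpc_id,
            "subnet_id": subnet_id,
            "subnet_region": subnet_region,
        },
    }
    return url, json.dumps(body)
```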
Create a workspace
To create a workspace:
- As a Databricks account admin, log in to the account console and click the Workspaces icon.
- Click Create Workspace.
- In the Basics section:
- In the Workspace Name field, enter a name for this workspace. Only alphanumeric characters, underscores, and hyphens are allowed, and the name must be 3-30 characters long.
- In the Region dropdown, select a region for your workspace’s network and clusters. For supported regions, see Databricks clouds and regions.
- In the GCP project ID field, enter your Google Cloud project ID. If you have a project but do not know its ID, go to the Google Cloud Platform Manage Resources page, find your project, and copy its ID.
If you are deploying in a customer-managed VPC, the ID depends on whether you are using a standalone or Shared VPC:
- For a standalone VPC, set this to the project ID for your VPC.
- For a Shared VPC, set this to the project ID for this workspace’s resources.
- In the Networking configuration dropdown, select or create the workspace's network configuration. By default, this is set to Databricks-managed VPC.
- (Optional) In the Networking section, configure network settings:
- In the Subnet CIDR (IP Range) field, optionally enter a custom subnet IP range. IP addresses must be in CIDR format and within 10.0.0.0/8, 100.64.0.0/10, 172.16.0.0/12, 192.168.0.0/16, and 240.0.0.0/4. For sizing guidance, see Subnet sizing for a new workspace.
- In the Network connectivity configuration dropdown, select a network connectivity configuration to enable Google Private Service Connect (PSC), or create one inline. Before configuring PSC, see Enable Private Service Connect for your workspace for requirements.
- In the Private access settings dropdown, select a private access settings configuration to enable Google Private Service Connect (PSC), or create one inline. Before configuring PSC, see Enable Private Service Connect for your workspace for requirements.
- (Optional) In the Advanced section, you can configure any advanced settings for your workspace. See Advanced configurations.
- Click Create workspace. You are automatically redirected to the workspace details page.
- If this is the first time that you have created a workspace, a Google popup window asks you to select your Google account and consent to the request for additional scopes. If the popup window does not appear and the page does not change, you may have a popup blocker enabled in your web browser. Sign in with the same Google email that you use to sign in to Databricks.
- Databricks redirects you to the workspace details page. Confirm that your workspace status is Running.
- Secure the workspace’s GCS buckets. See Secure the workspace's GCS buckets in your project.
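The name and subnet constraints in the steps above can be checked locally before you submit the form. The following sketch uses only the Python standard library; the name pattern and allowed CIDR ranges come directly from the requirements stated in this procedure.

```python
import ipaddress
import re

# Ranges that a custom subnet CIDR must fall within, per the form's rules.
ALLOWED_RANGES = [ipaddress.ip_network(r) for r in
                  ("10.0.0.0/8", "100.64.0.0/10", "172.16.0.0/12",
                   "192.168.0.0/16", "240.0.0.0/4")]

def valid_workspace_name(name):
    # Alphanumeric characters, underscores, and hyphens; 3-30 characters.
    return re.fullmatch(r"[A-Za-z0-9_-]{3,30}", name) is not None

def valid_subnet_cidr(cidr):
    # The value must parse as a CIDR block and sit inside an allowed range.
    try:
        net = ipaddress.ip_network(cidr)
    except ValueError:
        return False
    return any(net.subnet_of(allowed) for allowed in ALLOWED_RANGES)
```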
Advanced configurations
The following configurations are optional when you create a new workspace. To view these settings, click the Advanced dropdown on the workspace creation page.
- Encryption: You can add encryption keys to your workspace deployment for managed services and workspace storage. The key for managed services encrypts notebooks, secrets, and Databricks SQL query data in the control plane. The key for workspace storage encrypts your workspace storage bucket and the GCS buckets of compute resources in the classic compute plane. For more guidance, see Configure customer-managed keys for encryption.
- Security and compliance: These checkboxes allow you to enable the compliance security profile, add compliance standards, and enable enhanced security monitoring for your workspace. For more information, see Configure enhanced security and compliance settings.
Enabling Google APIs on a workspace's project
During workspace creation, Databricks automatically enables the required Google APIs on the Google Cloud project if they are not already enabled.
These APIs are not disabled automatically during workspace deletion.
Workspace creation limits
You can create at most 200 workspaces per week in the same Google Cloud project. If you exceed this limit, creating a workspace fails with the error message: “Creating custom cloud IAM role <your-role> in project <your-project> rejected.”
View workspace status
After you create a workspace, you can view its status on the Workspaces page.
- Provisioning: In progress. Wait a few minutes and refresh the page.
- Running: Successful workspace deployment.
- Failed: Failed deployment.
- Banned: Contact your Databricks account team.
- Cancelling: In the process of cancellation.
If the status for your new workspace is Failed, click the workspace to view a detailed error message. If you do not understand the error, contact your Databricks account team.
You cannot update the configuration of a failed workspace. You must delete it and create a new workspace.
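The status handling described above can be summarized in a small lookup. This is an illustrative sketch only: the status strings are taken from the list in this section (uppercased for matching), not from a documented API enum, so verify the exact values if you script against the Account API.

```python
# Map each workspace status from the account console to a suggested action.
STATUS_ACTIONS = {
    "PROVISIONING": "In progress; wait a few minutes and refresh the page.",
    "RUNNING": "Deployment succeeded; log in to the workspace.",
    "FAILED": "Open the workspace to view the detailed error; delete it and create a new workspace.",
    "BANNED": "Contact your Databricks account team.",
    "CANCELLING": "Cancellation in progress; wait for it to finish.",
}

def next_action(status):
    """Return the suggested next step for a workspace status string."""
    return STATUS_ACTIONS.get(status.upper(), "Unknown status; contact your Databricks account team.")
```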
Log in to the workspace
- Go to the account console and click the Workspaces icon.
- On the row with your workspace, click Open.
Secure the workspace's GCS buckets in your project
When you create a workspace, Databricks on Google Cloud creates two Google Cloud Storage (GCS) buckets in your GCP project:
- One GCS bucket stores system data such as notebook revisions, job run details, command results, and Spark logs.
- One GCS bucket is your workspace’s root storage for the Databricks File System (DBFS). Your DBFS root bucket is not intended for storage of production customer data. Create additional GCS buckets for production customer data, and optionally mount them as DBFS mounts. See Connect to Google Cloud Storage.
Databricks strongly recommends that you secure these GCS buckets so that they are not accessible from outside Databricks on Google Cloud.
To secure these GCS buckets:
- In a browser, go to the GCP Cloud Console.
- Select the Google Cloud project that hosts your Databricks workspace.
- Go to that project's Cloud Storage page.
- Look for the buckets for your new workspace. Their names are:
  - databricks-<workspace id>
  - databricks-<workspace id>-system
- For each bucket:
  - Click the bucket to view its details.
  - Click the Permissions tab.
  - Review all entries in the Members list and determine whether access is expected for each member.
  - Check the IAM Condition column. Some permissions, such as those named “Databricks service account for workspace”, have IAM Conditions that restrict them to certain buckets. The Google Cloud console UI does not evaluate the condition, so it may show roles that would not actually be able to access the bucket.
  - For roles without any IAM Condition, consider adding restrictions:
    - When adding Storage permissions at the project level or above, use IAM Conditions to exclude Databricks buckets or to allow only specific buckets.
    - Choose the minimal set of permissions needed. For example, if only read access is needed, grant Storage Viewer instead of Storage Admin.
    - Warning: Do not use Basic Roles because they are too broad.
- Enable Google Cloud Data Access audit logging. Databricks strongly recommends that you enable Data Access audit logging for the GCS buckets that Databricks creates. This enables faster investigation of any issues that come up. Be aware that Data Access audit logging can increase GCP usage costs. For instructions, see Configuring Data Access audit logs.
If you have questions about securing these GCS buckets, contact your Databricks account team.
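When auditing these buckets in a script, you can derive the expected bucket names from the workspace ID. This small helper simply applies the databricks-<workspace id> naming pattern shown above:

```python
def workspace_bucket_names(workspace_id):
    """Return the two GCS bucket names that Databricks creates for a
    workspace, following the naming pattern shown in this section."""
    return [f"databricks-{workspace_id}",
            f"databricks-{workspace_id}-system"]
```

You could pass these names to, for example, a bucket IAM policy check to confirm that only expected members appear.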
Next steps
Now that you have deployed a workspace, you can start building out your data strategy. Databricks recommends the following articles:
- Add users, groups, and service principals to your workspace. See Manage users, service principals, and groups.
- Learn about data governance and privileges in Databricks. See What is Unity Catalog?.
- Connect your Databricks workspace to your external data sources. See Connect to data sources and external services.
- Ingest your data into the workspace. See Standard connectors in Lakeflow Connect.
- Learn about managing access to workspace objects like notebooks, compute, dashboards, and queries. See Access control lists.