Create a workspace using the account console
This article describes how to create and manage workspaces using the account console. Alternatively, you can create a workspace using the Account API or Terraform.
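If you plan to automate workspace creation with the Account API instead, the request reduces to a name, a region, and a Google Cloud project ID. The following sketch only builds the endpoint URL and request body; the endpoint path and field names (`workspace_name`, `location`, `cloud_resource_container`) are assumptions here, so consult the Account API reference for the authoritative schema before sending any request.

```python
import json

def build_workspace_request(account_id: str, name: str, region: str,
                            project_id: str) -> tuple[str, dict]:
    """Return the (assumed) endpoint and request body for workspace creation."""
    endpoint = (f"https://accounts.gcp.databricks.com"
                f"/api/2.0/accounts/{account_id}/workspaces")
    body = {
        "workspace_name": name,
        "location": region,
        # The project that hosts the workspace's resources.
        "cloud_resource_container": {"gcp": {"project_id": project_id}},
    }
    return endpoint, body

endpoint, body = build_workspace_request(
    "11111111-2222-3333-4444-555555555555",
    "prod-analytics", "us-central1", "my-gcp-project")
print(json.dumps(body, indent=2))
```

The same fields map onto the account console steps described below, so validating them up front catches most configuration mistakes before submission.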
Before you begin
- Be sure that you understand all configuration settings before you create a new workspace. Workspace configurations cannot be modified after you create the workspace.
- You must have the required Google permissions on your account, which can be a Google Account or a service account. See Required permissions.
- Be sure you have sufficient Google Cloud resource quotas for the workspace. Request a quota increase if needed.
Choose a network type
Before you create your workspace, you must choose where you want the workspace to be deployed:
- Databricks-managed VPC (default): Databricks creates and manages the lifecycle of the VPC. If you choose this network type, there are no additional steps to perform now.
- Customer-managed VPC: Create and specify your own customer-managed VPC for your new Databricks workspace to use. If you choose this network type, perform the following steps now:
- Review all customer-managed VPC requirements.
- Create your VPC.
- Register your network configuration, which represents your VPC and its subnets.
Create a workspace
To create a workspace:
- As a Databricks account admin, log in to the account console and click the Workspaces icon.
- Click Create Workspace.
- In the Workspace Name field, enter a human-readable name for this workspace. Only alphanumeric characters, underscores, and hyphens are allowed, and the name must be 3-30 characters long.
- In the Region field, select a region for your workspace’s network and clusters. For the list of supported regions, see Databricks clouds and regions.
- In the Google cloud project ID field, enter your Google Cloud project ID. If you are deploying in a customer-managed VPC, the ID depends on whether you are using a standalone or Shared VPC:
- For a standalone VPC, set this to the project ID for your VPC.
- For a Shared VPC, set this to the project ID for this workspace’s resources.
- Set up the network. This step varies based on the workspace's network type. For a customer-managed VPC, click the Customer-managed VPC tab.
  - Databricks-managed VPC: Optionally click Advanced configurations to specify custom IP ranges for the GCE subnet. If you leave these fields blank, Databricks uses defaults. For sizing guidance, see Subnet sizing for a new workspace. Sizes must use the CIDR format, and IP addresses must be entirely within the following ranges: 10.0.0.0/8, 100.64.0.0/10, 172.16.0.0/12, 192.168.0.0/16, and 240.0.0.0/4.
  - Customer-managed VPC: Specify a network configuration that represents your VPC and its subnets:
    - Network Mode: Set this to Customer-managed network.
    - Network configuration: Select your network configuration's name.
- (Optional) Enable Google Private Service Connect (PSC) on the workspace to secure the workspace with private connectivity and mitigate data exfiltration risks. To configure this, click Advanced configurations and choose a private access settings object. Before adding PSC configuration, Databricks recommends reading Enable Private Service Connect for your workspace for requirements and context.
- (Optional) Add customer-managed key configurations for managed services, workspace storage, or both. You can select the same configuration for both managed services and workspace storage if it supports both use cases.
- Click Save.
- If this is the first time that you have created a workspace, a Google popup window asks you to select your Google account and consent to the request for additional scopes. If the popup window does not appear and the page does not change, you may have a popup blocker enabled in your web browser.
- Confirm that your workspace was created successfully. Next to your workspace in the list of workspaces, click Open.
- Secure the workspace’s GCS buckets. See Secure the workspace's GCS buckets in your project.
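If you script workspace creation, two of the inputs above are easy to pre-validate: the workspace name format and any custom GCE subnet ranges. This is a hypothetical helper; the rules simply mirror the constraints stated in the steps above (3-30 alphanumeric, underscore, or hyphen characters for the name, and subnets entirely within the listed parent ranges).

```python
import ipaddress
import re

# Name rule from the Workspace Name step above.
NAME_PATTERN = re.compile(r"^[A-Za-z0-9_-]{3,30}$")

# Allowed parent ranges for custom GCE subnet IP ranges, from the
# network setup step above.
ALLOWED_RANGES = [
    ipaddress.ip_network(r)
    for r in ("10.0.0.0/8", "100.64.0.0/10", "172.16.0.0/12",
              "192.168.0.0/16", "240.0.0.0/4")
]

def is_valid_workspace_name(name: str) -> bool:
    """True if the name satisfies the account console's constraints."""
    return bool(NAME_PATTERN.fullmatch(name))

def is_allowed_subnet(cidr: str) -> bool:
    """True if the CIDR falls entirely within one allowed parent range."""
    net = ipaddress.ip_network(cidr, strict=True)
    return any(net.subnet_of(parent) for parent in ALLOWED_RANGES)

print(is_valid_workspace_name("prod-analytics_01"))  # True
print(is_valid_workspace_name("my workspace"))       # False: space not allowed
print(is_allowed_subnet("10.3.0.0/16"))              # True
print(is_allowed_subnet("8.8.8.0/24"))               # False: outside all ranges
```

Because workspace configurations cannot be modified after creation, catching an out-of-range subnet before clicking Save avoids having to delete and recreate the workspace.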
Enabling Google APIs on a workspace's project
During workspace creation, Databricks automatically enables the required Google APIs on the Google Cloud project if they are not already enabled.
These APIs are not disabled automatically during workspace deletion.
Workspace creation limits
You can create at most 200 workspaces per week in the same Google Cloud project. If you exceed this limit, creating a workspace fails with the error message: “Creating custom cloud IAM role <your-role> in project <your-project> rejected.”
View workspace status
After you create a workspace, you can view its status on the Workspaces page.
- Provisioning: In progress. Wait a few minutes and refresh the page.
- Running: Successful workspace deployment.
- Failed: Failed deployment.
- Banned: Contact your Databricks account team.
- Cancelling: In the process of cancellation.
If the status for your new workspace is Failed, click the workspace to view a detailed error message. If you do not understand the error, contact your Databricks account team.
You cannot update the configuration of a failed workspace. You must delete it and create a new workspace.
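If you provision workspaces programmatically, the status lifecycle above lends itself to a simple polling loop. The sketch below is hypothetical: `get_status` stands in for however you fetch the status (for example, via the Account API), and the status strings mirror the account console labels listed above.

```python
import time

# Statuses after which polling should stop (Cancelling eventually
# resolves, so it is treated as non-terminal here).
TERMINAL = {"Running", "Failed", "Banned"}

def wait_for_workspace(get_status, interval_s: float = 30.0,
                       max_polls: int = 40) -> str:
    """Poll until the workspace reaches a terminal status or polls run out."""
    status = get_status()
    for _ in range(max_polls):
        if status in TERMINAL:
            return status
        time.sleep(interval_s)
        status = get_status()
    return status

# Simulated provisioning flow for illustration:
states = iter(["Provisioning", "Provisioning", "Running"])
print(wait_for_workspace(lambda: next(states), interval_s=0.0))  # Running
```

Remember that a workspace ending in Failed cannot be repaired in place; your automation should surface the detailed error message and then delete and recreate the workspace.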
Log in to a workspace
- Go to the account console and click the Workspaces icon.
- On the row with your workspace, click Open.
Secure the workspace's GCS buckets in your project
When you create a workspace, Databricks on Google Cloud creates two Google Cloud Storage (GCS) buckets in your Google Cloud project:
- One GCS bucket stores system data such as notebook revisions, job run details, command results, and Spark logs.
- One GCS bucket is your workspace’s root storage for the Databricks File System (DBFS). Your DBFS root bucket is not intended for storage of production customer data. Create other data sources and storage for production customer data in additional GCS buckets. You can optionally mount the additional GCS buckets as DBFS mounts. See Connect to Google Cloud Storage.
Databricks strongly recommends that you secure these GCS buckets so that they are not accessible from outside Databricks on Google Cloud.
To secure these GCS buckets:
- In a browser, go to the Google Cloud console.
- Select the Google Cloud project that hosts your Databricks workspace.
- Go to that project's Cloud Storage page.
- Look for the buckets for your new workspace. Their names are:
  - databricks-<workspace id>
  - databricks-<workspace id>-system
- For each bucket:
  - Click the bucket to view its details.
  - Click the Permissions tab.
  - Review all the entries in the Members list and determine whether access is expected for each member.
  - Check the IAM Condition column. Some permissions, such as those named “Databricks service account for workspace”, have IAM Conditions that restrict them to certain buckets. The Google Cloud console UI does not evaluate the condition, so it may show roles that cannot actually access the bucket. For roles without an IAM Condition, consider adding restrictions:
    - When adding Storage permissions at the project level or above, use IAM Conditions to exclude Databricks buckets or to allow only specific buckets.
    - Choose the minimal set of permissions needed. For example, if only read access is needed, grant Storage Object Viewer instead of Storage Admin. Warning: Do not use basic roles because they are too broad.
- Enable Google Cloud Data Access audit logging. Databricks strongly recommends that you enable Data Access audit logging for the GCS buckets that Databricks creates. This enables faster investigation of any issues that may come up. Be aware that Data Access audit logging can increase GCP usage costs. For instructions, see Configuring Data Access audit logs.
- If you have questions about securing these GCS buckets, contact your Databricks account team.
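When you grant project-level Storage roles as described in the steps above, an IAM Condition expression (written in CEL) can exclude the Databricks buckets. The following sketch assumes the default `databricks-` bucket name prefix shown above; verify the expression against your actual bucket names before relying on it:

```
resource.type == "storage.googleapis.com/Bucket" &&
!resource.name.startsWith("projects/_/buckets/databricks-")
```

Attaching this condition to a project-level role binding keeps the grant from applying to the workspace's system and DBFS root buckets while leaving other buckets unaffected.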
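The Data Access audit logging recommendation above corresponds, roughly, to an `auditConfigs` entry in the project's IAM policy. This fragment is a sketch of that shape, not a supported procedure; follow Configuring Data Access audit logs for the actual steps:

```yaml
auditConfigs:
- service: storage.googleapis.com
  auditLogConfigs:
  - logType: DATA_READ
  - logType: DATA_WRITE
```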
Next steps
Now that you have deployed a workspace, you can start building out your data strategy. Databricks recommends the following articles:
- Add users, groups, and service principals to your workspace. See Manage users, service principals, and groups.
- Learn about data governance and privileges in Databricks. See What is Unity Catalog?.
- Connect your Databricks workspace to your external data sources. See Connect to data sources and external services.
- Ingest your data into the workspace. See Standard connectors in Lakeflow Connect.
- Learn about managing access to workspace objects like notebooks, compute, dashboards, and queries. See Access control lists.