Getting Started with Databricks Serverless Private Git
Databricks Serverless Private Git is in Public Preview. There is a charge for compute and networking costs incurred from serverless compute resources connecting to external resources. For more information on billing, see Understand Databricks serverless networking costs.
What is Serverless Private Git?
Databricks Serverless Private Git lets you connect a Databricks workspace to a private Git server using serverless compute and AWS PrivateLink. A Git server is private if it cannot be accessed from the internet.
The following diagram illustrates the overall system architecture:
Why use Serverless Private Git?
Compared to the Git proxy, Serverless Private Git offers the following advantages:
- Serverless Private Git acquires serverless compute only when it receives a Git request and can remain inactive when not in use. In contrast, the Git proxy requires the proxy cluster to be active when the user submits a Git request.
- Serverless Private Git uses PrivateLink to securely connect to the private Git instance.
Set up Serverless Private Git
- Follow the steps to set up a VPC endpoint service for your private Git server. This VPC endpoint service allows you to create an AWS PrivateLink connection from serverless compute to backends in your network behind a Network Load Balancer (NLB).
- An administrator must create an AWS interface VPC endpoint in the Databricks NCC for each private Git server that the workspace needs to connect to.
- Create a network connectivity configuration (NCC) to configure egress to a network load balancer. Considerations for this step include the following:
  - Only one NCC can be configured for a workspace for private Git. If the workspace needs to connect to multiple private Git servers, verify that they can be connected using the same NCC.
  - Limitations such as the number of NCCs supported in a region and the number of workspaces that can be attached to an NCC are documented here.
  - Add a private endpoint rule.
- Wait at least 10 minutes after setting up the NCC private endpoint rules, and then, at the workspace level, enable the Serverless Private Git Public Preview on the previews page.
- Go to the workspace and try a Git operation. You should see a UI indicator for Serverless Private Git. This page might take a few seconds to load.
After you configure it, Serverless Private Git takes precedence over other forms of private Git connectivity you already provisioned, such as classic Git proxy and Enterprise private Git. If you have a Git proxy cluster running, pause it after setting up Serverless Private Git.
Additional Configurations
Customize your Git operations using the `config.json` file.
- Create a configuration file at `/Workspace/.git_settings/config.json`, following the specification below.
- Grant all Git users View permissions to the configuration file and any CA cert files referenced by the configuration file.
- Interact with Git to validate connectivity to the Git remote, such as cloning a Git folder for a remote repository on the server.
- Changes to the configuration file may take up to 1 minute to be applied.
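The first step above can be scripted. The following is a minimal sketch, assuming the code runs somewhere that exposes `/Workspace` as a writable file path (for example, a notebook in the same workspace). `write_git_config` is a hypothetical helper written for illustration, not a Databricks API. Serializing with `json.dumps` avoids hand-editing mistakes such as trailing commas.

```python
import json
from pathlib import Path
from typing import Any, Dict

# Documented location of the settings file.
CONFIG_PATH = "/Workspace/.git_settings/config.json"


def write_git_config(config: Dict[str, Any], path: str = CONFIG_PATH) -> str:
    """Serialize `config` as JSON, write it to `path`, and return the JSON text.

    ASSUMPTION: `path` is writable from the environment running this code.
    """
    text = json.dumps(config, indent=2) + "\n"
    target = Path(path)
    # Create the .git_settings folder if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return text
```

For example, `write_git_config({"default": {"sslVerify": True}})` would write a minimal file to the documented path. Granting View permissions on the file is still a separate step.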
Top-Level Config File Structure
{
  "default": { ... },  // Optional global settings
  "remotes": [ ... ]   // Optional list of per-remote settings
}
`default` Section (Optional)
Global defaults are applied to all Git operations unless overridden by a specific remote.
Field | Type | Required | Default Value | Description |
---|---|---|---|---|
sslVerify | boolean | No | true | Whether to verify SSL certificates. |
caCertPath | string | No | "" (empty) | Workspace path to a custom CA certificate. |
httpProxy | string | No | "" (empty) | HTTP proxy to route Git traffic through. |
customHttpPort | integer | No | Unspecified | Custom HTTP port of the Git server. |
`remotes` Section (Optional)
A list of objects defining settings for individual remote Git servers. These settings override the `default` block on a per-remote basis.
Field | Type | Required | Default Value | Description |
---|---|---|---|---|
urlPrefix | string | Yes | — | Prefix to match Git remote URLs. |
sslVerify | boolean | No | true | Whether to verify SSL certificates. |
caCertPath | string | No | "" (empty) | Workspace path to a custom CA certificate for this remote. |
httpProxy | string | No | "" (empty) | HTTP proxy to route Git traffic through. |
customHttpPort | integer | No | Unspecified | Custom HTTP port of the Git server. |
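To make the override behavior concrete, here is a minimal sketch of how the settings for one remote URL could be resolved from the two sections. It is an illustration, not the service's implementation. Two points are assumptions: when several `urlPrefix` values match, the longest one wins, and an unspecified `customHttpPort` is modeled as `None`. The function name `effective_settings` is hypothetical.

```python
from typing import Any, Dict

# Default values from the tables above.
# ASSUMPTION: "Unspecified" for customHttpPort is modeled as None.
BUILTIN_DEFAULTS: Dict[str, Any] = {
    "sslVerify": True,
    "caCertPath": "",
    "httpProxy": "",
    "customHttpPort": None,
}


def _known_fields(section: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only documented fields; unknown fields are ignored."""
    return {k: v for k, v in section.items() if k in BUILTIN_DEFAULTS}


def effective_settings(config: Dict[str, Any], remote_url: str) -> Dict[str, Any]:
    """Resolve the settings that would apply to `remote_url`."""
    settings = dict(BUILTIN_DEFAULTS)
    # Layer 1: global values from the "default" section.
    settings.update(_known_fields(config.get("default", {})))
    # Layer 2: values from the matching remote entry, if any.
    matches = [
        r for r in config.get("remotes", [])
        if remote_url.startswith(r["urlPrefix"])
    ]
    if matches:
        # ASSUMPTION: the longest matching prefix takes priority.
        best = max(matches, key=lambda r: len(r["urlPrefix"]))
        settings.update(_known_fields(best))
    return settings
```

With a `default` that sets one `caCertPath` and a remote entry that sets another, a URL under that remote's `urlPrefix` resolves to the remote's certificate while every other URL keeps the default one.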
Example Config With No Remote-Specific Configuration
{
  "default": {
    "sslVerify": false
  }
}
Full Config Example
{
  "default": {
    "sslVerify": true,
    "caCertPath": "/Workspace/my_ca_cert.pem",
    "httpProxy": "https://git-proxy-server.company.com",
    "customHttpPort": 8080
  },
  "remotes": [
    {
      "urlPrefix": "https://my-private-git.company.com/",
      "caCertPath": "/Workspace/my_ca_cert_2.pem"
    },
    {
      "urlPrefix": "https://another-git-server.com/project.git",
      "sslVerify": false
    }
  ]
}
Notes
- The `default` section must be present, even if only partially.
- The `remotes` list is optional and can be omitted entirely.
- Each remote entry must contain at least the `urlPrefix`.
- If you don't specify a value for a field, it uses the default value.
- Unknown fields are ignored.
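The notes above, together with the field types in the tables, can be checked before uploading a file. The following is a minimal sketch of such a check. `validate_config` and its messages are hypothetical and only encode the rules stated on this page. How the service itself reports an invalid file is not described here.

```python
from typing import Any, Dict, List

# Field types as listed in the tables on this page.
FIELD_TYPES = {
    "sslVerify": bool,
    "caCertPath": str,
    "httpProxy": str,
    "customHttpPort": int,
}


def _check_fields(section: Dict[str, Any], where: str) -> List[str]:
    """Return type problems for the documented fields present in `section`."""
    problems = []
    for name, expected in FIELD_TYPES.items():
        if name not in section:
            continue  # Omitted fields fall back to their default value.
        value = section[name]
        # bool is a subclass of int in Python, so reject True/False for integers.
        ok = isinstance(value, expected) and not (
            expected is int and isinstance(value, bool)
        )
        if not ok:
            problems.append(
                f"{where}.{name} should be {expected.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems


def validate_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means no rule was violated."""
    problems = []
    default = config.get("default")
    if not isinstance(default, dict):
        problems.append("'default' section must be present")
    else:
        problems += _check_fields(default, "default")
    remotes = config.get("remotes", [])  # The list may be omitted entirely.
    if not isinstance(remotes, list):
        return problems + ["'remotes' must be a list"]
    for i, remote in enumerate(remotes):
        where = f"remotes[{i}]"
        if not isinstance(remote, dict):
            problems.append(f"{where} must be an object")
            continue
        if not isinstance(remote.get("urlPrefix"), str) or not remote["urlPrefix"]:
            problems.append(f"{where}.urlPrefix is required")
        problems += _check_fields(remote, where)
    return problems
```

A typical use is `validate_config(json.loads(text))` on the file contents. For instance, a port written as the string `"8080"` is reported because the field is documented as an integer.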
Other Security recommendations
If the secure egress gateway is enabled, verify that the network policy applied to the workspace in the account console includes the Git server's FQDN in its allowed internet destinations.
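As a rough aid for this check, the sketch below collects the host names that appear in the `urlPrefix` values of a `config.json` and compares them with a list of allowed destinations that you supply by hand. It does not query the account console. The function names are hypothetical, and the comparison assumes exact host-name matches. Git servers that have no entry under `remotes` will not be found this way.

```python
from typing import Any, Dict, List, Set
from urllib.parse import urlsplit


def git_server_hosts(config: Dict[str, Any]) -> Set[str]:
    """Collect the host names found in the urlPrefix of each remote entry."""
    hosts = set()
    for remote in config.get("remotes", []):
        # urlsplit(...).hostname is lowercased, or None if no host is present.
        host = urlsplit(remote.get("urlPrefix", "")).hostname
        if host:
            hosts.add(host)
    return hosts


def missing_destinations(config: Dict[str, Any], allowed: Set[str]) -> List[str]:
    """Return hosts used in the config that are absent from `allowed`.

    ASSUMPTION: `allowed` is a hand-maintained copy of the policy's allowed
    destinations and is compared by exact, case-insensitive host name.
    """
    allowed_lower = {name.lower() for name in allowed}
    return sorted(git_server_hosts(config) - allowed_lower)
```

An empty result means every host named in the configuration file appears in the list you provided.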
Limitations
- Serverless proxy logs are currently unavailable.
- Available on AWS Serverless regions only. For a list of supported regions, see Databricks clouds and regions.
- You cannot create a VPC Endpoint Service to serve workspaces in multiple regions. Currently, the AWS NCC is a regional object and doesn’t support multi-region VPC endpoints. To connect another workspace from a different region to the Git server, set up another NLB and VPC Endpoint Service in the new region.