Enable workload identity federation for AWS IAM workloads

AWS workloads can authenticate to Databricks without long-term secrets by exchanging an AWS-signed OIDC identity token. The recommended path calls the AWS STS GetWebIdentityToken API, which works anywhere the workload has AWS credentials.

Use cases

  • Lambda functions calling Databricks APIs (triggering jobs, querying SQL warehouses)
  • EC2/ECS-based ETL pipelines authenticating to Databricks without secrets
  • EKS-based ML workloads accessing model serving endpoints
  • Cross-account patterns where workloads in one AWS account federate into a Databricks account managed by another team
  • Security posture improvement by eliminating long-lived Databricks PATs or secrets from AWS Secrets Manager

AWS prerequisites

Complete the following steps in your AWS account before you create the Databricks federation policy.

Step 1: Enable AWS IAM outbound identity federation

Enable outbound identity federation on your AWS account:

Python
import boto3
boto3.client('iam').enable_outbound_web_identity_federation()

You can also enable this in the IAM console: under Account settings, turn on "Outbound web identity federation".

Step 2: Grant sts:GetWebIdentityToken permission

Grant the workload's IAM role sts:GetWebIdentityToken permission:

JSON
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["sts:GetWebIdentityToken"],
      "Resource": "*",
      "Condition": {
        "ForAllValues:StringEquals": {
          "sts:IdentityTokenAudience": "databricks"
        },
        "NumericLessThanEquals": {
          "sts:DurationSeconds": 300
        }
      }
    }
  ]
}

note

The audience condition ensures the role can only request tokens targeted at Databricks. The duration condition limits token lifetime to 300 seconds. You can adjust the duration up to 3600 seconds based on your workload needs, but shorter lifetimes are recommended.
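
If you manage IAM policies from code, the same document can be generated with the duration cap as a parameter. The following is a minimal sketch, not part of any AWS or Databricks tooling: the helper name `build_token_policy` and its parameters are hypothetical, and the 3600-second ceiling is the one stated in the note above.

```python
import json

MAX_DURATION_SECONDS = 3600  # upper bound stated in the note above


def build_token_policy(audience: str = "databricks", max_duration: int = 300) -> dict:
    """Build the Step 2 IAM policy document (hypothetical helper)."""
    if not 0 < max_duration <= MAX_DURATION_SECONDS:
        raise ValueError(
            f"max_duration must be positive and at most {MAX_DURATION_SECONDS} seconds"
        )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["sts:GetWebIdentityToken"],
                "Resource": "*",
                "Condition": {
                    # Only tokens whose audience is the given value may be requested.
                    "ForAllValues:StringEquals": {"sts:IdentityTokenAudience": audience},
                    # Requests asking for a longer lifetime are denied.
                    "NumericLessThanEquals": {"sts:DurationSeconds": max_duration},
                },
            }
        ],
    }


policy_json = json.dumps(build_token_policy(max_duration=300), indent=2)
print(policy_json)
```

Keeping the cap in one place makes it easier to keep the IAM condition and the `DurationSeconds` value requested by the workload in agreement.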

Step 3: Note the account-specific issuer URL

Retrieve the account-specific issuer URL:

Python
import boto3
info = boto3.client('iam').get_outbound_web_identity_federation_info()
print(info['IssuerIdentifier'])  # https://<uuid>.tokens.sts.global.api.aws
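
Because the federation policy must carry this URL exactly, it can help to sanity-check the value before pasting it anywhere. The sketch below is illustrative only: the helper names are hypothetical, the host suffix is taken from the example comment above, and the discovery path is the standard OIDC location rather than something this page documents.

```python
from urllib.parse import urlparse

ISSUER_HOST_SUFFIX = ".tokens.sts.global.api.aws"  # from the example comment above


def check_issuer_url(issuer_url: str) -> str:
    """Return the issuer URL unchanged, or raise ValueError (hypothetical helper)."""
    parsed = urlparse(issuer_url)
    if parsed.scheme != "https":
        raise ValueError("issuer URL must use https")
    if not (parsed.hostname or "").endswith(ISSUER_HOST_SUFFIX):
        raise ValueError(f"unexpected issuer host: {parsed.hostname!r}")
    return issuer_url


def discovery_url(issuer_url: str) -> str:
    """Standard OIDC discovery document location for an issuer."""
    return check_issuer_url(issuer_url).rstrip("/") + "/.well-known/openid-configuration"


# Placeholder UUID for illustration only; substitute the value printed in Step 3.
example = "https://00000000-0000-0000-0000-000000000000.tokens.sts.global.api.aws"
print(discovery_url(example))
```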

Create a federation policy

important

Databricks federation policies are created at the account level, not the workspace level. Set the Databricks CLI host to https://accounts.cloud.databricks.com and authenticate as an account admin.

Create a workload identity federation policy using the Databricks CLI. Set the issuer to your account-specific issuer URL from Step 3. For detailed instructions, see Configure a service principal federation policy.

Bash
databricks account service-principal-federation-policy create ${SP_ID} --json '{
  "oidc_policy": {
    "issuer": "https://<uuid>.tokens.sts.global.api.aws",
    "audiences": ["databricks"],
    "subject": "arn:aws:iam::<account-id>:role/<workload-role-name>"
  }
}'
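
The JSON payload is easy to get subtly wrong by hand, particularly the role ARN in `subject`. As a sketch under stated assumptions, the payload can be assembled in Python and printed as a ready-to-paste command. The function name `federation_policy_json`, the account ID, and the role name below are hypothetical placeholders; only the payload shape and the CLI command come from the example above.

```python
import json
import shlex


def federation_policy_json(issuer: str, account_id: str, role_name: str,
                           audience: str = "databricks") -> str:
    """Serialize the oidc_policy payload shown above (hypothetical helper)."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    policy = {
        "oidc_policy": {
            "issuer": issuer,
            "audiences": [audience],
            # The subject is the ARN of the workload's IAM role.
            "subject": f"arn:aws:iam::{account_id}:role/{role_name}",
        }
    }
    return json.dumps(policy)


# Placeholder values for illustration only.
payload = federation_policy_json(
    issuer="https://00000000-0000-0000-0000-000000000000.tokens.sts.global.api.aws",
    account_id="123456789012",
    role_name="etl-workload-role",
)
# shlex.quote wraps the JSON so the shell passes it as a single argument.
print(
    "databricks account service-principal-federation-policy create ${SP_ID} --json "
    + shlex.quote(payload)
)
```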

Authenticate to Databricks

After you create the federation policy, use the Databricks SDK to authenticate your AWS workloads. The following example uses the SDK's IdTokenSource pattern to retrieve an AWS STS token and exchange it for a Databricks OAuth token.

Python
import boto3
from databricks.sdk import WorkspaceClient
from databricks.sdk import oidc
from databricks.sdk.core import Config, credentials_strategy, oidc_credentials_provider


class AwsStsTokenSource(oidc.IdTokenSource):
    """Supplies AWS-signed OIDC tokens to the Databricks SDK."""

    def __init__(self, audience="databricks", region="us-east-1"):
        self._audience = audience
        self._region = region

    def id_token(self) -> oidc.IdToken:
        # Request a short-lived token from AWS STS for the configured audience.
        sts = boto3.client("sts", region_name=self._region)
        resp = sts.get_web_identity_token(
            Audience=[self._audience],
            SigningAlgorithm="RS256",
            DurationSeconds=300,
        )
        return oidc.IdToken(jwt=resp["WebIdentityToken"])


# Register a custom strategy that exchanges the AWS token for a Databricks OAuth token.
@credentials_strategy("aws-sts-wif", [])
def aws_sts_wif_strategy(cfg: Config):
    return oidc_credentials_provider(cfg, AwsStsTokenSource())


w = WorkspaceClient(
    host="https://my-workspace.cloud.databricks.com",
    client_id="<service-principal-uuid>",
    credentials_strategy=aws_sts_wif_strategy
)
# No secrets needed
clusters = w.clusters.list()

note

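A token duration of 300 seconds is recommended. You can raise it to as much as 3600 seconds if your workload needs it, provided the sts:DurationSeconds condition in the IAM policy from Step 2 allows the same value.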
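
When the exchange is rejected, a common first check is whether the token's claims match the federation policy. The following debugging sketch decodes a JWT payload without verifying its signature and reports which of `iss`, `aud`, and `sub` differ. It assumes the AWS token carries the role ARN in `sub`, as the federation policy's `subject` implies. All helper names are hypothetical, and the demo token is fabricated locally rather than issued by AWS.

```python
import base64
import json


def decode_claims(jwt: str) -> dict:
    """Decode the payload segment of a JWT without verifying the signature."""
    parts = jwt.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def policy_mismatches(claims: dict, issuer: str, audience: str, subject: str) -> list:
    """Return the names of claims that do not match the federation policy."""
    problems = []
    if claims.get("iss") != issuer:
        problems.append("iss")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]  # aud may be a string or a list
    if audience not in audiences:
        problems.append("aud")
    if claims.get("sub") != subject:
        problems.append("sub")
    return problems


def _unsigned_demo_token(claims: dict) -> str:
    """Build a fake, unsigned token for local testing only."""
    def b64(obj: dict) -> str:
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{b64({'alg': 'none'})}.{b64(claims)}.sig"


# Placeholder values for illustration only.
issuer = "https://00000000-0000-0000-0000-000000000000.tokens.sts.global.api.aws"
subject = "arn:aws:iam::123456789012:role/etl-workload-role"
token = _unsigned_demo_token({"iss": issuer, "aud": "databricks", "sub": subject})
print(policy_mismatches(decode_claims(token), issuer, "databricks", subject))
```

Use this only for troubleshooting. It does not validate the signature, so it must not be used to make trust decisions.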

For a manual token exchange example, see Authenticate with an identity provider token.

AWS documentation references