Download billable usage logs using the Account API

Preview

This feature is in Public Preview.

As a Databricks account admin, you can use the account console to download billable usage logs. To access this data programmatically, you can also use the Account API to download the logs. This article explains how to call that API.

Alternatively, you can configure daily delivery of billable usage logs in CSV file format to an AWS S3 storage bucket. See Deliver and access billable usage logs.

Requirements

  • Email address and password for an account admin to authenticate with the APIs. The email address and password are both case sensitive.

  • Account ID. You can find your account ID in the account console.

How to authenticate to the Account API

The Account API is published on the accounts.cloud.databricks.com base endpoint for all AWS regional deployments.

Use the following base URL for API requests: https://accounts.cloud.databricks.com/api/2.0/.

Preview

OAuth for service principals is in public preview.

To authenticate to the Account API, you can use Databricks OAuth tokens for service principals or an account admin’s username and password. Databricks strongly recommends that you use OAuth tokens for service principals. A service principal is an identity that you create in Databricks for use with automated tools, jobs, and applications. To create an OAuth token, see Authentication using OAuth tokens for service principals.
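
As a minimal sketch of that flow, the following curl command requests an OAuth access token for a service principal using its OAuth client ID and secret. The token endpoint path and the all-apis scope shown here follow the documented OAuth flow for service principals, but verify both against that article; the response is JSON that includes an access_token field.

# Request an OAuth access token for a service principal (illustrative sketch).
# <client-id> and <client-secret> are the service principal's OAuth credentials.
curl -X POST -u '<client-id>:<client-secret>' \
  'https://accounts.cloud.databricks.com/oidc/accounts/<account-id>/v1/token' \
  -d 'grant_type=client_credentials&scope=all-apis'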

Use the following examples to authenticate to the Account API:

Pass the OAuth token in the header using Bearer authentication. For example:

export OAUTH_TOKEN=<oauth-access-token>

curl -X GET --header "Authorization: Bearer $OAUTH_TOKEN" \
'https://accounts.cloud.databricks.com/api/2.0/accounts/<accountId>/<endpoint>'

If you authenticate with a username and password instead, username refers to an account admin’s email address. There are several ways to provide your credentials to tools such as curl:

  • Pass your username and account password separately in the headers of each request in <username>:<password> syntax.

    curl -X GET -u <username>:<password> -H "Content-Type: application/json" \
      'https://accounts.cloud.databricks.com/api/2.0/accounts/<accountId>/<endpoint>'
    
  • Apply base64 encoding to your <username>:<password> string and provide it directly in the HTTP header (see the encoding example after this list):

    curl -X GET -H "Content-Type: application/json" \
      -H 'Authorization: Basic <base64-username-pw>' \
      'https://accounts.cloud.databricks.com/api/2.0/accounts/<accountId>/<endpoint>'
    
  • Create a .netrc file with machine, login, and password properties:

    machine accounts.cloud.databricks.com
    login <username>
    password <password>
    

    To invoke the .netrc file, use -n in your curl command:

    curl -n -X GET 'https://accounts.cloud.databricks.com/api/2.0/accounts/<account-id>/workspaces'
    

This article’s examples use OAuth for service principals for authentication. For the complete API reference, see Databricks REST API reference.

Call the billable usage log download API

To download billable usage data, call the billable usage download API (GET /accounts/<account-id>/usage/download).

Add the following query fields:

  • start_month: (Required) The month and year at which the returned usage data begins, in YYYY-MM text format. You can enter any month and year on or after 2019-03.

  • end_month: (Required) The month and year at which the returned usage data ends, in YYYY-MM text format. You can enter any month and year on or after 2019-03.

  • personal_data: (Optional) Specify whether to include personally identifiable information in the billable usage logs, for example the email addresses of cluster creators. Handle this information with care. The default is false, which means this information is not included.

For example:

curl -X GET \
  'https://accounts.cloud.databricks.com/api/2.0/accounts/<databricks-account-id>/usage/download?start_month=2020-01&end_month=2020-12&personal_data=false' \
  --header "Authorization: Bearer $OAUTH_TOKEN"
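
The response body is the billable usage data in CSV format. As a minimal sketch, you can save it to a local file with curl's -o option (the file name here is only an example):

curl -X GET \
  'https://accounts.cloud.databricks.com/api/2.0/accounts/<databricks-account-id>/usage/download?start_month=2020-01&end_month=2020-12&personal_data=false' \
  --header "Authorization: Bearer $OAUTH_TOKEN" \
  -o usage-2020.csv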

Using the log files for analysis

For the CSV schema, see CSV file schema.

For information about how to analyze these files using Databricks, see Analyze usage data in Databricks.