Analyze audit logs

Note

The examples in this article don’t reference the audit log system table location. Databricks recommends using system tables (Public Preview) to view and query your audit log data. For more information, see Audit log system table reference.

You can analyze audit logs using Databricks. The following examples use delivered logs to report on Databricks access and Apache Spark versions. If you are using system tables to access audit logs, see Audit log system table reference.

Load the audit logs as a DataFrame and register the DataFrame as a temporary view. See Connect to Amazon S3 for a detailed guide.

// Load the delivered audit logs (JSON) from S3 into a DataFrame
val df = spark.read.format("json").load("s3a://bucketName/path/to/auditLogs")
// Register the DataFrame as a temporary view so it can be queried with SQL
df.createOrReplaceTempView("audit_logs")
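
Once the view is registered, a quick way to get an overview of what the logs contain is to count events by service and action. This is a minimal sketch that uses only the serviceName and actionName fields that appear in the examples below.

%sql
SELECT serviceName, actionName, COUNT(*) AS events
FROM audit_logs
GROUP BY serviceName, actionName
ORDER BY events DESC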

List the users who accessed Databricks and the IP addresses they connected from.

%sql
SELECT DISTINCT userIdentity.email, sourceIPAddress
FROM audit_logs
WHERE serviceName = "accounts" AND actionName LIKE "%login%"

Check the Apache Spark versions used.

%sql
SELECT requestParams.spark_version, COUNT(*) AS num_clusters
FROM audit_logs
WHERE serviceName = "clusters" AND actionName = "create"
GROUP BY requestParams.spark_version
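
A variation on the query above shows which users create clusters on which Spark versions. This sketch combines the userIdentity.email and requestParams.spark_version fields already used in this article.

%sql
SELECT userIdentity.email, requestParams.spark_version, COUNT(*) AS clusters_created
FROM audit_logs
WHERE serviceName = "clusters" AND actionName = "create"
GROUP BY userIdentity.email, requestParams.spark_version
ORDER BY clusters_created DESC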

Check table data access.

%sql
SELECT *
FROM audit_logs
WHERE serviceName = "sqlPermissions" AND actionName = "requestPermissions"