List users who executed SELECT command against a list of schemas using system access table

Query the audit logs in the system.access.audit table.

Written by julian.campabadal

Last published at: April 14th, 2025

Problem

You wish to track the usage history (select queries) for specific tables in Databricks. You may encounter difficulties in identifying which users have executed SELECT commands against arbitrary schemas/tables.

 

Cause

You may not be familiar with writing the SQL queries needed to access system tables information.

 

Solution

The audit logs can be used to identify users who have performed getTable actions (i.e. SELECT commands) against the specified tables. The audit logs are stored in the system.access.audit table, which can be queried using Apache Spark SQL.

Add the desired date range to the query event_date parameter and include the full path to the schemas you wish to audit (i.e. catalog.schema).

%sql

SELECT 
   event_time,
   event_date,
   user_identity.email AS user_email,
   service_name,
   action_name,
   request_params,
   source_ip_address,
   user_agent
FROM system.access.audit
WHERE action_name = 'getTable'
AND (
   request_params['full_name_arg'] LIKE '%<add-schema1-name-here>%'
   OR request_params['full_name_arg'] LIKE '%<add-schema2-name-here>%'
   OR request_params['full_name_arg'] LIKE '%<add-schema3-name-here>%'
)
AND event_date BETWEEN '<start-date-as-yyyy-mm-dd>' AND '<end-date-as-yyyy-mm-dd>'
ORDER BY event_time DESC;

The query output contains all tables that have been selected which are part of the schemas added. The request_params field contains the table name, and user_identity is the user that performed the action. If there is no output, it means no users have selected the tables under the schemas you included in the query.