Introduction
You want to track the usage history (SELECT
queries) for specific tables, but encounter difficulties in identifying which users have executed SELECT
commands against arbitrary schemas or tables.
Instructions
Use audit logs to identify users who have performed getTable
actions against the specified tables. The audit logs are stored in the system.access.audit
table, which you can query using Apache Spark SQL.
Add your desired date range to the query event_date
parameter and include the full path to the schemas you wish to audit: catalog.schema
.
SELECT
event_time,
event_date,
user_identity.email AS user_email,
service_name,
action_name,
request_params,
source_ip_address,
user_agent
FROM system.access.audit
WHERE action_name = 'getTable'
AND (
request_params['full_name_arg'] LIKE '%<ADD-SCHEMA1-NAME-HERE>%'
OR request_params['full_name_arg'] LIKE '%<ADD-SCHEMA2-NAME-HERE>%'
OR request_params['full_name_arg'] LIKE '%<ADD-SCHEMA3-NAME-HERE>%'
)
AND event_date BETWEEN '2025-02-01' AND '2025-03-04'
ORDER BY event_time DESC;