Problem
You need to retrieve the last access date/time of a table across all workspaces, but the per-table approach using SHOW TABLE EXTENDED
command, or checking the insights tab, seems slow and inefficient.
Cause
SHOW TABLE EXTENDED
queries each table one at a time, one workspace at a time, causing a surge of requests to hit your metastore.
Solution
Databricks recommends using audit log system tables.
- First, check that you have the necessary permissions to access audit logs. You can grant permissions using the Databricks SQL query
GRANT SELECT ON system.access.audit TO <username>
. - Run the following SQL query to retrieve the last access date/time of a table. Replace
<your-database-name>
and<your-table-name>
with your respective database and table names.
SELECT
action_name as `EVENT`,
event_time as `WHEN`,
request_params, user_identity.email,
IFNULL(request_params.full_name_arg, 'Non-specific') AS `TABLE ACCESSED`,
IFNULL(request_params.commandText,'GET table') AS `QUERY TEXT`
FROM system.access.audit
WHERE request_params.full_name_arg = 'catalog_<your-database-name>.<your-table-name>'
AND action_name IN ('createTable', 'commandSubmit','getTable','deleteTable')
order by event_time DESC