By default, all-purpose cluster configurations are deleted 30 days after the cluster was last terminated. It is possible to keep a cluster configuration for longer than 30 days if an administrator pins the cluster.
In either situation, it is possible for an administrator to manually delete a cluster configuration at any time.
If you try to run a job on a cluster whose configuration has been deleted, the run fails with a "cluster does not exist" error message:
Run executed on existing cluster ID <cluster_id> failed since the cluster does not exist.
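The cluster ID embedded in this message is the value you need later when querying the audit logs. As a minimal sketch, assuming the message text matches the format shown above, the ID could be extracted like this. The object name `ClusterIdFromError`, the function `extractClusterId`, and the sample ID in the usage note are hypothetical and used for illustration only.

```scala
// Minimal sketch: pull the cluster ID out of the job-run error message.
// The names here are hypothetical helpers, not part of any Databricks API.
object ClusterIdFromError {
  // Unanchored so the pattern also matches when the message is embedded in a longer log line.
  private val Pattern =
    """Run executed on existing cluster ID (\S+) failed since the cluster does not exist""".r.unanchored

  // Returns Some(clusterId) when the message matches, None otherwise.
  def extractClusterId(message: String): Option[String] = message match {
    case Pattern(id) => Some(id)
    case _           => None
  }
}
```

For example, `ClusterIdFromError.extractClusterId("Run executed on existing cluster ID 0123-456789-abc123 failed since the cluster does not exist.")` yields `Some("0123-456789-abc123")`, where the ID is a made-up sample value.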
Databricks audit logs can be used to record the activities in your workspace, allowing you to monitor detailed Databricks usage patterns.
Audit logging is NOT enabled by default and requires a few API calls to initialize the feature.
Please review the Configure audit logging documentation for instructions on how to set up audit logging in your Databricks workspace.
If a cluster configuration is deleted unexpectedly, you can use the audit logs to identify who deleted the cluster configuration and when it was deleted.
Instructions
Once audit logging is enabled on your workspace, you can use it to find information on who deleted a specific cluster configuration.
Load audit logs
Before you can search through the audit logs, you must load them as a DataFrame and register the DataFrame as a temporary view so that it can be queried as a table.
You will need to provide the S3 bucket name, the full path to the audit logs, and a name for the table.
Please review the Working with data in Amazon S3 documentation for more information.
%scala

val df = spark.read.format("json").load("s3a://<s3-bucket-name>/<path-to-audit-logs>")
df.createOrReplaceTempView("<audit-logs>")
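If the path points at the wrong folder, the view is still created but the later queries fail on missing columns. The following is a minimal sketch of a loader that checks for the columns used by the example queries in this article before registering the view. The function name `loadAuditLogs` and the view name `audit_logs` are hypothetical, and the list of required columns is an assumption taken from those queries, not an official schema.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper: load audit logs, verify expected columns, register a temp view.
def loadAuditLogs(spark: SparkSession, path: String, viewName: String): DataFrame = {
  val df = spark.read.format("json").load(path)

  // Top-level columns referenced by the example queries in this article (assumed, not exhaustive).
  val required = Seq(
    "workspaceId", "userIdentity", "sourceIPAddress", "timestamp",
    "serviceName", "actionName", "requestParams", "date")
  val missing = required.filterNot(c => df.columns.contains(c))
  require(missing.isEmpty, s"Audit logs at $path are missing columns: ${missing.mkString(", ")}")

  df.createOrReplaceTempView(viewName)
  df
}
```

In a notebook, where `spark` is already defined, this could be called as `loadAuditLogs(spark, "s3a://<s3-bucket-name>/<path-to-audit-logs>", "audit_logs")`.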
Query audit log table
Once you have the audit logs in a table, you can use SQL to query them.
This article contains two example queries, showing how to find information on a specific cluster, as well as how to view all clusters that were deleted within a specific date range.
You can use these examples to build your own custom queries.
Display information on a specific cluster
This example query returns details on the cluster deletion event, such as who deleted the cluster and when it was deleted.
You need to provide the name of the audit log table and the cluster ID of the deleted cluster.
%sql

select
  workspaceId,
  userIdentity.email,
  sourceIPAddress,
  to_timestamp(timestamp / 1000) as eventTimeStamp,
  serviceName,
  actionName,
  requestParams.cluster_id as clusterId
from <audit-logs>
where serviceName = "clusters"
  AND actionName = "permanentDelete"
  AND requestParams.cluster_id = "<cluster-id>"
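The same lookup can also be expressed with the DataFrame API, which passes the cluster ID as a value rather than pasting it into SQL text. This is a minimal sketch, assuming the audit log DataFrame has the columns used in the query above. The function name `clusterDeletionEvents` is hypothetical, and the timestamp conversion uses a cast from epoch seconds instead of `to_timestamp`.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// Hypothetical helper: return permanentDelete events for one cluster ID.
def clusterDeletionEvents(auditLogs: DataFrame, clusterId: String): DataFrame =
  auditLogs
    .filter(
      col("serviceName") === "clusters" &&
      col("actionName") === "permanentDelete" &&
      col("requestParams.cluster_id") === clusterId)
    .select(
      col("workspaceId"),
      col("userIdentity.email").as("email"),
      col("sourceIPAddress"),
      // The timestamp field is in epoch milliseconds; divide by 1000 and cast seconds to a timestamp.
      (col("timestamp") / 1000).cast("timestamp").as("eventTimeStamp"),
      col("serviceName"),
      col("actionName"),
      col("requestParams.cluster_id").as("clusterId"))
```

Assuming the logs were registered under a view named `audit_logs`, a call such as `clusterDeletionEvents(spark.table("audit_logs"), "<cluster-id>").show(truncate = false)` prints the matching events.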
Display clusters deleted within a specific date range
This example query returns a list of all clusters that were deleted during a specific date range.
You need to provide the name of the audit log table as well as the start date and the end date of the search period.
%sql

select
  workspaceId,
  userIdentity.email,
  sourceIPAddress,
  to_timestamp(timestamp / 1000) as eventTimeStamp,
  serviceName,
  actionName,
  requestParams.cluster_id as clusterId
from <audit-logs>
where serviceName = "clusters"
  AND actionName = "permanentDelete"
  AND date >= "<start-date>" -- Date is in yyyy-MM-dd format
  AND date <= "<end-date>"   -- Date is in yyyy-MM-dd format
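Because the query compares dates as yyyy-MM-dd strings, a malformed or reversed date range silently returns no rows. The following is a minimal sketch of a helper that validates the two dates before they are substituted into the query. The object `DeletionDateRange` and its methods are hypothetical names used for illustration.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Hypothetical helper: validate the search period before building the date filter.
object DeletionDateRange {
  // ISO_LOCAL_DATE corresponds to the yyyy-MM-dd format used by the query.
  private val Fmt = DateTimeFormatter.ISO_LOCAL_DATE

  // Returns the (start, end) pair, or throws if a date is malformed or the range is reversed.
  def validate(start: String, end: String): (String, String) = {
    val s = LocalDate.parse(start, Fmt) // throws DateTimeParseException on bad input
    val e = LocalDate.parse(end, Fmt)
    require(!e.isBefore(s), s"End date $end is before start date $start")
    (s.format(Fmt), e.format(Fmt))
  }

  // Builds the date portion of the WHERE clause from validated dates.
  def whereClause(start: String, end: String): String = {
    val (s, e) = validate(start, end)
    s"date >= '$s' AND date <= '$e'"
  }
}
```

For example, `DeletionDateRange.whereClause("2023-01-01", "2023-01-31")` returns `date >= '2023-01-01' AND date <= '2023-01-31'`, while an input such as `"2023-1-1"` is rejected with an exception instead of producing an empty result.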