Delete table when underlying S3 bucket is deleted
Problem
When you try to drop or alter a table, you get an error:
Error in SQL statement: IOException: Bucket_name … does not exist
You can reproduce the error with a DROP TABLE or ALTER TABLE command:
DROP TABLE <database-name.table-name>;
ALTER TABLE <database-name.table-name> SET LOCATION "<file-system-location>";
Cause
You deleted the underlying S3 bucket, or its contents, before dropping the table. Because the data no longer exists at the location registered in the metastore, DROP TABLE and ALTER TABLE fail when they try to access it.
Solution
You can use spark.sessionState.catalog.externalCatalog.dropTable to remove the table definition directly from the metastore, even though the underlying data is gone.
// Drop the table entry from the external catalog (metastore).
// ignoreIfNotExists = false raises an error if the table entry is missing;
// purge = false skips purging the data, which has already been deleted.
spark.sessionState.catalog.externalCatalog.dropTable("<database-name>", "<table-name>", ignoreIfNotExists = false, purge = false)
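As a sketch, assuming an active SparkSession named `spark` and placeholder database and table names, you can make the call tolerant of repeated runs and then confirm the table entry is gone with the catalog API:

```scala
// Sketch only: assumes an active SparkSession named `spark`, and a table
// registered as <database-name>.<table-name> whose S3 data was deleted.

// ignoreIfNotExists = true makes the drop a no-op if the metastore entry
// has already been removed, so re-running the command is safe.
spark.sessionState.catalog.externalCatalog.dropTable(
  "<database-name>",
  "<table-name>",
  ignoreIfNotExists = true,
  purge = false
)

// Verify that the table definition no longer exists in the catalog.
val stillExists = spark.catalog.tableExists("<database-name>.<table-name>")
println(s"Table still exists: $stillExists")
```

After the drop succeeds, you can recreate the table with a new LOCATION if the data has been restored elsewhere.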