Problem
Sometimes you cannot drop a table from the Databricks UI. Dropping it with %sql or spark.sql doesn't work either.
Cause
The metadata (table schema) stored in the metastore is corrupted. When you run a DROP TABLE command, Spark checks whether the table exists before dropping it. Because the table's metadata is corrupted, Spark cannot complete the existence check and the drop fails with the following exception.
```
com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException:
org.apache.spark.sql.AnalysisException: The metadata is corrupted
```
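For example, any attempt to drop the table from a notebook cell hits the metadata read and fails (the database and table names here are placeholders for the corrupted table):

```scala
%scala
// Fails with AnalysisException: The metadata is corrupted,
// because Spark reads the table's metadata to check existence first.
spark.sql("DROP TABLE IF EXISTS default.my_table")
```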
Solution
Use a Hive client to drop the table, since the Hive client does not check for table existence the way Spark does. To drop the table:
- Create a helper function inside the org.apache.spark.sql.hive package (it must live in this package to access the package-private HiveUtils):

```scala
%scala
package org.apache.spark.sql.hive {
  import org.apache.spark.SparkContext
  import org.apache.spark.sql.hive.HiveUtils

  object utils {
    // Drops the table directly through the Hive metastore client,
    // bypassing Spark's existence check against the corrupted entry.
    def dropTable(
        sc: SparkContext,
        dbName: String,
        tableName: String,
        ignoreIfNotExists: Boolean,
        purge: Boolean): Unit = {
      HiveUtils
        .newClientForMetadata(sc.getConf, sc.hadoopConfiguration)
        .dropTable(dbName, tableName, ignoreIfNotExists, purge)
    }
  }
}
```
- Drop corrupted tables.
```scala
%scala
import org.apache.spark.sql.hive.utils

utils.dropTable(sc, "default", "my_table", true, true)
```
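Once the Hive client has removed the corrupted entry, you can confirm from Spark that the table is gone; `default` and `my_table` follow the example above:

```scala
%scala
// The catalog lookup should now report that the table no longer exists.
assert(!spark.catalog.tableExists("default.my_table"))
```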