Problem
You’re running an SQL query that performs a DELETE DML operation on a Hive table. The query fails with the following error.
org.apache.spark.sql.AnalysisException: [UNSUPPORTED_FEATURE.TABLE_OPERATION] The feature is not supported: Table `<catalog>`.`<schema>`.`<table>` does not support DELETE. Please check the current catalog and namespace to make sure the qualified table name is expected, and also check the catalog implementation which is configured by "spark.sql.catalog".
Cause
Apache Spark does not support ACID transactional operations, such as DELETE, UPDATE, MERGE, or INSERT, on Hive tables. Hive tables are not ACID-compliant. As a result, these operations will fail if you attempt them directly.
This behavior is a known Spark limitation (see SPARK-15348). For more information, refer to the Apache Hive compatibility (AWS | Azure | GCP) documentation.
Solution
Convert your Hive tables to Delta tables to perform DML operations. Delta is ACID-compliant and supports transactional operations in Spark.
1. Read the Hive table into a DataFrame.
df = spark.table("zone_apd.cc_bm_page_load_prod")
2. Write to Delta format.
df.write.format("delta").mode("overwrite").option("path", "<path-to-storage>").saveAsTable("<catalog>.<schema>.<table>")
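The two steps above, followed by the DELETE that originally failed, can be sketched as a pair of small helpers. This is a minimal sketch, not a definitive migration procedure: the function names (qualified_name, convert_hive_table_to_delta, delete_rows) are hypothetical, and it assumes an active SparkSession is passed in as spark, as is available in a Databricks notebook, with Delta Lake enabled on the cluster.

```python
def qualified_name(catalog, schema, table):
    """Build a three-level table name in <catalog>.<schema>.<table> order."""
    return f"{catalog}.{schema}.{table}"


def convert_hive_table_to_delta(spark, source_table, target_table, storage_path=None):
    """Copy a Hive table into a Delta table (steps 1 and 2).

    `spark` is assumed to be an active SparkSession. `storage_path` is
    optional; when omitted, the target is created as a managed table.
    """
    # Step 1: read the Hive table into a DataFrame.
    df = spark.table(source_table)

    # Step 2: write the data out in Delta format.
    writer = df.write.format("delta").mode("overwrite")
    if storage_path is not None:
        writer = writer.option("path", storage_path)
    writer.saveAsTable(target_table)


def delete_rows(spark, target_table, condition):
    """Run DELETE against the Delta table.

    `condition` is inserted into the SQL text as-is, so only pass
    trusted predicates.
    """
    return spark.sql(f"DELETE FROM {target_table} WHERE {condition}")


# Example usage in a notebook (placeholders and the predicate are illustrative):
# target = qualified_name("<catalog>", "<schema>", "<table>")
# convert_hive_table_to_delta(spark, "<hive-schema>.<hive-table>", target)
# delete_rows(spark, target, "<condition>")
```

After the conversion, point DML statements at the new Delta table rather than the original Hive table. The mode("overwrite") setting replaces any existing data in the target, so choose a target name that is not already in use.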
To read more about migration to Delta Lake, refer to the What is Delta Lake in Databricks? (AWS | Azure | GCP) documentation.