Problem
When you create a table in ORC format, either in Hive or with Hive's CREATE TABLE syntax in Databricks, the table is not recognized as an ORC table when it is processed in serverless warehouses. You receive the following error.
"Can't insert into the target. Can only write data to relations with a single path but given paths are []. SQLSTATE: 42809"
However, the INSERT command works on all-purpose clusters. This may block you from migrating to a serverless warehouse.
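For example, a table defined with Hive's STORED AS ORC syntax, such as the following sketch (the catalog, schema, table, and column names are illustrative), triggers this error when you run INSERT against it from a serverless warehouse.
-- Hive-format ORC table: INSERT into this table fails on serverless warehouses.
CREATE TABLE hive_metastore.hms_schema.orc_table (
  column1 STRING,
  column2 INT,
  column3 TIMESTAMP
)
PARTITIONED BY (partition_column STRING, dynamic_partition STRING)
STORED AS ORC;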
Cause
In serverless warehouses, INSERT is not supported for Hive tables that have any column with the timestamp-millis data type, and Hive tables often use that data type.
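If you are not sure how a given table is registered, one way to check (using the illustrative table name from this article) is to inspect the Provider row in the output of DESCRIBE TABLE EXTENDED; Hive-format tables generally report hive as the provider, while Spark-format ORC tables report orc.
-- Check how the table is registered in the metastore.
-- Hive-format ORC tables typically show "hive" in the Provider row (with an ORC SerDe),
-- while Spark-format ORC tables show "orc".
DESCRIBE TABLE EXTENDED hive_metastore.hms_schema.orc_table;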
Solution
To enable the ORC table to work in serverless warehouses, recreate it as an Apache Spark-format ORC table by specifying "USING ORC" instead of "STORED AS ORC" in the CREATE TABLE statement.
Example
CREATE TABLE hive_metastore.hms_schema.orc_table (
  column1 STRING,
  column2 INT,
  column3 TIMESTAMP
)
USING ORC
PARTITIONED BY (partition_column STRING, dynamic_partition STRING);
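After the table is recreated with "USING ORC", an INSERT like the following (the partition values and row values are illustrative) succeeds on a serverless warehouse.
-- Insert a sample row into the Spark-format ORC table.
INSERT INTO hive_metastore.hms_schema.orc_table
PARTITION (partition_column = 'p1', dynamic_partition = 'd1')
VALUES ('value1', 1, current_timestamp());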
Preventative measures
Databricks recommends converting all existing Hive tables to Spark tables to avoid similar issues in the future. For more information, refer to the INSERT documentation.
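As a minimal sketch of such a conversion, assuming an existing Hive-format table named hive_metastore.hms_schema.legacy_orc_table (a hypothetical name), you can create a Spark-format copy with CREATE TABLE ... USING ORC ... AS SELECT and point workloads at the new table once you have verified it.
-- Create a Spark-format ORC copy of an existing Hive-format table.
-- legacy_orc_table and orc_table_spark are hypothetical names.
CREATE TABLE hive_metastore.hms_schema.orc_table_spark
USING ORC
PARTITIONED BY (partition_column, dynamic_partition)
AS SELECT * FROM hive_metastore.hms_schema.legacy_orc_table;
After validating the copy, you can drop the original Hive table and, if you want to keep the original name, rename the new table with ALTER TABLE ... RENAME TO.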