Parquet timestamp requires Hive metastore 1.2 or above

Update the Hive metastore to version 1.2 or above to use TIMESTAMP with a Parquet table.

Written by rakesh.parija

Last published at: May 16th, 2022

Problem

You are trying to create a Parquet table with a TIMESTAMP column, but the statement fails with an error message.

Error in SQL statement: QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.UnsupportedOperationException: Parquet does not support timestamp. See HIVE-6384

Example code

%sql

CREATE EXTERNAL TABLE IF NOT EXISTS testTable (
  emp_name STRING,
  joining_datetime TIMESTAMP
)
PARTITIONED BY
  (date DATE)
STORED AS
  PARQUET
LOCATION
  "/mnt/<path-to-data>/emp.testTable"

Cause

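Hive metastore client versions below 1.2 do not support the TIMESTAMP type in Parquet tables (see HIVE-6384). You need a Hive metastore version of 1.2 or above to use TIMESTAMP with Parquet.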

Info

The default Hive metastore client version used in Databricks Runtime is 0.13.0.
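The version requirement above can be expressed as a small check. This is a minimal sketch, not part of the original article: the function name `supports_parquet_timestamp` is hypothetical, and it assumes the version string has the form `major.minor[.patch]`.

```python
def supports_parquet_timestamp(metastore_version: str) -> bool:
    """Return True if the Hive metastore client version is 1.2 or above."""
    # Compare only the major and minor components, e.g. "0.13.0" -> (0, 13).
    major, minor = (int(part) for part in metastore_version.split(".")[:2])
    return (major, minor) >= (1, 2)


# In a notebook, the version could be read from the Spark session
# (assumption: a `spark` session object is in scope):
#   version = spark.conf.get("spark.sql.hive.metastore.version")
print(supports_parquet_timestamp("0.13.0"))  # False
print(supports_parquet_timestamp("1.2.1"))   # True
```

If the check returns False for the cluster's configured version, the CREATE TABLE statement shown earlier is expected to fail with the error above.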

Solution

You must upgrade the Hive metastore client on the cluster.

You can do this by adding the following settings to the cluster’s Spark config (AWS | Azure | GCP).

  • Databricks Runtime 6.6 and below
    spark.sql.hive.metastore.version 1.2.1
    spark.sql.hive.metastore.jars builtin
  • Databricks Runtime 7.0 and above
    spark.sql.hive.metastore.jars /dbfs/<path-to-downloaded-jars>
    spark.sql.hive.metastore.version 1.2.1

Info

For Databricks Runtime 7.0 and above, you must download the metastore jars and point the cluster to them, as detailed in the Databricks documentation (AWS | Azure | GCP).
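The choice between the two sets of Spark config settings can be sketched as a helper that returns the lines to paste into the cluster's Spark config. This is an illustrative sketch only: the function name `metastore_spark_config` and its parameters are hypothetical, and the jars path must be the location where you actually downloaded the metastore jars.

```python
def metastore_spark_config(runtime_version: str, jars_path: str = "") -> list:
    """Build Spark config lines for Hive metastore client 1.2.1."""
    # Use only the major and minor components, e.g. "7.3" -> (7, 3).
    major, minor = (int(part) for part in runtime_version.split(".")[:2])
    lines = ["spark.sql.hive.metastore.version 1.2.1"]
    if (major, minor) <= (6, 6):
        # Databricks Runtime 6.6 and below: use the built-in jars.
        lines.append("spark.sql.hive.metastore.jars builtin")
    elif major >= 7:
        # Databricks Runtime 7.0 and above: point to downloaded jars.
        if not jars_path:
            raise ValueError("jars_path is required for Runtime 7.0 and above")
        lines.append(f"spark.sql.hive.metastore.jars {jars_path}")
    else:
        # The article only covers 6.6 and below, and 7.0 and above.
        raise ValueError(f"Runtime {runtime_version} is not covered")
    return lines


# Example with the article's placeholder path (replace it with a real path).
for line in metastore_spark_config("7.3", "/dbfs/<path-to-downloaded-jars>"):
    print(line)
```

The helper only formats text. The settings take effect once they are added to the cluster's Spark config and the cluster is restarted.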