Parquet timestamp requires Hive metastore 1.2 or above

Problem

You are trying to create a Parquet table with a TIMESTAMP column, but the statement fails with the following error message.

Error in SQL statement: QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.UnsupportedOperationException: Parquet does not support timestamp. See HIVE-6384

Example code

CREATE EXTERNAL TABLE IF NOT EXISTS testTable (
  emp_name STRING,
  joing_datetime TIMESTAMP
)
PARTITIONED BY
  (date DATE)
STORED AS
  PARQUET
LOCATION
  "/mnt/<path-to-data>/emp.testTable"

Cause

Creating a Parquet table with a TIMESTAMP column requires a Hive metastore client version of 1.2 or above.

Note

The default Hive metastore client version used in Databricks Runtime is 0.13.0.

Solution

You must upgrade the Hive metastore client on the cluster.

You can do this by adding the following settings to the cluster’s Spark configuration.

  • Databricks Runtime 6.6 and below
    spark.sql.hive.metastore.version 1.2.1
    spark.sql.hive.metastore.jars builtin
  • Databricks Runtime 7.0 and above
    spark.sql.hive.metastore.version 1.2.1
    spark.sql.hive.metastore.jars /dbfs/<path-to-downloaded-jars>

Note

For Databricks Runtime 7.0 and above, you must download the metastore jars and point to them, as detailed in the Databricks documentation.
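
Once the cluster is running with the new configuration, it can be useful to confirm which metastore client settings are in effect before re-running the CREATE TABLE statement. The following is a minimal sketch, assuming the statements are run from a SQL notebook cell attached to that cluster; it only reads the two properties set above and changes nothing.

```sql
-- Read back the Hive metastore client version configured on the cluster.
-- After the change, the value column should show 1.2.1.
SET spark.sql.hive.metastore.version;

-- Read back where the metastore client jars are loaded from.
-- Run this as a separate statement if your environment accepts
-- only one statement per cell.
SET spark.sql.hive.metastore.jars;
```

If the version still shows the default, the Spark configuration was likely not saved on the cluster or the cluster has not been restarted since the change.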