Error java.lang.UnsupportedOperationException when trying to read datetime data files

Set spark.sql.legacy.parquet.datetimeRebaseModeInRead to LEGACY.

Written by Vidhi Khaitan

Last published at: December 23rd, 2024

Problem

When you try to read data files that contain datetime values, the read fails with a java.lang.UnsupportedOperationException and the following message:

"LEGACY datetime rebase mode is only supported for files written in UTC timezone. Actual file timezone: Asia/Kolkata."

This error occurs when you read data files that were written in a time zone other than UTC while using the LEGACY datetime rebase mode.

 

Cause

The data files were written in a time zone other than UTC. The LEGACY datetime rebase mode in Apache Spark expects datetime values to be based on UTC. When files are written in a different time zone, such as Asia/Kolkata, the rebase mode cannot correctly interpret the datetime values, and the read fails with the UnsupportedOperationException.

 

Solution

Configure your Spark cluster’s datetime rebase mode.

The spark.sql.legacy.parquet.datetimeRebaseModeInRead configuration allows Spark to read the datetime values in LEGACY rebase mode, even if the files were written in a time zone other than UTC.

  1. Navigate to the cluster configuration page in your Databricks workspace.
  2. Click the Advanced Options toggle.
  3. Add the following configuration in the Spark configuration tab. 

 

spark.sql.legacy.parquet.datetimeRebaseModeInRead LEGACY
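If you want to try the setting from a notebook before changing the cluster configuration, the following is a minimal sketch. It assumes that your runtime accepts this key at session level through spark.conf.set, and that spark is the SparkSession a Databricks notebook predefines. The helper name read_parquet_with_legacy_rebase and the constant REBASE_KEY are hypothetical names used only for illustration.

```python
# Sketch only: assumes the key can be set per session in your runtime.
# REBASE_KEY and the helper name below are illustrative, not a Databricks API.
REBASE_KEY = "spark.sql.legacy.parquet.datetimeRebaseModeInRead"


def read_parquet_with_legacy_rebase(spark, path):
    """Set the read rebase mode on the given session, then load the Parquet files.

    `spark` is an existing SparkSession (predefined in Databricks notebooks).
    `path` is a placeholder for the location of your own data files.
    """
    # Applies to the current session only; the cluster-level Spark
    # configuration described in the steps above applies to every session.
    spark.conf.set(REBASE_KEY, "LEGACY")
    return spark.read.parquet(path)
```

In a notebook, you would call the helper as read_parquet_with_legacy_rebase(spark, "<path-to-your-files>"), replacing the placeholder with your own path. The cluster-level setting remains the approach this article describes.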

 

For more information, review the Spark Parquet Files documentation.