Problem: Invalid Timestamp When Loading Data Into Amazon Redshift

Problem

When you use a spark-redshift write operation to save timestamp data to Amazon Redshift, the following error can occur if that timestamp data includes timezone information.

Error (code 1206) while loading data into Redshift: "Invalid timestamp format or value [YYYY-MM-DD HH24:MI:SSOF]"

Cause

The Redshift table is using the Timestamp data type that doesn’t store timezone information.

Solution

Include the option .options("extracopyoptions", "TIMEFORMAT 'auto'") as shown in the following Scala code:

df.write
.format("com.databricks.spark.redshift")
.options(...)
.options("extracopyoptions", "TIMEFORMAT 'auto'")
.mode("append")
.save()

If you specify auto as the argument for the DATEFORMAT or TIMEFORMAT parameter, Amazon Redshift automatically recognizes and converts the date format or time format in your source data.