Date functions only accept int values in Apache Spark 3.0

Date functions only accept int values in Apache Spark 3.0; fractional and string values return AnalysisException error.

Written by Adam Pavlacka

Last published at: February 28th, 2023

Problem

You are attempting to use the date_add() or date_sub() functions in Spark 3.0, but they are returning an Error in SQL statement: AnalysisException error message.

In Spark 2.4 and below, both functions work as normal.

%sql

select date_add(cast('1964-05-23' as date), '12.34')

Cause

You are attempting to use a fractional or string value as the second argument.

In Spark 2.4 and below, if the second argument is a fractional or string value, it is coerced to an int value before date_add() or date_sub() is evaluated.

Using the example code listed above, the value 12.34 is converted to 12 before date_add() is evaluated.

In Spark 3.0, if the second argument is a fractional or string value, it returns an error.

Solution

Use int, smallint, or tinyint values as the second argument for the date_add() or date_sub() functions in Spark 3.0.

%sql

select date_add(cast('1964-05-23' as date), '12')
%sql

select date_add(cast('1964-05-23' as date), 12)

Both of these examples work properly in Spark 3.0.

Delete

Info

If you are importing this data from another source, you should create a routine to sanitize the values and ensure the data is in integer form before passing it to one of the date functions.