Problem
You use the following code to parse data with Auto Loader in a notebook.
df = (spark.readStream
  .format("cloudFiles")
  .option("cloudFiles.format", "csv")
  .option("useStrictGlobber", "true")
  .option("header", "true")
  .option("sep", ";")
  .option("cloudFiles.schemaLocation", schema_location)
  .load(source_path))
You then receive the following error.
Py4JJavaError: An error occurred while calling o693.load.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3) (10.139.64.10 executor driver): com.univocity.parsers.common.TextParsingException: java.lang.ArrayIndexOutOfBoundsException - 20480
Hint: Number of columns processed may have exceeded limit of 20480 columns. Use settings.setMaxColumns(int) to define the maximum number of columns your input can have
Ensure your configuration is correct, with delimiters, quotes and escape sequences that match the input format you are trying to parse
Cause
The univocity parser, a Java library that Apache Spark uses internally to parse CSV and text files, throws a TextParsingException runtime error when it cannot properly parse text data. This happens when a row is malformed relative to the read configuration. For example, when the configured delimiter character appears many times inside a row that is actually delimited by something else, the parser splits that row into more columns than its 20480-column limit allows.
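For example, the following sketch shows how the configured delimiter changes the parsed column count. The scratch path is an assumption, and in a Databricks notebook spark is already defined.

# Write a hypothetical semicolon-delimited row to a scratch location.
sample_path = "/tmp/delimiter-check"  # assumption: a writable scratch path
spark.createDataFrame([("a;b;c",)], ["value"]).write.mode("overwrite").text(sample_path)

# Read with the wrong delimiter (comma): the whole row collapses into one column.
wrong = spark.read.option("sep", ",").csv(sample_path)
print(len(wrong.columns))  # 1

# Read with the matching delimiter (semicolon): columns parse as expected.
right = spark.read.option("sep", ";").csv(sample_path)
print(len(right.columns))  # 3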
Solution
First, verify that the delimiter used in your read operation matches the delimiter in your input files.
In a notebook, run the following code to confirm your read configuration is accurate. This code uses a semicolon delimiter. If your files use a comma instead, set "," as the second argument to option().
df = spark.read.option("delimiter", ";").csv("</path/to/file.csv>")
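As a quick sanity check, confirm the parsed column count with len(df.columns) or inspect the result with df.printSchema(); a single-column result usually means the configured delimiter does not match the file.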
If you’re unsure which delimiter is in use, open the file directly using the Databricks File System (DBFS) or preview it in the Data tab of the Databricks UI.
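For example, if the file is on DBFS, you can print its first bytes from a notebook with dbutils.fs.head (a sketch; the path is a placeholder):

# Preview the first 1 KB of the file so you can see the delimiter directly.
# dbutils is available in Databricks notebooks; replace the path with your own.
print(dbutils.fs.head("dbfs:/path/to/file.csv", 1024))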
Then make the necessary corrections to the data.
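If only some rows are malformed, one way to locate them is Spark's PERMISSIVE parse mode, which stores unparseable rows in a corrupt-record column instead of failing the job. The sketch below assumes a hypothetical two-column schema and a placeholder path; adapt both to your data.

from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType

# Hypothetical schema: replace col1/col2 with your actual columns. The extra
# _corrupt_record field is where PERMISSIVE mode stores rows it cannot parse;
# for CSV, the field must be declared in the schema to be populated.
schema = StructType([
    StructField("col1", StringType(), True),
    StructField("col2", StringType(), True),
    StructField("_corrupt_record", StringType(), True),
])

df = (spark.read
  .schema(schema)
  .option("sep", ";")
  .option("header", "true")
  .option("mode", "PERMISSIVE")
  .option("columnNameOfCorruptRecord", "_corrupt_record")
  .csv("</path/to/file.csv>"))

# Spark disallows queries that reference only the corrupt-record column on raw
# files, so cache the parsed result before filtering on it.
df.cache()
df.filter(F.col("_corrupt_record").isNotNull()).show(truncate=False)

If your rows legitimately contain more than 20480 fields, the CSV reader's maxColumns option can raise the limit mentioned in the error message, but in most cases the root cause is a delimiter mismatch rather than genuinely wide data.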