Problem
When you attempt to write data from Databricks to Vertica using the Vertica Spark Connector, the save operation in your PySpark notebook fails with the following error.
py4j.protocol.Py4JJavaError: An error occurred while calling o1436.save.
: org.apache.spark.SparkException: Writing job failed.
...
Caused by: java.lang.NoSuchMethodError: java.lang.String.isBlank()Z
at com.vertica.spark.util.schema.SchemaTools.findEmptyColumnName$1(SchemaTools.scala:681)
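For reference, a write along the following lines can trigger the failure. This is a minimal sketch, not your exact job: df is assumed to be an existing DataFrame, the option values are placeholders, and the option names follow the connector's V2 data source interface, so adjust them to match your configuration.
df.write.format("com.vertica.spark.datasource.VerticaSource") \
    .option("host", "<vertica-host>") \
    .option("user", "<vertica-user>") \
    .option("password", "<vertica-password>") \
    .option("db", "<database>") \
    .option("table", "<target-table>") \
    .option("staging_fs_url", "<staging-location-url>") \
    .mode("append") \
    .save()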
Cause
The Vertica Spark Connector calls the isBlank() method on a Java string. This method was introduced in Java 11, but your Databricks environment is running an earlier Java version, where isBlank() does not exist, so the call fails with a NoSuchMethodError.
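To confirm which Java version your cluster is actually running, you can query the driver JVM from a notebook cell. A minimal diagnostic sketch (spark.sparkContext._jvm is an internal PySpark handle, used here only for inspection):
# Print the Java version of the JVM that backs the Spark driver.
# spark.sparkContext._jvm is an internal PySpark attribute, not a supported API.
print(spark.sparkContext._jvm.java.lang.System.getProperty("java.version"))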
Solution
Either upgrade your Java version to 11 or above, or upgrade Databricks Runtime to 16.4 LTS or above, which has Java 17 enabled by default.
Upgrade to Java 11
To upgrade the Java version, either use an init script or set an environment variable.
Use an init script
- Create a .sh file. You can adapt and use the following example init script to change your cluster’s Java version from Java 8 to Java 11.
#!/bin/bash
set -e
# Name of the Java 11 installation to switch to.
JNAME="zulu11-ca-amd64"
# Make Java 11 the default java on the cluster.
update-java-alternatives -s $JNAME
# Look up the installation path and point JAVA_HOME at it.
JNAME_PATH=`update-java-alternatives -l $JNAME | awk '{print $3}'`
export JAVA_HOME=$JNAME_PATH
# Persist JAVA_HOME so all cluster processes pick it up.
echo "JAVA_HOME=$JNAME_PATH" >> /etc/environment
# Log the active Java version and Spark environment for verification.
java --version
cat /databricks/spark/conf/spark-env.sh
- Save the file to your volume, workspace, or S3 location.
- Configure the init script. For details, refer to the “Configure a cluster-scoped init script using the UI” section of the Cluster-scoped init scripts (AWS | Azure | GCP) documentation.
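After you attach the init script and restart the cluster, you can verify from a notebook that the driver is now on Java 11 or above. A small sketch using the same internal spark.sparkContext._jvm handle as above:
# Confirm the driver JVM is now Java 11 or above after the restart.
# Note: Java 8 reports its version as "1.8.0_xxx", so this check fails correctly there.
version = spark.sparkContext._jvm.java.lang.System.getProperty("java.version")
assert int(version.split(".")[0]) >= 11, f"Still on Java {version}"
print(f"Driver is running Java {version}")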
Set an environment variable
- On the compute configuration page, click Advanced to expand the section.
- Click Spark in the vertical navigation.
- Add the following line in the Environment variables field.
JAVA_HOME=/usr/lib/jvm/zulu11-ca-amd64
For more information, refer to the “Environment variables” section of the Compute configuration reference (AWS | Azure | GCP) documentation.
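After the cluster restarts, you can confirm the variable took effect from a notebook cell. A minimal sketch, assuming the environment variable is exported to the driver process as configured above:
import os
# Confirm the JAVA_HOME override is visible to the driver process.
print(os.environ.get("JAVA_HOME"))  # expected: /usr/lib/jvm/zulu11-ca-amd64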