Problem
When running an Apache Spark workflow in Databricks, you encounter an import failure related to Protobuf-java. You see the following error in the logs.
java.lang.NoSuchMethodError: com.google.protobuf.Internal.checkNotNull(Ljava/lang/Object;)Ljava/lang/Object; at com.google.protobuf.SingleFieldBuilderV3.<init>(SingleFieldBuilderV3.java:57)
Cause
There is a Protobuf library version mismatch between Databricks Runtime and your JAR files.
Databricks includes a specific version of Protobuf-java in its Runtime. If you import a JAR file that depends on a different Protobuf version, the two versions can conflict, leading to import errors.
Solution
Shade the Protobuf classes in the JAR. Shading packages the required version of Protobuf inside the JAR under a relocated package name, so it does not conflict with the Databricks Runtime's built-in version.
What Does It Mean to "Shade a JAR"?
Shading a JAR is a technique used in Java development to relocate and package dependencies inside a JAR file to avoid conflicts with other libraries or the runtime environment. It ensures the included dependencies do not interfere with versions already present in the classpath.
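As an illustration, if your project builds with sbt, shading can be configured with the sbt-assembly plugin. This is a sketch, not part of the original article; the `shaded.com.google.protobuf` prefix is an arbitrary name you choose.

```scala
// build.sbt — sketch assuming the sbt-assembly plugin is enabled in project/plugins.sbt.
// Relocates every com.google.protobuf class into a private package inside your JAR,
// so your bundled Protobuf version cannot clash with the one in Databricks Runtime.
assembly / assemblyShadeRules := Seq(
  ShadeRule
    .rename("com.google.protobuf.**" -> "shaded.com.google.protobuf.@1")
    .inAll
)
```

Running `sbt assembly` then produces a fat JAR in which references to `com.google.protobuf` have been rewritten to the shaded package. Maven users can achieve the same result with the `relocation` feature of the Maven Shade Plugin.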
For more information and instructions, refer to the Deploy Scala JARs on Unity Catalog clusters (AWS | Azure | GCP) documentation.