Common errors in notebooks
There are some issues that occur frequently when you use notebooks. This section outlines the most common questions and the best practices you should follow.
Spark job fails with java.lang.NoClassDefFoundError
Sometimes you may come across an error like:
java.lang.NoClassDefFoundError: Could not initialize class line.....$read$
This can occur with a Spark Scala 2.11 cluster and a Scala notebook if you combine a case class definition and Dataset/DataFrame operations in the same notebook cell, and then use the case class in a Spark job in a different cell.
For example, in the first cell, say you define a case class MyClass and also create a Dataset.
case class MyClass(value: Int)
val dataset = spark.createDataset(Seq(1))
Then in a later cell, you create instances of MyClass
inside a Spark job.
dataset.map { i => MyClass(i) }.count()
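Because the error comes from mixing the case class definition with Dataset/DataFrame operations in the same cell, one way to avoid it is to keep the case class definition in a cell of its own and run the Dataset operations in later cells. The cell split below is a minimal sketch of that arrangement, reusing the MyClass and dataset names from the example above.
// Cell 1: only the case class definition, with no Dataset/DataFrame operations
case class MyClass(value: Int)
// Cell 2: create the Dataset (spark is the SparkSession provided by the notebook)
val dataset = spark.createDataset(Seq(1))
// Cell 3: use the case class inside a Spark job
dataset.map { i => MyClass(i) }.count()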
Spark job fails with java.lang.UnsupportedOperationException
Sometimes you may come across an error like:
java.lang.UnsupportedOperationException: Accumulator must be registered before send to executor
This can occur with a Spark Scala 2.10 cluster and a Scala notebook. The reason and solution for this error are the same as for Spark job fails with java.lang.NoClassDefFoundError, described above.