Common errors in notebooks

Learn about common errors from Databricks notebooks.

Written by Adam Pavlacka

Last published at: May 16th, 2022

There are some common issues that occur when using notebooks. This section outlines some of the frequently asked questions and best practices that you should follow.

Spark job fails with java.lang.NoClassDefFoundError

Sometimes you may come across an error like:

%scala

java.lang.NoClassDefFoundError: Could not initialize class line.....$read$

This can occur with a Spark Scala 2.11 cluster and a Scala notebook if you mix a case class definition and Dataset/DataFrame operations in the same notebook cell, and later use the case class in a Spark job in a different cell. For example, in the first cell, say you define a case class MyClass and also create a Dataset.

%scala

case class MyClass(value: Int)

val dataset = spark.createDataset(Seq(1))

Then in a later cell, you create instances of MyClass inside a Spark job.

%scala

dataset.map { i => MyClass(i) }.count()

Solution

Move the case class definition to a cell of its own.

%scala

case class MyClass(value: Int)   // no other code in this cell
%scala

val dataset = spark.createDataset(Seq(1))
dataset.map { i => MyClass(i) }.count()

Spark job fails with java.lang.UnsupportedOperationException

Sometimes you may come across an error like:

java.lang.UnsupportedOperationException: Accumulator must be registered before send to executor

This can occur with a Spark Scala 2.10 cluster and a Scala notebook. The reason and solution for this error are the same as for the prior issue, Spark job fails with java.lang.NoClassDefFoundError.
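
As a sketch of the same fix, keep any case class (or other class) definitions in a cell by themselves, then reference them from Spark jobs in later cells. The class and value names here are illustrative, not taken from a specific error report; this layout assumes a Databricks notebook attached to a running cluster.

%scala

// Cell 1: only the class definition, nothing else
case class Record(id: Int)

%scala

// Cell 2: Spark operations that use the class
val ds = spark.createDataset(Seq(1, 2, 3))
ds.map { i => Record(i) }.count()

Because the definition cell contains no other code, the generated class does not capture the cell's wrapper object, so executors can initialize it without the accumulator registration (or class initialization) failure.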