Problem
When you navigate to your cluster > Spark UI > SQL/DataFrame, you notice the physical plan is not logged. Opening the SQL/DataFrame section returns the following error.
HTTP ERROR 500 java.util.NoSuchElementException: 3?__main__?+=0000000000000158?+=0000000000000000
Caused by:
java.util.NoSuchElementException: 3?__main__?+=0000000000000158?+=0000000000000000
at org.apache.spark.util.kvstore.LevelDB.get(LevelDB.java:133)
at org.apache.spark.util.kvstore.LevelDB.read(LevelDB.java:147)
at org.apache.spark.util.kvstore.DatabricksHybridStore.read(DatabricksHybridStore.scala:37)
at org.apache.spark.sql.execution.ui.SQLAppStatusStore.planGraph(SQLAppStatusStore.scala:81)
at org.apache.spark.sql.execution.ui.EdgeExecutionPage.$anonfun$render$14(ExecutionPage.scala:413)
at scala.Option.map(Option.scala:230)
at org.apache.spark.sql.execution.ui.EdgeExecutionPage.render(ExecutionPage.scala:347)
at org.apache.spark.ui.WebUI.$anonfun$attachPage$1(WebUI.scala:109)
at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:101)
Cause
Your sparkPlanInfo structure and its nested children exceed the default size limit of 2MB (2097152 bytes) for a single event in Apache Spark's event logging. This limit is controlled by the spark.eventLog.unknownRecord.maxSize configuration.
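If you want to confirm which event is oversized, you can scan the cluster's Spark event log for records larger than the limit. The following is a minimal sketch, not an official tool; the event log path is a placeholder, and you would first need to download or locate the event log file for your cluster.

import json

# Default limit for a single event: 2MB (spark.eventLog.unknownRecord.maxSize).
MAX_SIZE = 2 * 1024 * 1024  # 2097152 bytes

# Placeholder path: point this at a downloaded Spark event log,
# which is a plain-text file with one JSON event per line.
with open("/path/to/eventlog", "r") as f:
    for line_no, line in enumerate(f, start=1):
        size = len(line.encode("utf-8"))
        if size > MAX_SIZE:
            event_type = json.loads(line).get("Event", "unknown")
            print(f"line {line_no}: {event_type} is {size} bytes")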
Solution
Increase the maximum size for unknown records to 16MB by adding the following Spark config to your cluster.
spark.eventLog.unknownRecord.maxSize 16m
For details on how to apply Spark configs, refer to the “Spark configuration” section of the Compute configuration reference (AWS | Azure | GCP) documentation.
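After you apply the config and restart the cluster, you can verify that the new value took effect from a notebook. A minimal sketch; the setting itself must be applied in the cluster's Spark config at launch, so reading it at runtime only confirms the value.

# Run in a notebook attached to the restarted cluster; `spark` is the
# SparkSession Databricks provides. This reads the value back and
# should print "16m" if the config was applied.
print(spark.conf.get("spark.eventLog.unknownRecord.maxSize"))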
Important
Increasing the maximum record size to 16MB allows larger events to be held in memory, which can increase overall memory consumption. Depending on your workload, you may need to move to a cluster with more memory.