BROADCAST_VARIABLE_NOT_LOADED or JVM_ATTRIBUTE_NOT_SUPPORTED errors when using broadcast variables in a shared access mode cluster

Use a single-user cluster, or pass the variable into your function as state instead.

Written by kaushal.vachhani

Last published at: November 6th, 2024

Problem

You’re trying to broadcast a variable on a shared access mode cluster and receive error messages such as BROADCAST_VARIABLE_NOT_LOADED or JVM_ATTRIBUTE_NOT_SUPPORTED.


Cause

Databricks shared access mode clusters do not support broadcast variables due to their enhanced isolation architecture. Trying to use broadcast variables will lead to the error BROADCAST_VARIABLE_NOT_LOADED.

If you are using a shared cluster on Databricks Runtime 14.0 and above, you instead see the error JVM_ATTRIBUTE_NOT_SUPPORTED in PySpark, or "value sparkContext is not a member of org.apache.spark.sql.SparkSession" in Scala.


Solution

If you need to use broadcast variables, Databricks recommends running such workloads on a single-user cluster. Single-user clusters do not enforce the same isolation, so broadcast variables work as expected.

If you prefer to continue using a shared cluster, pass the variable into your function as ordinary state (for example, a closure or function parameter) instead of using a broadcast variable.
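A minimal Python sketch of this pattern, assuming you have a small lookup table you would otherwise broadcast. The names (lookup, make_enricher, enrich) are illustrative, not part of any Databricks API:

```python
# Instead of sc.broadcast(lookup), pass the lookup data to the
# function as ordinary state via a closure.

def make_enricher(lookup):
    """Return a function that carries `lookup` as captured state."""
    def enrich(country_code):
        # The closure holds `lookup`; Spark serializes it with each
        # task, so no broadcast variable is needed.
        return lookup.get(country_code, "unknown")
    return enrich

lookup = {"US": "United States", "IN": "India"}
enrich = make_enricher(lookup)

# On a shared access mode cluster you would register the closure as a
# UDF (assumed usage, requires PySpark):
#   from pyspark.sql.functions import udf
#   enrich_udf = udf(make_enricher(lookup))
#   df = df.withColumn("country", enrich_udf("code"))

print(enrich("US"))  # United States
```

Because the data travels inside the closure rather than through SparkContext, this approach avoids the JVM-backed broadcast machinery that shared access mode restricts. Note that closure state is shipped with every task, so keep the captured data small.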

For more information on shared access mode limitations, refer to the Compute access mode limitations for Unity Catalog documentation (AWS, Azure, GCP).