Get and set Apache Spark configuration properties in a notebook

Written by mathan.pillai

Last published at: May 26th, 2022

In most cases, you set the Spark config (AWS | Azure) at the cluster level. However, there may be instances when you need to check (or set) the values of specific Spark configuration properties in a notebook.

This article shows you how to display the current value of a Spark configuration property in a notebook. It also shows you how to set a new value for a Spark configuration property in a notebook.

Get Spark configuration properties

To get the current value of a Spark config property, evaluate the property without including a value.

Python

%python

spark.conf.get("spark.<name-of-property>")

R

%r

library(SparkR)
sparkR.conf("spark.<name-of-property>")

Scala

%scala

spark.conf.get("spark.<name-of-property>")

SQL

%sql

SET spark.<name-of-property>;
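
When checking several properties at once, a small helper can save repetition. The following is a minimal sketch, not part of the original article: snapshot_conf is a hypothetical name, and it assumes the object passed in behaves like spark.conf in a notebook, that is, it exposes get(key, default) and returns the default when the property is not set.

```python
def snapshot_conf(conf, keys, missing="<not set>"):
    """Return a dict mapping each key to its current value.

    `conf` is assumed to behave like `spark.conf` (hypothetical helper,
    not a Spark API). Passing a default to get() avoids an exception
    for properties that have no value; the default is kept as a string
    because PySpark expects string defaults.
    """
    return {key: conf.get(key, missing) for key in keys}


# In a notebook, where `spark` is already defined, a call might look like:
#   snapshot_conf(spark.conf, ["spark.sql.shuffle.partitions",
#                              "spark.rpc.message.maxSize"])
```

Values come back as strings, so convert them before doing arithmetic or comparisons.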

Set Spark configuration properties

To set the value of a Spark configuration property, evaluate the property and assign a value.

Info

You can only set Spark configuration properties that start with the spark.sql prefix.

Python

%python

spark.conf.set("spark.sql.<name-of-property>", <value>)

R

%r

library(SparkR)
sparkR.session()
sparkR.session(sparkConfig = list(spark.sql.<name-of-property> = "<value>"))

Scala

%scala

spark.conf.set("spark.sql.<name-of-property>", <value>)

SQL

%sql

SET spark.sql.<name-of-property> = <value>;
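
Because not every property can be changed from a notebook, it can help to ask the session first. The sketch below is an illustration only: set_if_modifiable is a hypothetical helper, and it assumes the object passed in offers isModifiable(key) and set(key, value) the way spark.conf does in PySpark 2.4 and later.

```python
def set_if_modifiable(conf, key, value):
    """Set `key` only if the session reports it as modifiable.

    Returns True when the value was set and False when it was skipped.
    `conf` is assumed to behave like `spark.conf` (hypothetical helper,
    not a Spark API).
    """
    if not conf.isModifiable(key):
        return False
    conf.set(key, value)
    return True


# In a notebook, where `spark` is already defined, a call might look like:
#   set_if_modifiable(spark.conf, "spark.sql.autoBroadcastJoinThreshold", -1)
```

A False result suggests the property has to be set at the cluster level instead.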

Examples

Get the current value of spark.rpc.message.maxSize.

%sql

SET spark.rpc.message.maxSize;

Set the value of spark.sql.autoBroadcastJoinThreshold to -1.

%python

spark.conf.set("spark.sql.autoBroadcastJoinThreshold", -1)
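
A value set this way stays in effect for the rest of the notebook session. If the change is only needed for a few statements, one option is to restore the previous value afterwards. The following is a minimal sketch under stated assumptions: temporary_conf is a hypothetical helper, and it assumes the object passed in supports get(key, default), set(key, value), and unset(key) as spark.conf does in PySpark.

```python
from contextlib import contextmanager


@contextmanager
def temporary_conf(conf, key, value):
    """Set `key` to `value` inside a `with` block, then restore it.

    `conf` is assumed to behave like `spark.conf` (hypothetical helper,
    not a Spark API).
    """
    # Remember the current value; None means the property was not set.
    previous = conf.get(key, None)
    conf.set(key, value)
    try:
        yield
    finally:
        # Restore the earlier state even if the block raised an error.
        if previous is None:
            conf.unset(key)
        else:
            conf.set(key, previous)


# In a notebook, where `spark` is already defined, usage might look like:
#   with temporary_conf(spark.conf, "spark.sql.autoBroadcastJoinThreshold", -1):
#       ...  # run the join that should not be broadcast
```

For properties that have a built-in default, the saved value is that default, so the restore step sets it explicitly rather than unsetting it.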