How to overwrite log4j configurations on Databricks clusters

Learn how to overwrite log4j configurations on Databricks clusters.

Written by Adam Pavlacka

Last published at: February 29th, 2024


This article describes steps related to customer use of Log4j 1.x within a Databricks cluster. Log4j 1.x is no longer maintained and has three known CVEs (CVE-2021-4104, CVE-2020-9488, and CVE-2019-17571). If your code uses one of the affected classes (JMSAppender or SocketServer), your use may be impacted by these vulnerabilities. You should not enable either of these classes in your cluster.

There is no standard way to overwrite log4j configurations on clusters. Custom log4j configurations must be applied by overwriting the default configuration files with an init script.

The current configurations are stored in two files:

  • On the driver:
    cat /home/ubuntu/databricks/spark/dbconf/log4j/driver/log4j.properties
  • On the worker:
    cat /home/ubuntu/databricks/spark/dbconf/log4j/executor/log4j.properties
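
These are standard Log4j 1.x properties files. As a rough illustration (the entries below are generic Log4j 1.x syntax, not the exact Databricks defaults), the format looks like:

```
# Log4j 1.x properties syntax (illustrative)
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.logger.<class-or-package-name>=<level>
```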

To set class-specific logging on the driver or on workers, use the following init script:


#!/bin/bash

echo "Executing on Driver: $DB_IS_DRIVER"
if [[ $DB_IS_DRIVER = "TRUE" ]]; then
  LOG4J_PATH="/home/ubuntu/databricks/spark/dbconf/log4j/driver/log4j.properties"
else
  LOG4J_PATH="/home/ubuntu/databricks/spark/dbconf/log4j/executor/log4j.properties"
fi
echo "Adjusting here: ${LOG4J_PATH}"
echo "log4j.<custom-prop>=<value>" >> ${LOG4J_PATH}

Replace <custom-prop> with the property name, and <value> with the property value.
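As a concrete sketch, you can rehearse the append step against a scratch file before baking it into an init script. The logger name (org.apache.spark.scheduler) and DEBUG level below are illustrative placeholders; on a real cluster, LOG4J_PATH would be the driver or executor log4j path described above.

```shell
#!/bin/bash
# Sketch (runs anywhere, no cluster needed): rehearse the append step
# against a scratch file standing in for log4j.properties.
LOG4J_PATH="$(mktemp)"

# Minimal stand-in for the existing Log4j 1.x configuration.
echo "log4j.rootCategory=INFO, console" > "${LOG4J_PATH}"

# Class-specific level, in the form log4j.logger.<class-or-package>=<level>.
# The logger name and level here are placeholders, not recommendations.
echo "log4j.logger.org.apache.spark.scheduler=DEBUG" >> "${LOG4J_PATH}"

cat "${LOG4J_PATH}"
```

Appending (rather than rewriting the whole file) preserves the default appenders Databricks ships with, which is usually what you want.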

Upload the script to DBFS and attach it to your cluster as an init script in the cluster configuration UI.

You can set other log4j properties for the driver and workers in the same way.

See Cluster node initialization scripts (AWS | Azure | GCP) for more information.
