How to overwrite log4j configurations on Databricks clusters

There is no standard way to replace the log4j configuration on a cluster with a custom one. Instead, you must overwrite the configuration files using an init script.

The current configurations are stored in two files:

  • On the driver:

    cat /home/ubuntu/databricks/spark/dbconf/log4j/driver/

  • On the worker:

    cat /home/ubuntu/databricks/spark/dbconf/log4j/executor/

To set class-specific logging on the driver or on workers, use the following script:

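echo "Executing on Driver: $DB_IS_DRIVER"
if [[ $DB_IS_DRIVER = "TRUE" ]]; then
  LOG4J_PATH="/home/ubuntu/databricks/spark/dbconf/log4j/driver/<log4j-file>"
else
  LOG4J_PATH="/home/ubuntu/databricks/spark/dbconf/log4j/executor/<log4j-file>"
fi
echo "Adjusting here: ${LOG4J_PATH}"
echo "log4j.<custom-prop>=<value>" >> "${LOG4J_PATH}"

Replace <custom-prop> with the property name, <value> with the property value, and <log4j-file> with the name of the log4j configuration file in that directory.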
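
As a concrete illustration of the <custom-prop> and <value> placeholders, the sketch below appends a logger-level property in log4j 1.x properties syntax (log4j.logger.<logger-name>=<level>). The helper name append_log4j_prop, the logger org.apache.spark.scheduler, and the temporary file standing in for the real configuration file are all assumptions made for this example. The helper skips the write when the exact line is already present, so running it twice does not duplicate the property.

```shell
#!/bin/bash
# Hypothetical helper: append a property line to a log4j properties file
# only if that exact line is not already in the file.
append_log4j_prop() {
  local file="$1" prop="$2"
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  if ! grep -qxF -- "$prop" "$file" 2>/dev/null; then
    echo "$prop" >> "$file"
  fi
}

# Stand-in for the real configuration file (assumption for this sketch).
LOG4J_PATH="$(mktemp)"

# Example property: set one logger to DEBUG (logger name chosen for illustration).
append_log4j_prop "$LOG4J_PATH" "log4j.logger.org.apache.spark.scheduler=DEBUG"
# The second call is a no-op because the line already exists.
append_log4j_prop "$LOG4J_PATH" "log4j.logger.org.apache.spark.scheduler=DEBUG"

cat "$LOG4J_PATH"
```

In an init script, LOG4J_PATH would instead point at the driver or executor configuration file chosen by the DB_IS_DRIVER check.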

Upload the script to DBFS, then attach it to a cluster as an init script using the cluster configuration UI.
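
One possible way to do the upload from a workstation is sketched below, assuming the Databricks CLI is installed and configured. The script name set-log4j-props.sh and the DBFS destination dbfs:/databricks/init-scripts/ are hypothetical, and the generated init script is shortened to a single line for brevity. The sketch writes the script to a local file, checks its syntax, and copies it with databricks fs cp only when the CLI is available.

```shell
#!/bin/bash
# Local file name and DBFS destination are hypothetical; adjust to your workspace.
SCRIPT_NAME="set-log4j-props.sh"
LOCAL_SCRIPT="$(mktemp -d)/${SCRIPT_NAME}"
DBFS_DEST="dbfs:/databricks/init-scripts/${SCRIPT_NAME}"

# The quoted 'EOF' keeps $DB_IS_DRIVER literal, so it is evaluated
# on the cluster node at startup rather than on this machine.
cat > "$LOCAL_SCRIPT" <<'EOF'
#!/bin/bash
echo "Executing on Driver: $DB_IS_DRIVER"
EOF

# Check the script for syntax errors before uploading it.
bash -n "$LOCAL_SCRIPT" && echo "Syntax OK: $LOCAL_SCRIPT"

# Upload only if the Databricks CLI is on the PATH (assumed to be configured).
if command -v databricks >/dev/null 2>&1; then
  databricks fs cp --overwrite "$LOCAL_SCRIPT" "$DBFS_DEST"
else
  echo "databricks CLI not found; skipping upload to $DBFS_DEST"
fi
```

After the upload, the DBFS path is what you would enter as the init script location in the cluster configuration UI.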

You can also set properties for the driver in the same way.

See Cluster Node Initialization Scripts for more information.