How to overwrite log4j configurations on Databricks clusters
There is no built-in way to apply custom log4j configurations to a cluster. Instead, you must overwrite the configuration files using an init script.
The current configurations are stored in two log4j.properties files:
On the driver:
%sh cat /home/ubuntu/databricks/spark/dbconf/log4j/driver/log4j.properties
On the worker:
%sh cat /home/ubuntu/databricks/spark/dbconf/log4j/executor/log4j.properties
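Both files use standard log4j 1.x properties syntax. As a rough illustration only (the exact appender names and levels vary by Databricks Runtime version, so do not treat these lines as the literal file contents), the driver file contains entries along these lines:

# Illustrative log4j 1.x entries; actual contents depend on the Databricks Runtime version.
log4j.rootCategory=INFO, publicFile
log4j.logger.org.apache.spark=WARN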
To set class-specific logging on the driver or on workers, use the following script:
#!/bin/bash

# DB_IS_DRIVER is set by Databricks and is "TRUE" only on the driver node.
echo "Executing on Driver: $DB_IS_DRIVER"
if [[ "$DB_IS_DRIVER" = "TRUE" ]]; then
  LOG4J_PATH="/home/ubuntu/databricks/spark/dbconf/log4j/driver/log4j.properties"
else
  LOG4J_PATH="/home/ubuntu/databricks/spark/dbconf/log4j/executor/log4j.properties"
fi
echo "Adjusting log4j.properties here: ${LOG4J_PATH}"

# Append the custom property to this node's log4j configuration.
echo "log4j.<custom-prop>=<value>" >> "${LOG4J_PATH}"
Replace <custom-prop> with the property name, and <value> with the property value.
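For example, to turn on DEBUG logging for the Spark scheduler (the logger name here is only an illustration; substitute any fully qualified class or package name), the last line of the script would become:

echo "log4j.logger.org.apache.spark.scheduler=DEBUG" >> "${LOG4J_PATH}"

The log4j.logger.<name>=<level> form is standard log4j 1.x syntax: it changes the level of that one logger while leaving the root logger untouched.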
Upload the script to DBFS and attach it to a cluster as an init script using the cluster configuration UI.
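One way to do the upload is with the legacy Databricks CLI. This is a sketch; the local file name and DBFS destination path are assumptions to adapt to your environment:

# Copy the init script from the local machine to DBFS (example paths).
databricks fs cp set-log4j.sh dbfs:/databricks/scripts/set-log4j.sh

Then, in the cluster configuration UI, add dbfs:/databricks/scripts/set-log4j.sh as an init script for the cluster.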
You can also set log4j.properties for the driver alone in the same way.
See Cluster Node Initialization Scripts for more information.