Databricks recently published a blog post on Log4j 2 Vulnerability (CVE-2021-44228) Research and Assessment. Databricks does not directly use an affected version of Log4j within the Databricks platform in a way that we understand to be vulnerable.
If you use Log4j within your cluster (for example, to process user-controlled strings), you may be vulnerable to the exploit if you have installed and are using an affected version, or if you have installed services that transitively depend on an affected version.
This article explains how to check your cluster for installed versions of Log4j 2 and how to upgrade those instances.
Check to see if Log4j 2 is installed
Check for a manual install
Manually review the libraries installed on your cluster (AWS | Azure | GCP).
If you have explicitly installed a version of Log4j 2 via Maven, it is listed under Libraries in the cluster UI (AWS | Azure | GCP).
Scan the classpath
Scan your classpath to check for a version of Log4j 2.
- Start your cluster.
- Attach a notebook to your cluster.
- Run this code to scan your classpath:
%scala
{
  import scala.util.{Try, Success, Failure}
  import java.lang.ClassNotFoundException

  Try(Class.forName("org.apache.logging.log4j.core.Logger", false, this.getClass.getClassLoader)) match {
    case Success(loggerCls) =>
      Option(loggerCls.getPackage) match {
        case Some(pkg) => println(s"Version: ${pkg.getSpecificationTitle} ${pkg.getSpecificationVersion}")
        case None => println("Could not determine Log4J 2 version")
      }
    case Failure(e: ClassNotFoundException) =>
      println("Could not load Log4J 2 class")
    case Failure(e) =>
      println(s"Unexpected Error: $e")
      throw e
  }
}
- If Log4j 2 is NOT PRESENT on your classpath, you see a result like this:
Could not load Log4J 2 class
- If Log4j 2 is PRESENT on your classpath, you should see a result like this, which includes the Log4j 2 version:
Version: Apache Log4j Core 2.15.0
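If the class is present, you can optionally confirm which jar file provides it. The following is a minimal sketch, assuming the classpath scan above succeeded; it reads the jar location from the class's code source:

%scala
{
  // Minimal sketch: print the jar that provides the Log4j 2 core Logger class.
  // Assumes the class is present on the classpath (see the scan above).
  import scala.util.Try

  Try(Class.forName("org.apache.logging.log4j.core.Logger", false, this.getClass.getClassLoader)).toOption match {
    case Some(loggerCls) =>
      Option(loggerCls.getProtectionDomain.getCodeSource) match {
        case Some(src) => println(s"Loaded from: ${src.getLocation}")
        case None      => println("Could not determine the jar location")
      }
    case None => println("Could not load Log4J 2 class")
  }
}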
Scan all user-installed jars
Locate all of the user-installed jar files on your cluster and run a scanner to check for vulnerable Log4j 2 versions.
- Start your cluster.
- Attach a notebook to your cluster.
- Run this code to identify the location of the jar files:
%scala
import org.apache.spark._

val sparkEnv = SparkEnv.get
val field = SparkEnv.get.getClass.getDeclaredField("driverTmpDir")
field.setAccessible(true)
println(s"Your jars are installed under ${field.get(sparkEnv).asInstanceOf[Option[String]].get}\n")
- The code displays the location of your jar files.
Your jars are installed under /local_disk0/spark-1a6be695-9318-463c-b966-256c32e3771c/userFiles-582ca64b-93c9-444c-85b8-7779bd2c5e52
- Download the jar files to your local machine.
- Run a scanner like Logpresso to check for vulnerable Log4j 2 versions.
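If downloading the jars is impractical, you can also scan them in place from a notebook. The following is a minimal sketch, not a replacement for a dedicated scanner like Logpresso; it flags jars that contain the JndiLookup class shipped by vulnerable Log4j 2 releases. The jarDir value is a placeholder; substitute the userFiles path printed in the previous step.

%scala
{
  // Minimal sketch: scan user-installed jars in place for the JndiLookup class
  // shipped by vulnerable Log4j 2 releases. Replace jarDir with the path printed
  // by the previous step (the value below is a placeholder, not a real path).
  import java.io.File
  import java.util.zip.ZipFile
  import scala.collection.JavaConverters._

  val jarDir = "/local_disk0/..."  // placeholder: use the userFiles path from the previous step

  val jars = Option(new File(jarDir).listFiles).getOrElse(Array.empty[File]).filter(_.getName.endsWith(".jar"))
  jars.foreach { jar =>
    val zip = new ZipFile(jar)
    try {
      val hasJndiLookup = zip.entries.asScala.exists(_.getName.endsWith("log4j/core/lookup/JndiLookup.class"))
      if (hasJndiLookup) println(s"Possible vulnerable Log4j 2 jar: ${jar.getPath}")
    } finally {
      zip.close()
    }
  }
}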
Upgrade your Log4j 2 version
Upgrade via cluster UI
- If you manually installed Log4j 2 via the cluster UI and it is version 2.17 or above, no action is required.
- If you manually installed Log4j 2 via the cluster UI and it is version 2.16 or below, uninstall the library from the cluster (AWS | Azure | GCP) and install version 2.17 or above.
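For example, when installing a patched version via the cluster UI Maven option, the coordinate looks like the following (the version shown is illustrative; use the latest secure release):

org.apache.logging.log4j:log4j-core:2.17.1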
Upgrade via command line
If you installed Log4j 2 via the command line (or over SSH), use the same method to upgrade Log4j 2 to a secure version.
Upgrade a custom-built jar
If you include Log4j 2 in a custom-built jar, upgrade Log4j 2 to a secure version and rebuild the jar.
Re-attach the updated jar to your cluster.
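If your custom jar is built with sbt, the following build.sbt sketch shows one way to force a patched version for direct and transitive dependencies. The version shown is illustrative; use the latest secure release.

// build.sbt (sketch): force a patched Log4j 2 version for direct and transitive dependencies.
// The version below is illustrative; substitute the latest secure release.
dependencyOverrides ++= Seq(
  "org.apache.logging.log4j" % "log4j-core" % "2.17.1",
  "org.apache.logging.log4j" % "log4j-api"  % "2.17.1"
)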
Restart your cluster after upgrading
Restart your cluster after upgrading Log4j 2.