Cluster fails to start with dummy does not exist error

A cluster fails to start due to a `dummy does not exist` Apache Spark error message.

Written by arvind.ravish

Last published at: March 4th, 2022

Problem

You try to start a cluster, but it fails with an Apache Spark error message.

Internal error message: Spark error: Driver down

You review the cluster driver and worker logs and see an error message containing java.io.FileNotFoundException: File file:/databricks/driver/dummy does not exist.

21/07/14 21:44:06 ERROR DriverDaemon$: XXX Fatal uncaught exception. Terminating driver.
java.io.FileNotFoundException: File file:/databricks/driver/dummy does not exist
   at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:611)
   at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:824)
   at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:601)
   at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
   at org.apache.spark.SparkContext.addFile(SparkContext.scala:1668)
   at org.apache.spark.SparkContext.addFile(SparkContext.scala:1632)
   at org.apache.spark.SparkContext$$anonfun$13.apply(SparkContext.scala:511)
   at org.apache.spark.SparkContext$$anonfun$13.apply(SparkContext.scala:511)
   at scala.collection.immutable.List.foreach(List.scala:392)

Cause

You have spark.files dummy set in your cluster's Spark config, but no file named dummy exists on the driver.

At startup, Spark treats each spark.files value as a file path and passes it to SparkContext.addFile, which looks for the file on the driver's local file system. Because the file does not exist, the driver terminates with the error message.

java.io.FileNotFoundException: File file:/databricks/driver/dummy does not exist
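
Because the stack trace shows the failure happening in SparkContext.addFile, you can reproduce the same error interactively. This is a minimal sketch only, assuming a Databricks notebook attached to a cluster that does start, where spark is the preconfigured SparkSession:

%python
# Sketch only: asking Spark to distribute a file that does not exist on the
# driver fails with the same FileNotFoundException seen in the driver log.
# The relative path "dummy" resolves against the driver's working directory,
# /databricks/driver.
spark.sparkContext.addFile("dummy")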

Solution

Option 1: Delete the spark.files dummy entry from your cluster's Spark config if you are not passing actual files to Spark.
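
For reference, the entry to remove from the Spark config field in the cluster configuration looks like this (Databricks Spark config entries are space-separated key-value pairs, one per line):

spark.files dummy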

Option 2: Create a dummy file at /databricks/driver/dummy on the cluster. You can do this with an init script.

  1. Create the init script by running the following in a notebook cell.
    %python
    dbutils.fs.put("dbfs:/databricks/<init-script-folder>/create_dummy_file.sh",
    """#!/bin/bash
    touch /databricks/driver/dummy""", True)
  2. Install the init script that you just created as a cluster-scoped init script.
    You will need the full path to the script (dbfs:/databricks/<init-script-folder>/create_dummy_file.sh).
  3. Restart the cluster. The init script runs during cluster startup and creates /databricks/driver/dummy before the driver starts.

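After the cluster restarts, you can optionally confirm that the init script created the file. A minimal check, run from a notebook attached to the cluster (the notebook's Python runs on the driver):

%python
import os

# The init script runs during cluster startup, so this file should now exist.
print(os.path.exists("/databricks/driver/dummy"))  # expected: True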