Multiple Apache Spark JAR jobs fail when run concurrently

Problem

If you run multiple Apache Spark JAR jobs concurrently, some of the runs might fail with the error:

org.apache.spark.sql.AnalysisException: Table or view not found: xxxxxxx; line 1 pos 48

Cause

This error occurs due to a bug in Scala. When an object extends App, initialization of its val fields is deferred (via the DelayedInit trait) until the main method runs, and it runs again on every call to main, so the vals are no longer effectively immutable. If you run JAR jobs multiple times, a val field containing a DataFrame can be changed inadvertently.

As a result, when any one of the concurrent runs finishes, it wipes out the temporary views of the other runs. Scala issue 11576 provides more detail.
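The mechanics can be seen without Spark. The following minimal sketch (the `Counter` and `Job` names are illustrative, not from the original job) shows that with Scala 2.x, each call to `main` re-executes the body of an object that extends App, reassigning its supposedly immutable val:

```scala
// Tracks how many times Job's body has executed (illustrative helper).
object Counter { var n = 0 }

// Because Job extends App, its body -- including the val initializer --
// is deferred into main() and re-runs on every call to main().
object Job extends App {
  Counter.n += 1
  val runId = Counter.n // reassigned on every main() call
}
```

Calling `Job.main(Array.empty)` twice leaves `Job.runId` equal to 2: the second call re-ran the deferred initializer and overwrote the val, which is exactly how one job run can clobber state belonging to another.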

Solution

To work around this bug, define the main() method explicitly instead of extending App. As an example, if you have code similar to this:

  object MainTest extends App {
    ...
  }

You can replace it with code that does not extend App:

  object MainTest {
    def main(args: Array[String]): Unit = {
      ...
    }
  }
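To see why this fixes the problem, here is a minimal sketch (the `InitCount` and `FixedJob` names are illustrative, not from the original job): without App, the object body, including every val initializer, runs exactly once, when the object is first accessed, so repeated calls to main leave the vals untouched.

```scala
// Tracks how many times FixedJob's body has executed (illustrative helper).
object InitCount { var n = 0 }

// Without App, the object body runs once, at first access,
// so runId keeps its value no matter how often main() is called.
object FixedJob {
  InitCount.n += 1
  val runId = InitCount.n

  def main(args: Array[String]): Unit = {
    // job logic would go here
  }
}
```

Calling `FixedJob.main(Array.empty)` any number of times leaves `FixedJob.runId` equal to 1, so concurrent runs no longer overwrite each other's state.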