Unable to pass a param string value of more than 65,535 characters in a workflow using a JAR in a job

Pass the param using a text file in Workspace FileSystem (WSFS).

Written by shubham.bhusate

Last published at: October 14th, 2024

Problem 

When you create a workflow that runs a JAR in a job, you cannot pass a param string value of more than 65,535 characters. You receive the following error message: Error while emitting $iw UTF8 string too large

Cause

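The Scala compiler embedded in the Scala REPL accepts job parameters of up to 65,535 characters. The limit comes from the JVM class file format, which stores the length of each string constant in a 16-bit field, so an encoded constant cannot exceed 65,535 bytes. Because the limit applies to encoded bytes, a value containing non-ASCII characters can hit it with fewer than 65,535 characters. When a parameter value exceeds this limit, the compiler throws an error and the job fails.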
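
The limit can be illustrated outside Databricks. The following is a minimal sketch, not the Scala compiler's own check: it assumes that DataOutputStream.writeUTF, which also writes modified UTF-8 with a two-byte length prefix, is a reasonable stand-in for the class file constraint. The class name Utf8LimitCheck and the method fitsInUtf8Constant are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.UTFDataFormatException;
import java.util.Arrays;

public class Utf8LimitCheck {

    // Hypothetical helper: returns true if the value encodes to at most
    // 65,535 bytes of modified UTF-8, false otherwise.
    static boolean fitsInUtf8Constant(String value) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            // writeUTF throws UTFDataFormatException when the encoded
            // length is larger than 65,535 bytes.
            out.writeUTF(value);
            return true;
        } catch (UTFDataFormatException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a string made of n copies of the character c.
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(fitsInUtf8Constant(repeat('a', 65_535))); // true
        System.out.println(fitsInUtf8Constant(repeat('a', 65_536))); // false
        // 'é' needs two bytes, so 40,000 characters already exceed the limit.
        System.out.println(fitsInUtf8Constant(repeat('é', 40_000))); // false
    }
}
```

A check like this could be used to decide whether a value is safe to pass directly or should be moved to a file.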

Solution

Pass the param using a .txt file in Workspace FileSystem (WSFS).  

1. Create a file in WSFS. 

  • Navigate to your desired location in the Databricks workspace.
  • Create a new file, such as params.txt.
  • Add parameters to the file, such as param1 = value1.
  • Save the file.

2. Get the WSFS file path. 

  • Right-click the file and select Copy path.
  • The path will look like /Workspace/Users/your-username/params.txt.

3. Modify your JAR to read parameters from a file. 

  • Update your main class to accept a file path as an argument.
  • Implement logic to read and parse the file contents.

Example 

package com.example.demo;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

// Reads the file whose path is passed as the first argument
// and prints its contents line by line.
public class DemoApplicationFileReader {
   public static void main(String[] args) {
       // The job passes the WSFS file path as the first argument.
       if (args.length == 0) {
           System.out.println("Please provide a file path as an argument.");
           return;
       }
       String filePath = args[0];
       File file = new File(filePath);
       // Validate the path before trying to read from it.
       if (!file.exists()) {
           System.out.println("Error: The file at '" + filePath + "' does not exist.");
           return;
       }
       if (!file.isFile()) {
           System.out.println("Error: '" + filePath + "' is not a regular file.");
           return;
       }
       // try-with-resources closes the reader even if reading fails.
       try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
           String line;
           System.out.println("Contents of file: " + file.getAbsolutePath());
           System.out.println("-----------------------------");
           // Replace this loop with your own parsing logic as needed.
           while ((line = reader.readLine()) != null) {
               System.out.println(line);
           }
       } catch (IOException e) {
           System.out.println("An error occurred while reading the file:");
           e.printStackTrace();
       }
   }
}
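
The example above only prints the file. To parse lines such as param1 = value1 into name and value pairs, one option is java.util.Properties, which accepts that format. The following is a minimal sketch under that assumption; the class name ParamFileParser and the method loadParams are hypothetical, and the file is assumed to be UTF-8 encoded.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

public class ParamFileParser {

    // Hypothetical helper: loads "name = value" lines from the given file.
    static Properties loadParams(Path path) throws IOException {
        Properties params = new Properties();
        try (Reader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            // Whitespace around '=' is ignored; lines starting with '#' or '!'
            // are treated as comments.
            params.load(reader);
        }
        return params;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("Please provide a file path as an argument.");
            return;
        }
        Properties params = loadParams(Paths.get(args[0]));
        // Print only the length of each value, since values may be very long.
        for (String name : params.stringPropertyNames()) {
            int length = params.getProperty(name).length();
            System.out.println(name + ": " + length + " characters");
        }
    }
}
```

Note that Properties treats a backslash as an escape character and a trailing backslash as a line continuation, so values containing backslashes need escaping or a different parser.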

 

4. Create a Databricks job. 

  • Go to Workflows and create a new job.
  • Set the task type to JAR.
  • Specify the DBFS path to your JAR.
  • In the Parameters field, enter the WSFS path to your parameter file.

5. Save the job configuration and run it.