Problem
While creating a workflow that uses a JAR in your job, you notice you are unable to pass a parameter string value of more than 65,535 characters. You receive an error message: Error while emitting $iw UTF8 string too large.
Cause
The Scala compiler embedded in the Scala REPL is designed to accept job parameters of up to 65,535 characters. When the length of the job parameters exceeds this limit, the compiler throws an error, causing the job to fail.
Solution
Pass the parameters using a .txt file in Workspace FileSystem (WSFS).
1. Create a file in WSFS.
- Navigate to your desired location in the Databricks workspace.
- Create a new file, such as params.txt.
- Add parameters to the file, such as param1 = value1.
- Save the file.
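For example, a params.txt file holding several parameters might look like the following (the names and values are illustrative, and a value can safely exceed the 65,535-character limit because the file contents never pass through the Scala compiler):

```
param1 = value1
param2 = value2
```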
2. Get the WSFS file path.
- Right-click the file and select Copy path.
- The path will look like /Workspace/Users/your-username/params.txt.
3. Modify your JAR to read parameters from a file.
- Update your main class to accept a file path as an argument.
- Implement logic to read and parse the file contents.
Example
package com.example.demo;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public class DemoApplicationFileReader {
    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("Please provide a file path as an argument.");
            return;
        }
        String filePath = args[0];
        File file = new File(filePath);
        if (!file.exists()) {
            System.out.println("Error: The file at '" + filePath + "' does not exist.");
            return;
        }
        if (!file.isFile()) {
            System.out.println("Error: '" + filePath + "' is not a regular file.");
            return;
        }
        try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
            String line;
            System.out.println("Contents of file: " + file.getAbsolutePath());
            System.out.println("-----------------------------");
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        } catch (IOException e) {
            System.out.println("An error occurred while reading the file:");
            e.printStackTrace();
        }
    }
}
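The example above only prints the file contents. In practice, you likely want to parse the key = value lines into a map your application can use. The following is a minimal sketch of such a parser; the ParamParser class name is hypothetical and not part of the original example:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that parses "key = value" lines, such as those in
// params.txt, into a map. Blank lines and lines starting with '#' are skipped.
public class ParamParser {
    public static Map<String, String> parse(List<String> lines) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            int eq = trimmed.indexOf('=');
            if (eq > 0) {
                // Split on the first '=' so values may themselves contain '='.
                String key = trimmed.substring(0, eq).trim();
                String value = trimmed.substring(eq + 1).trim();
                params.put(key, value);
            }
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> params =
                parse(List.of("param1 = value1", "param2 = value2"));
        System.out.println(params);
    }
}
```

You could call ParamParser.parse after reading the file line by line, then look up each parameter by name instead of working with raw strings.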
4. Create a Databricks job.
- Go to Workflows and create a new job.
- Set the job type to JAR.
- Specify the DBFS path to your JAR.
- In the Parameters field, enter the WSFS path to your parameter file.
5. Save the job configuration and run it.
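If you configure the job through the Jobs API rather than the UI, the equivalent JAR task fragment might look like the following sketch (the main class and file path are taken from the example above and should be replaced with your own values):

```
{
  "spark_jar_task": {
    "main_class_name": "com.example.demo.DemoApplicationFileReader",
    "parameters": ["/Workspace/Users/your-username/params.txt"]
  }
}
```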