Streaming job using Auto Loader failing with error “org.rocksdb.RocksDBException: Too many open files”

Make changes to Apache Spark configurations applicable to Auto Loader, state store, or both.

Written by jayant.sharma

Last published at: May 19th, 2025

Problem

Your streaming job using Auto Loader fails with one of the following errors. 


Error type 1

org.rocksdb.RocksDBException: While open a file for random read: /local_disk0/tmp/spark-<uuid>.sst: Too many open files


Error type 2

org.rocksdb.RocksDBException: While opendir: /local_disk0/tmp/spark-<uuid>: Too many open files


Error type 3

org.rocksdb.RocksDBException: While open a file for appending: /local_disk0/tmp/spark-<uuid>/MANIFEST-xxxx: Too many open files


Cause

The operating system has a file descriptor limit. When multiple Auto Loader streams are concurrently active on a single cluster, they put cumulative pressure on that limit.


Context

Structured Streaming uses a RocksDB engine in features that manage stream progress, such as Auto Loader and the state store for stateful streaming. RocksDB persists versioning state by writing key-value pairs into immutable SST files stored on disk, typically at /local_disk0/tmp/spark-<uuid>.

As streaming data flows in, these SST files are frequently created, compacted, and rewritten. Under heavy streaming workloads, especially when multiple streams are active, the volume of SST files grows rapidly.

This growth can exhaust the operating system's limit on simultaneously open file descriptors (ulimit), particularly when the count of unique open SST files across all RocksDB instances on the node approaches or exceeds the default hard limit of one million files for the root user.
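
If you want to verify the limits on a cluster, the Python standard library can report the process's file descriptor limits. This is a diagnostic sketch only; it reports the driver process it runs in, not every executor on the node.

import resource

# (soft, hard) limits on open file descriptors for this process.
# Open SST files across all RocksDB instances on a node count
# against the operating system's limits.
soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft_limit}, hard limit: {hard_limit}")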


Solution

Make changes to Apache Spark configurations applicable to Auto Loader, the state store, or both. If necessary, also consider splitting a large number of streams across multiple job clusters, which reduces the number of files each cluster must keep open across its streams.


Auto Loader configuration change

Use the readStream option cloudFiles.maxFileAge to decrease the retention period of discovered files. Decreasing the retention period lowers the SST file count, reducing the RocksDB storage footprint per Auto Loader stream.
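
As a minimal sketch, the option is set directly on the readStream call. The paths, file format, and table name below are placeholders; choose a maxFileAge value that comfortably exceeds your longest expected file-arrival delay, since files older than the retention period are no longer tracked.

df = (
    spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")  # placeholder format
    .option("cloudFiles.schemaLocation", "/mnt/checkpoints/events/schema")  # placeholder
    # Expire entries for files older than 14 days from the stream's
    # RocksDB state, lowering its SST file count.
    .option("cloudFiles.maxFileAge", "14 days")
    .load("/mnt/source/events")  # placeholder source path
)

(
    df.writeStream
    .option("checkpointLocation", "/mnt/checkpoints/events")  # placeholder
    .toTable("events_bronze")  # placeholder target table
)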


For more information, refer to the “Event retention” section of the Configure Auto Loader for production workloads (AWS | Azure | GCP) documentation.


State store configuration changes 

In a notebook, or in your cluster settings under Advanced options > Spark in the Spark config field: 

  • Set spark.sql.streaming.stateStore.rocksdb.compactOnCommit to true to perform a range compaction of each RocksDB instance during the commit operation.
  • Set spark.sql.streaming.stateStore.rocksdb.maxOpenFiles to 10000 or 20000. This configuration controls the maximum number of files RocksDB is allowed to keep open concurrently. Both settings are shown in the sketch after this list.
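
From a notebook, the two settings look like the following sketch. The same key-value pairs can be pasted into the cluster's Spark config field instead, one per line, with the key and value separated by a space.

# Perform a range compaction of each RocksDB instance at commit time.
spark.conf.set("spark.sql.streaming.stateStore.rocksdb.compactOnCommit", "true")

# Cap how many files each RocksDB instance keeps open concurrently.
spark.conf.set("spark.sql.streaming.stateStore.rocksdb.maxOpenFiles", "10000")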


Auto Loader and state store configuration change

In a notebook, or in your cluster settings under Advanced options > Spark in the Spark config field, lower the configuration spark.databricks.rocksDB.fileManager.compactSmallFilesThreshold to 8. (The default is 16.)


This configuration controls how many files smaller than 1 MB can accumulate in any level of a RocksDB-based checkpoint before a compaction is triggered. Reducing it triggers compaction more frequently, which reduces the number of SST files.
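
As a sketch, the notebook form of the change is a single call; the commented line shows the equivalent entry for the cluster's Spark config field.

# Trigger compaction once any level of a RocksDB-based checkpoint
# holds 8 files smaller than 1 MB, instead of the default 16.
spark.conf.set("spark.databricks.rocksDB.fileManager.compactSmallFilesThreshold", "8")

# Equivalent line for the cluster's Spark config field:
# spark.databricks.rocksDB.fileManager.compactSmallFilesThreshold 8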