Problem
Your streaming job using Auto Loader fails with one of the following errors.
Error type 1
org.rocksdb.RocksDBException: While open a file for random read: /local_disk0/tmp/spark-<uuid>.sst: Too many open files
Error type 2
org.rocksdb.RocksDBException: While opendir: /local_disk0/tmp/spark-<uuid>: Too many open files
Error type 3
org.rocksdb.RocksDBException: While open a file for appending: /local_disk0/tmp/spark-<uuid>/MANIFEST-xxxx: Too many open files
Cause
The operating system has a file descriptor limit. When multiple Auto Loader streams are concurrently active on a single cluster, they put cumulative pressure on that limit.
Context
Structured Streaming uses a RocksDB engine to manage stream progress in features such as Auto Loader and the state store for stateful streaming. RocksDB persists versioning state by writing key-value pairs into immutable SST files stored on disk, typically at /local_disk0/tmp/spark-<uuid>.
As streaming data flows in, these SST files are frequently created, compacted, and rewritten. Under heavy streaming workloads, especially when multiple streams are active, the volume of SST files rapidly grows.
This growth can exhaust the operating system's limit on simultaneously open file descriptors (ulimit), particularly when the count of unique open SST files across all RocksDB instances on the node approaches or exceeds the default hard limit of 1 million files for the root user.
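To confirm the limits in effect, one quick check from a notebook is Python's resource module, which reports the soft and hard file descriptor limits of the driver process. This is only a sketch; it does not show per-executor limits.

```python
import resource

# Soft and hard limits on open file descriptors for the current (driver) process
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open file descriptor limits: soft={soft}, hard={hard}")
```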
Solution
Make changes to Apache Spark configurations applicable to Auto Loader, the state store, or both. If necessary, also consider splitting a large number of streams across multiple job clusters, which reduces the number of files any single cluster must keep open.
Auto Loader configuration change
Use the readStream option cloudFiles.maxFileAge to decrease the retention period of discovered files. Decreasing the retention period lowers the SST file count, reducing the RocksDB storage footprint per Auto Loader stream.
For more information, refer to the “Event retention” section of the Configure Auto Loader for production workloads (AWS | Azure | GCP) documentation.
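For example, an Auto Loader stream with a reduced retention period could look like the following. This is a minimal sketch: the source path, file format, and 7-day value are illustrative placeholders, not recommendations for your workload.

```python
# Sketch: Auto Loader stream with cloudFiles.maxFileAge lowered to 7 days.
# The path, format, and duration below are placeholders for illustration.
df = (
    spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")        # format of the incoming files (example)
    .option("cloudFiles.maxFileAge", "7 days")  # expire discovered-file state after 7 days
    .load("<input-path>")                       # replace with your source directory
)
```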
State store configuration changes
In a notebook, or in your cluster settings under Advanced options > Spark in the Spark config field:
- Set spark.sql.streaming.stateStore.rocksdb.compactOnCommit to true to perform a range compaction of the RocksDB instance on each commit operation.
- Set spark.sql.streaming.stateStore.rocksdb.maxOpenFiles to 10000 or 20000. This configuration controls the maximum number of open files allowed by RocksDB.
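For example, in a notebook these two settings can be applied with spark.conf.set before the stream starts. This is a minimal sketch; the same key-value pairs can go in the cluster's Spark config field instead, and the values shown follow the recommendations above.

```python
# Sketch: apply the recommended state store settings in a notebook.
# 10000 is used here for maxOpenFiles; 20000 is also an option per the guidance above.
spark.conf.set("spark.sql.streaming.stateStore.rocksdb.compactOnCommit", "true")
spark.conf.set("spark.sql.streaming.stateStore.rocksdb.maxOpenFiles", "10000")
```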
Auto Loader and state store configuration change
In a notebook, or in your cluster settings under Advanced options > Spark in the Spark config field, lower the configuration spark.databricks.rocksDB.fileManager.compactSmallFilesThreshold to 8. (The default is 16.)
This configuration controls how many files smaller than 1 MB can exist in any level of a RocksDB-based checkpoint before a compaction is triggered. Reducing it triggers compaction more frequently and reduces the number of SST files.
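As a sketch, the same change expressed in a notebook looks like the following; it can equally be placed in the cluster's Spark config field.

```python
# Sketch: lower the small-file compaction threshold from its default of 16 to 8.
spark.conf.set("spark.databricks.rocksDB.fileManager.compactSmallFilesThreshold", "8")
```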