Streaming job reading from Kinesis using EFO mode fails with ResourceInUseException error

Set maxFetchDuration to 5 seconds.

Written by anudeep.konaboina

Last published at: April 16th, 2025

Problem

When you try to stream a job that reads from Kinesis using Enhanced Fan-Out (EFO) mode, the job fails with an error message.

shadedkinesis.software.amazon.awssdk.services.kinesis.model.ResourceInUseException: Another active subscription exists for this consumer: <consumer>:common-jobs:shardId-xxx:databricks_xxxxxx (Service: Kinesis, Status Code: 400, Request ID: <request-id>, Extended Request ID:<extended-request-id>

 

Cause

You have maxFetchDuration set to < 5s, which is too short. 

The Databricks Kinesis connector works in two phases:

  1. In the prefetch phase, an Apache Spark job fetches data from Kinesis and stores it in executor block managers, bounded by maxFetchDuration or record count.
  2. The micro-batch processing phase uses prefetched data or re-fetches if blocks are missing.

 

In polling mode, the default maxFetchDuration is 10s, and it can be lowered to allow frequent prefetches without major impact (aside from rate limits).

When using EFO mode, AWS retains a shard + consumer subscription for at least 5s even after cancellation. If maxFetchDuration < 5s, the next prefetch job will be blocked trying to take over the shard subscription, leading to retries, increased task time, and higher stream latency.

For more information, refer to the AWS SubscribeToShard documentation.

 

Solution

Set maxFetchDuration to 5s while using EFO mode to fix this issue with best-case latency.