Problem
When you try to stream a job that reads from Kinesis using Enhanced Fan-Out (EFO) mode, the job fails with an error message.
shadedkinesis.software.amazon.awssdk.services.kinesis.model.ResourceInUseException: Another active subscription exists for this consumer: <consumer>:common-jobs:shardId-xxx:databricks_xxxxxx (Service: Kinesis, Status Code: 400, Request ID: <request-id>, Extended Request ID:<extended-request-id>
Cause
You have maxFetchDuration
set to < 5s
, which is too short.
The Databricks Kinesis connector works in two phases:
- In the prefetch phase, an Apache Spark job fetches data from Kinesis and stores it in executor block managers, bounded by
maxFetchDuration
or record count. - The micro-batch processing phase uses prefetched data or re-fetches if blocks are missing.
In polling mode, the default maxFetchDuration
is 10s
, and it can be lowered to allow frequent prefetches without major impact (aside from rate limits).
When using EFO mode, AWS retains a shard + consumer subscription for at least 5s
even after cancellation. If maxFetchDuration
< 5s
, the next prefetch job will be blocked trying to take over the shard subscription, leading to retries, increased task time, and higher stream latency.
For more information, refer to the AWS SubscribeToShard documentation.
Solution
Set maxFetchDuration
to 5s
while using EFO mode to fix this issue with best-case latency.