Problem
When integrating Kafka with Apache Spark Structured Streaming in Databricks, you encounter a timeout error, such as the following example.
ERROR KafkaOffsetReaderAdmin: Error in attempt 1 getting Kafka offsets:
java.util.concurrent.ExecutionException: kafkashaded.org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: describeTopics
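For context, a query of roughly the following shape is where this error tends to surface. This is a minimal sketch, not taken from the original report: the broker address `broker-1:9092`, the topic `events`, and the helper names `kafka_source_options` and `read_kafka_stream` are placeholders, and `spark` is assumed to be the session Databricks provides in a notebook.

```python
def kafka_source_options(bootstrap_servers: str, topic: str) -> dict:
    """Build options for the Structured Streaming Kafka source.

    Hypothetical helper for illustration; the keys are standard options
    read by the spark-sql-kafka-0-10 connector.
    """
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": "latest",
    }


def read_kafka_stream(spark, bootstrap_servers: str, topic: str):
    """Return a streaming DataFrame backed by the Kafka source."""
    options = kafka_source_options(bootstrap_servers, topic)
    return spark.readStream.format("kafka").options(**options).load()


# Usage in a notebook (placeholder broker and topic):
# df = read_kafka_stream(spark, "broker-1:9092", "events")
```

The connector has to contact the brokers to resolve topic metadata and offsets for such a query, which is the step the `describeTopics` timeout refers to.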
Cause
You’re using Spark's Kafka connector (spark-sql-kafka-0-10) with Kafka brokers older than version 0.10.0.
The spark-sql-kafka-0-10 connector used in Structured Streaming requires Kafka brokers 0.10.0 or above. It initiates an ApiVersionRequest handshake to determine broker capabilities.
Broker versions below 0.10.0 (such as 0.8.2) do not support ApiVersionRequest. The broker either ignores the request or closes the connection, so the client times out during metadata operations such as describeTopics.
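If you know the version your brokers run, the 0.10.0 threshold described above can be checked as in the following sketch. The function names are hypothetical, and the code assumes a plain dotted numeric version string such as `0.8.2`; it does not query the broker.

```python
# Minimum broker version that supports the version handshake, per this article.
MIN_BROKER_VERSION = (0, 10, 0)


def parse_version(version: str) -> tuple:
    """Turn a dotted version such as '0.8.2' into a tuple of ints (0, 8, 2)."""
    parts = [int(p) for p in version.strip().split(".")[:3]]
    # Pad short versions such as '0.10' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def broker_supports_api_versions(version: str) -> bool:
    """Return True if the broker version is 0.10.0 or above."""
    return parse_version(version) >= MIN_BROKER_VERSION
```

Comparing integer tuples rather than raw strings matters here, because a string comparison would rank `0.8.2` above `0.10.0`.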
Solution
Databricks recommends upgrading your Kafka brokers to 0.10.0 or above to enable ApiVersionRequest support. This aligns with Spark's protocol requirements.
If you are unable to upgrade, you can use the legacy Spark connector. Replace spark-sql-kafka-0-10 with the older spark-streaming-kafka-0-8 connector.