Problem
You’re attempting to write data from a Delta table to an Event Hub in a streaming job. The job fails without sending any data to the Event Hub, and your event log shows multiple timeout messages.
Caused by: com.microsoft.azure.eventhubs.TimeoutException: Entity(XXXXXX): Send operation timed out
Cause
The maxBytesPerTrigger option, which controls the batch size in bytes, is not set by default. Without this option, Apache Spark falls back to the default maxFilesPerTrigger, which is set to 1000 files.
Solution
Set maxBytesPerTrigger according to your Event Hub tier’s rate limit. For details on limits, review the Azure Event Hubs quotas and limits documentation.
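As a minimal sketch, you can derive a byte budget from your tier’s ingress allowance and your trigger interval. This assumes a Standard-tier Event Hub, where each throughput unit allows 1 MB/s of ingress; the helper name, the headroom factor, and the Delta table path are illustrative, not part of any API:

```python
def max_bytes_per_trigger(throughput_units: int,
                          trigger_seconds: int,
                          headroom: float = 0.8) -> int:
    """Estimate a safe maxBytesPerTrigger value.

    Assumes a Standard-tier Event Hub: 1 MB/s ingress per
    throughput unit. The headroom factor leaves capacity for
    other producers sharing the same Event Hub.
    """
    bytes_per_second = throughput_units * 1_000_000
    return int(bytes_per_second * trigger_seconds * headroom)

# Example: 2 throughput units, 60-second trigger interval
limit = max_bytes_per_trigger(2, 60)  # 96_000_000 bytes

# Hypothetical use in the streaming read from the Delta table
# (path and stream wiring are placeholders):
# df = (spark.readStream
#            .format("delta")
#            .option("maxBytesPerTrigger", limit)
#            .load("/path/to/delta-table"))
```

Because maxBytesPerTrigger is a soft limit, Spark may slightly exceed it to finish the current file, so the headroom factor is a reasonable precaution rather than a guarantee.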
Note
If you continue to use maxFilesPerTrigger along with maxBytesPerTrigger, your job respects whichever limit is reached first.
For more information, review the Configure Structured Streaming batch size on Databricks (AWS | Azure | GCP) documentation.