Streaming job not sending data to Event Hub and failing with TimeoutException

Set the maxBytesPerTrigger according to the Event Hub rate limit.

Written by Raphael Freixo

Last published at: December 12th, 2024

Problem

You’re attempting to write data from a Delta table to an Event Hub in a streaming job. The job fails without sending any data to the Event Hub and your event log shows multiple timeout messages.

 

Caused by: com.microsoft.azure.eventhubs.TimeoutException: Entity(XXXXXX): Send operation timed out 

 


Cause

The maxBytesPerTrigger option, which limits how much data is processed in each micro-batch, is not set by default. Without it, Apache Spark falls back to the default maxFilesPerTrigger, which is set to 1000 files per micro-batch. Batches of that size can exceed the rate your Event Hub tier can accept, so the send operation times out before the data is delivered.

 

Solution

Set maxBytesPerTrigger according to your Event Hub tier rate limit. For details on limits, review the Azure Event Hubs quotas and limits documentation.  
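
The following is a minimal sketch, assuming the azure-event-hubs-spark connector is installed on the cluster and that spark is the notebook's SparkSession. The table path, checkpoint location, connection string, and the 10m batch size are placeholder values; tune maxBytesPerTrigger to your Event Hub tier's ingress limit.

from pyspark.sql.functions import to_json, struct

# Placeholder values -- replace with your own.
source_table_path = "/mnt/delta/source_table"
checkpoint_path = "/mnt/checkpoints/eventhub_sink"
connection_string = "Endpoint=sb://<namespace>.servicebus.windows.net/;EntityPath=<event-hub>;..."

# Cap each micro-batch by size instead of relying on the default
# maxFilesPerTrigger of 1000 files.
df = (
    spark.readStream
    .format("delta")
    .option("maxBytesPerTrigger", "10m")  # tune to your Event Hub tier
    .load(source_table_path)
)

# The Event Hubs connector expects the payload in a column named "body".
payload = df.select(to_json(struct("*")).alias("body"))

# Encrypt the connection string as required by the connector.
ehConf = {
    "eventhubs.connectionString": spark.sparkContext._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(
        connection_string
    )
}

query = (
    payload.writeStream
    .format("eventhubs")
    .options(**ehConf)
    .option("checkpointLocation", checkpoint_path)
    .start()
)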

 

Note

If you continue using maxFilesPerTrigger along with maxBytesPerTrigger, your job respects whichever limit is reached first.
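
A brief illustration with hypothetical values, reusing source_table_path from the sketch above:

df = (
    spark.readStream
    .format("delta")
    .option("maxFilesPerTrigger", 500)   # stop after 500 files...
    .option("maxBytesPerTrigger", "5m")  # ...or roughly 5 MB, whichever comes first
    .load(source_table_path)
)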

 


For more information, review the Configure Structured Streaming batch size on Databricks (AWS | Azure | GCP) documentation.