SSL error when invoking Databricks model serving endpoint

Reduce or split the payload size.

Written by ismael.khalique

Last published at: May 23rd, 2025

Problem

When querying a Databricks model serving endpoint, you encounter an error indicating that the TLS connection was unexpectedly terminated during the HTTPS request.

SSL Error: HTTPSConnectionPool(host='<workspace-name>.cloud.databricks.com', port=443): Max retries exceeded with url: /serving-endpoints/gpt2-endpoint/invocations (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:2406)')))


Cause

The request payload exceeds the maximum size limits enforced by the Databricks model serving infrastructure.  


Currently, the payload size limit is 16 MB per request for custom models. For endpoints serving foundation models, external models, or AI agents, the limit is 4 MB per request. 


Payloads exceeding these thresholds may result in backend rejection or abrupt TLS termination, leading to SSL-related errors.


Solution

Reduce input payload size

  1. Ensure your serialized JSON input (inputs, params, and so on) is below 16 MB for custom model endpoints, or below 4 MB for foundation model, external model, and AI agent endpoints.
  2. Split large inputs, such as large documents or long token sequences, into smaller parts if needed, as shown in the sketch after this list.
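
If a single input, such as a long document, does not fit within the limit, you can split it and send one request per piece. The following is a minimal sketch that splits text on paragraph boundaries; large_document and score_chunk are hypothetical placeholders for your own input and your own endpoint invocation helper, and the 15 MB budget is simply an arbitrary margin below the 16 MB limit.

MAX_BYTES = 15_000_000  # stay comfortably below the 16 MB custom-model limit

def split_text(text, max_bytes=MAX_BYTES):
    """Split text on paragraph boundaries so each piece stays under max_bytes.

    Assumes no single paragraph exceeds max_bytes on its own.
    """
    chunks, current, size = [], [], 0
    for paragraph in text.split("\n"):
        piece = paragraph + "\n"
        piece_size = len(piece.encode("utf-8"))
        if current and size + piece_size > max_bytes:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(piece)
        size += piece_size
    if current:
        chunks.append("".join(current))
    return chunks

# Send one request per chunk instead of a single oversized request.
# score_chunk is a placeholder for your own call to the serving endpoint.
results = [score_chunk(chunk) for chunk in split_text(large_document)]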


Check payload size before sending

To avoid request failures, consider adding a pre-check in your client code to verify that the payload does not exceed the limit. You can raise a ValueError if the payload size surpasses the allowed threshold, as shown in the following example.

import json

# large_payload is a placeholder for your model input; replace it with your own data.
data = {
    "inputs": [large_payload],
    "params": {"max_new_tokens": 10, "temperature": 1}
}

# Serialize the request body and check its size before calling the endpoint.
encoded_payload = json.dumps(data).encode("utf-8")
if len(encoded_payload) > 16_000_000:
    raise ValueError("Payload exceeds 16MB limit. Reduce input size.")
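
After the payload passes the size check, send the request as usual. The following is a minimal sketch using the requests library; the workspace URL and endpoint name are taken from the example error above, and the DATABRICKS_TOKEN environment variable is assumed to hold a valid access token.

import os
import requests

# Endpoint URL built from your workspace host and serving endpoint name.
url = "https://<workspace-name>.cloud.databricks.com/serving-endpoints/gpt2-endpoint/invocations"
headers = {
    "Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}",
    "Content-Type": "application/json",
}

# Send the size-checked, serialized payload from the previous step.
response = requests.post(url, headers=headers, data=encoded_payload)
response.raise_for_status()
print(response.json())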


For additional information, review the Model Serving limits and regions (AWS | Azure | GCP) documentation.