Problem
When serving a model trained using Databricks AutoML, the model loads and runs inference correctly in a notebook, but deployment to a model serving endpoint fails with the following error.
Failed to deploy modelName: served entity creation aborted because the endpoint update timed out. Please see service logs for more information.
When you check the service logs, you see an additional error.
ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected XX from C header, got <less-than-XX> from PyObject
Cause
The model environment includes a dependency, such as a pandas version below 2.2.2, that is not binary-compatible with NumPy 2.0.0 or above.
However, MLflow does not automatically log a matching upper bound (such as NumPy below 2.0.0), so the logged environment allows a pandas version below 2.2.2 to coexist with NumPy 2.0.0 or above. The notebook keeps working because it uses the already-installed, compatible NumPy, but the serving container rebuilds the environment from the logged dependency files and installs the latest NumPy that satisfies them.
This version mismatch causes binary incompatibility during model serving, resulting in the observed error.
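To confirm this is the cause, inspect the model's logged dependencies. The following is a minimal sketch (the run ID is a placeholder) that downloads the model's requirements.txt and checks for a NumPy pin; if none is present, the serving container is free to install NumPy 2.0.0 or above.

import mlflow

run_id = "<your-run-id>"  # Placeholder: the run that logged the model

# Download the logged pip requirements and look for an explicit NumPy pin
req_path = mlflow.artifacts.download_artifacts(f"runs:/{run_id}/model/requirements.txt")
with open(req_path) as f:
    requirements = f.read().splitlines()

print("\n".join(requirements))
if not any(line.startswith("numpy==") for line in requirements):
    print("No explicit NumPy pin found; serving may resolve an incompatible NumPy.")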
Solution
Ensure the model environment includes an explicit version pin for NumPy, such as numpy==<version>, where <version> is any version compatible with the pandas version listed in the dependency files. Manually add this version constraint to the model’s conda.yaml and requirements.txt files, then re-upload the updated files as artifacts to the same MLflow run.
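For example, if the model was logged with pandas 2.0.3, the updated requirements.txt should contain an explicit, mutually compatible pair of pins (versions here are illustrative):

pandas==2.0.3
numpy==1.26.4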
1. Identify the run ID for the model you want to serve.
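If you don't have the run ID at hand, you can copy it from the run page in the MLflow UI, or look it up programmatically. The sketch below assumes you know the name of the AutoML experiment; the experiment path is a placeholder.

import mlflow

# Placeholder experiment path; AutoML records its runs under a generated experiment
runs = mlflow.search_runs(experiment_names=["/Users/<you>/<automl-experiment>"])
print(runs[["run_id", "start_time"]].head())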
2. Use the following script to add a NumPy pin. The script:
- Downloads the model’s existing conda.yaml and requirements.txt.
- Checks whether a NumPy version is already pinned.
- If not, adds numpy==<current-version> based on the local environment.
- Uploads the updated files back as artifacts to the same run.
import shutil
import tempfile

import mlflow
import numpy as np  # Used to read the local NumPy version
import yaml
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Create a temporary directory to work with artifacts
tmp_dir = tempfile.mkdtemp()
run_id = "<your-run-id>"  # Replace with your run ID

try:
    # Download and parse the model's conda.yaml
    conda_artifact_path = f"runs:/{run_id}/model/conda.yaml"
    conda_file_path = mlflow.artifacts.download_artifacts(
        artifact_uri=conda_artifact_path, dst_path=tmp_dir
    )
    with open(conda_file_path, "r") as file:
        conda_config = yaml.safe_load(file)

    # Check whether numpy is pinned under the pip dependencies
    dependencies = conda_config.get("dependencies", [])
    pip_section = next(
        (dep for dep in dependencies if isinstance(dep, dict) and "pip" in dep), None
    )
    numpy_in_conda = False
    if pip_section:
        numpy_in_conda = any(pkg.startswith("numpy==") for pkg in pip_section["pip"])

    if not numpy_in_conda:
        numpy_version = np.__version__
        print(f"Adding numpy=={numpy_version} to conda.yaml")
        if pip_section is None:
            # If there's no pip section, create one
            pip_section = {"pip": []}
            conda_config["dependencies"].append(pip_section)
        pip_section["pip"].append(f"numpy=={numpy_version}")

        # Write the updated conda.yaml back to the file
        with open(conda_file_path, "w") as file:
            yaml.dump(conda_config, file)

        # Log the updated conda.yaml back to the same MLflow run
        client.log_artifact(run_id=run_id, local_path=conda_file_path, artifact_path="model")

    # Download and process requirements.txt
    req_artifact_path = f"runs:/{run_id}/model/requirements.txt"
    req_file_path = mlflow.artifacts.download_artifacts(
        artifact_uri=req_artifact_path, dst_path=tmp_dir
    )
    with open(req_file_path, "r") as file:
        requirements = [line.strip() for line in file.readlines()]

    numpy_in_requirements = any(pkg.startswith("numpy==") for pkg in requirements)
    if not numpy_in_requirements:
        numpy_version = np.__version__
        print(f"Adding numpy=={numpy_version} to requirements.txt")
        requirements.append(f"numpy=={numpy_version}")

        # Write the updated requirements.txt back to the file
        with open(req_file_path, "w") as file:
            file.write("\n".join(requirements))

        # Log the updated requirements.txt back to the same MLflow run
        client.log_artifact(run_id=run_id, local_path=req_file_path, artifact_path="model")
finally:
    # Clean up the temporary directory
    shutil.rmtree(tmp_dir)
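Optionally, before redeploying, confirm the pin was logged by re-downloading the updated requirements.txt:

import mlflow

run_id = "<your-run-id>"  # Same run ID as above
updated = mlflow.artifacts.download_artifacts(f"runs:/{run_id}/model/requirements.txt")
with open(updated) as f:
    print(f.read())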
3. After updating the artifacts, redeploy the endpoint to ensure consistent environments and prevent binary incompatibility errors.
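You can redeploy from the Serving UI by updating the endpoint, or programmatically. The following is a minimal sketch using the Databricks Python SDK; the endpoint name, registered model name, version, and workload settings are placeholders that depend on your endpoint configuration.

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import ServedEntityInput

w = WorkspaceClient()

# Placeholders: substitute your endpoint name and registered model details
w.serving_endpoints.update_config(
    name="<endpoint-name>",
    served_entities=[
        ServedEntityInput(
            entity_name="<registered-model-name>",
            entity_version="<model-version>",
            workload_size="Small",
            scale_to_zero_enabled=True,
        )
    ],
)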
For more information on supported formats for mlflow.artifacts.download_artifacts, refer to the MLflow mlflow.artifacts API documentation.