Loading a model with the MLflow API takes the same amount of time on every call and downloads artifacts from scratch

Save the model artifacts locally and then load the model from a local path.

Written by anshuman.sahu

Last published at: February 19th, 2025

Problem

When you repeatedly load a model using the MLflow API call mlflow.<flavor>.load_model(), every call takes about the same amount of time, and the model artifacts are downloaded from scratch each time.

 

Cause

By default, MLflow retrieves and downloads the model artifacts from their storage location each time mlflow.<flavor>.load_model() is called with a remote model URI. Downloads from previous calls are not reused.

 

Solution

First, use mlflow.artifacts.download_artifacts() to save the model artifacts locally. 

 

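import mlflow

# Replace the placeholders with your registered model name and version.
model_name, model_version = "<your-model-name>", "<version>"
model_uri = f"models:/{model_name}/{model_version}"
destination_path = "/local_disk0/model"
# Download the model artifacts to local disk once.
mlflow.artifacts.download_artifacts(artifact_uri=model_uri, dst_path=destination_path)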

 

Then, pass the local path, rather than the remote model URI, to mlflow.<flavor>.load_model(). Subsequent loads read from local storage and skip the download.

 

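# Point to the local copy instead of the remote model URI.
model_uri = "/local_disk0/model"
# Replace <flavor> with your model's flavor, for example sklearn or pyfunc.
model = mlflow.<flavor>.load_model(model_uri)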
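
If the same code runs more than once on a machine, the two steps above can be combined so that the download happens only when the local copy is missing. The following is a minimal sketch, not part of the MLflow API: ensure_local_model and its download_fn parameter are hypothetical names introduced for illustration. It assumes the artifacts land directly in the destination directory and uses the MLmodel file, which every MLflow model directory contains, as the marker that a local copy already exists.

```python
import os


def ensure_local_model(model_uri, dst_path, download_fn=None):
    """Download model artifacts to dst_path only if they are not already there.

    Hypothetical helper for illustration. download_fn defaults to
    mlflow.artifacts.download_artifacts and exists so the helper can be
    exercised without a model registry.
    """
    # An MLflow model directory contains an MLmodel file; treat it as the marker
    # that a previous download already completed (assumption: artifacts are
    # written directly into dst_path, not into a subdirectory).
    if os.path.isfile(os.path.join(dst_path, "MLmodel")):
        return dst_path

    if download_fn is None:
        import mlflow  # imported lazily so the check above works without MLflow

        download_fn = mlflow.artifacts.download_artifacts

    os.makedirs(dst_path, exist_ok=True)
    # download_artifacts returns the local path of the downloaded artifacts.
    return download_fn(artifact_uri=model_uri, dst_path=dst_path)
```

To use it, call ensure_local_model(model_uri, "/local_disk0/model") and pass the returned path to mlflow.<flavor>.load_model(). If the registered model version changes, delete the local directory or use a different destination path per version. Otherwise the stale copy is reused.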

 

For more information, refer to the MLflow API mlflow.artifacts documentation.