MlflowClient().search_runs returns only a subset of runs

Use mlflow.search_runs() instead of MlflowClient.search_runs().

Written by G Yashwanth Kiran

Last published at: April 26th, 2025

Problem

The MlflowClient.search_runs() function in MLflow searches for runs that match a given set of criteria within the specified experiments. You can use it to retrieve and order runs based on specific metrics, parameters, or tags. When you call MlflowClient.search_runs() in a notebook, it returns only a subset of the runs in the experiment instead of all of them.

For example, an experiment with more than 300 runs may only return 17 rows.

 

Example code

%python

import mlflow
from mlflow.tracking.client import MlflowClient

# Use the experiment notebook path to get the experiment ID
experiment_name = '<full-path-to-notebook>'
client = MlflowClient()
experiment_id = client.get_experiment_by_name(experiment_name).experiment_id
# A single call returns only one page of results
runs_count = len(client.search_runs(experiment_ids=[experiment_id]))

 

When you read the value of runs_count, it is less than the actual number of runs in the experiment.

 

The example experiment UI only shows 17 runs.

 

Cause 

The MlflowClient.search_runs() method is paginated. A single call returns at most one page of results (up to max_results runs) together with a token for the next page, so it may not return every run in the experiment.
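
The effect of pagination can be illustrated with a toy model. The sketch below is not MLflow code: the Page class, the search function, the run count of 320, and the page size of 100 are all hypothetical and exist only to show why the length of a single page undercounts the total.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Page:
    """One page of results plus a token pointing at the next page."""
    items: List[int]
    token: Optional[str]  # None when there are no further pages


def search(all_items: List[int], max_results: int,
           page_token: Optional[str] = None) -> Page:
    """Toy paginated search (hypothetical, not part of MLflow)."""
    start = int(page_token) if page_token else 0
    end = start + max_results
    next_token = str(end) if end < len(all_items) else None
    return Page(all_items[start:end], next_token)


all_runs = list(range(320))  # pretend the experiment has 320 runs
first_page = search(all_runs, max_results=100)

print(len(first_page.items))  # 100, not 320
print(first_page.token)       # "100", so more pages remain
```

Counting only first_page.items gives 100 even though 320 runs exist. The remaining runs are reachable only by passing the token back in on the next call.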

 

Solution

To ensure an accurate run count, you can use one of two approaches.

 

Use mlflow.search_runs()

Use mlflow.search_runs() instead of MlflowClient.search_runs().

For this approach, you must first use MlflowClient to get the experiment_id from the notebook path associated with the experiment.

After you have the experiment_id, pass it to mlflow.search_runs(). This function retrieves the pages of results for you and, by default, returns them as a single pandas DataFrame, so its length is the total number of runs, as expected.

 

Example code

Replace <full-path-to-notebook> with the full path to the notebook associated with the experiment before running this example code.

%python

import mlflow
from mlflow.tracking.client import MlflowClient

# Use the experiment notebook path to get the experiment ID
experiment_name = '<full-path-to-notebook>'
client = MlflowClient()
experiment_id = client.get_experiment_by_name(experiment_name).experiment_id

# mlflow.search_runs() pages through the results internally
runs = mlflow.search_runs(experiment_ids=[experiment_id])
runs_count = len(runs)

 

 

Include extra parameters when calling MlflowClient.search_runs()

To handle pagination correctly when using the search_runs() method of MlflowClient, you must iterate through all pages of results.

  1. Initialize an empty list runs to store the results and set page_token to None.
  2. Use a while loop to call search_runs() repeatedly until all pages are retrieved. In each iteration, call search_runs() with the following parameters:
    • experiment_ids - List of experiment IDs to search within.
    • max_results - Maximum number of results to return per page (e.g., 100).
    • page_token - Token for the next page of results (initially None).
       
  3. Extend the runs list with the results from the current page. If result.token is None, there are no more pages to retrieve and the loop can exit. Otherwise, set page_token to result.token to retrieve the next page in the next iteration.

Example code

Replace <full-path-to-notebook> with the full path to the notebook associated with the experiment and set the <max-results> value before running this example code.

%python

import mlflow
from mlflow.tracking.client import MlflowClient

# Use the experiment notebook path to get the experiment ID
experiment_name = '<full-path-to-notebook>'
client = MlflowClient()
experiment_id = client.get_experiment_by_name(experiment_name).experiment_id

page_token = None
runs = []
while True:
    result = client.search_runs(
        experiment_ids=[experiment_id],
        max_results=<max-results>,
        page_token=page_token
    )
    runs.extend(result)
    if not result.token:
        break
    page_token = result.token
print(len(runs))