Problem
When working in a notebook on a cluster configured to use a Docker container, you register your models using MLflow and notice they save to a local path (such as a workspace path).
However, you want the model to save to Machine Learning > Models, as shown in the following image.
Cause
On a cluster configured to use a Docker container, the MLflow tracking URI is set to the workspace location.
Solution
Databricks recommends managing your model lifecycle in Unity Catalog. For more information, refer to the Manage model lifecycle in Unity Catalog (AWS | Azure) documentation.
To set a model-tracking URI to Unity Catalog, you can use the following code.
Code for testing
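import os
import mlflow
# Replace the placeholders with your workspace URL and personal access token (PAT).
db_host = "<your-databricks-host-URL>"
db_token = "<PAT TOKEN>"
# Set the tracking URI to Unity Catalog.
mlflow.set_tracking_uri('databricks-uc')
# Expose the host and token to MLflow through environment variables.
os.environ["DATABRICKS_HOST"] = db_host
os.environ["DATABRICKS_TOKEN"] = db_token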
Code for production
In production, store the personal access token (PAT) in a secret and use dbutils.secrets.get() to retrieve it.
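import os
import mlflow
db_host = "<your-databricks-host-URL>"
# Read the PAT from a secret scope instead of hard-coding it (dbutils is available in Databricks notebooks).
db_token = dbutils.secrets.get("<your-scope>", "<your-key>")
# Set the tracking URI to Unity Catalog.
mlflow.set_tracking_uri('databricks-uc')
# Expose the host and token to MLflow through environment variables.
os.environ["DATABRICKS_HOST"] = db_host
os.environ["DATABRICKS_TOKEN"] = db_token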
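The configuration steps above can also be wrapped in a small helper that checks its inputs before exporting them. The following is a minimal sketch, not part of the original solution: the function name export_databricks_credentials and the https:// check are assumptions made for illustration. It uses only the standard library and leaves the mlflow.set_tracking_uri() call to the caller.

```python
import os
from typing import MutableMapping, Optional


def export_databricks_credentials(
    host: str,
    token: str,
    environ: Optional[MutableMapping[str, str]] = None,
) -> None:
    """Hypothetical helper: validate and export the host and token.

    Writes DATABRICKS_HOST and DATABRICKS_TOKEN, the same variables the
    code above sets, into `environ` (os.environ by default).
    """
    target = os.environ if environ is None else environ
    host = host.strip().rstrip("/")
    # Assumption: the workspace URL is given with an explicit https:// scheme.
    if not host.startswith("https://"):
        raise ValueError("host must start with https://")
    # Catch empty values and placeholders that were never filled in.
    if not token or token.startswith("<"):
        raise ValueError("token is empty or still a placeholder")
    target["DATABRICKS_HOST"] = host
    target["DATABRICKS_TOKEN"] = token
```

In a notebook, you might call export_databricks_credentials(db_host, db_token) and then call mlflow.set_tracking_uri('databricks-uc') as before. A placeholder that was never filled in then fails immediately with a ValueError rather than later during model registration.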
Alternatively, you can manage your model lifecycle using the Workspace Model Registry. For more information, refer to the Manage model lifecycle using the Workspace Model Registry (legacy) (AWS | Azure) documentation.
Use the same code to set a model-tracking URI to Databricks, but change the line mlflow.set_tracking_uri('databricks-uc') to mlflow.set_tracking_uri('databricks').
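Because the two options differ only in the URI string, the choice can be kept in one place. The following is a minimal sketch that assumes a hypothetical helper named tracking_uri_for. The two URI values are the ones given in this article.

```python
def tracking_uri_for(use_unity_catalog: bool = True) -> str:
    """Hypothetical helper: return the URI string for the chosen registry."""
    # 'databricks-uc' targets Unity Catalog; 'databricks' targets the
    # legacy Workspace Model Registry.
    return "databricks-uc" if use_unity_catalog else "databricks"
```

Pass the result to mlflow.set_tracking_uri(), for example mlflow.set_tracking_uri(tracking_uri_for(use_unity_catalog=False)) for the Workspace Model Registry.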