Problem
When you log a model with MLflow in a PySpark pipeline, the call fails with an AssertionError raised from MLflow's TempDir class.
An error occurred during model logging:
%s Traceback (most recent call last):
  File "/databricks/python/lib/python3.10/site-packages/retail_sales_data_product/training/transform.py", line 266, in log_model
    mlflow.xgboost.log_model(
  File "/databricks/python/lib/python3.10/site-packages/mlflow/xgboost/__init__.py", line 270, in log_model
    return Model.log(
  File "/databricks/python/lib/python3.10/site-packages/mlflow/models/model.py", line 620, in log
    with TempDir() as tmp:
  File "/databricks/python/lib/python3.10/site-packages/mlflow/utils/file_utils.py", line 426, in __exit__
    assert os.path.exists(os.getcwd())
AssertionError
Ending MLflow run
Cause
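When MLflow exits its TempDir context manager, it asserts that the current working directory still exists (assert os.path.exists(os.getcwd())). The assertion fails because the working directory has become invalid by the time the model is logged.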
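To confirm that this is the condition you are hitting, you can run the same check MLflow performs just before the log_model call. The following is a minimal sketch, not part of MLflow: the helper name working_directory_is_valid is hypothetical, and the only assumption is that it runs in the same Python process as the logging code.

```python
import os


def working_directory_is_valid() -> bool:
    """Return True if the process's current working directory still exists.

    Hypothetical helper that mirrors the check in the traceback:
    assert os.path.exists(os.getcwd())
    """
    try:
        return os.path.exists(os.getcwd())
    except OSError:
        # os.getcwd() itself can raise (for example FileNotFoundError)
        # when the directory has been removed underneath the process.
        return False


if __name__ == "__main__":
    if working_directory_is_valid():
        print("Working directory is valid:", os.getcwd())
    else:
        print("Working directory is invalid; MLflow's TempDir assertion would fail.")
```

If the helper returns False immediately before the model is logged, the failure matches the cause described above.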
Solution
Upgrade your MLflow version to 2.16.0 or higher.
Alternatively, upgrade to Databricks Runtime 13.3 LTS or above, which comes with a more recent pre-installed version of MLflow.
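To check whether the MLflow version installed in your environment already meets the 2.16.0 minimum, you can compare version strings as in the sketch below. The helper names (release_tuple, meets_minimum, installed_mlflow_version) are illustrative and not part of MLflow. The sketch assumes the package is installed under the distribution name mlflow; if your environment installs it under another name (for example mlflow-skinny), adjust the name accordingly. Pre-release suffixes such as rc1 are ignored in the comparison.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

MIN_MLFLOW = "2.16.0"  # minimum version named in the Solution section


def release_tuple(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric release segment, e.g. '2.16.0rc1' -> (2, 16, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str = MIN_MLFLOW) -> bool:
    """Return True if `installed` is at least `minimum` (release segments only)."""
    a, b = release_tuple(installed), release_tuple(minimum)
    # Pad the shorter tuple with zeros so that '3.0' compares like '3.0.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_mlflow_version(dist_name: str = "mlflow") -> Optional[str]:
    """Return the installed version of `dist_name`, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_mlflow_version()
    if version is None:
        print("MLflow is not installed under the distribution name 'mlflow'.")
    elif meets_minimum(version):
        print(f"MLflow {version} meets the {MIN_MLFLOW} minimum.")
    else:
        print(f"MLflow {version} is older than {MIN_MLFLOW}; upgrade it.")
```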
For details on the library versions pre-installed in each Databricks Runtime, refer to the Databricks Runtime release notes versions and compatibility (AWS | Azure | GCP) documentation.