Databricks Runtime is not able to read data in a format other than Delta

Delete the transaction log folder or move it to a different location.

Written by sidhant.sahu

Last published at: December 19th, 2024

Problem

When you attempt to read from a specified path using format("parquet"), format("cloudFiles"), or any other non-Delta format, you receive an error message.
 

Error Stack trace:
"A transaction log for Delta was found at //_delta_log, but you are trying to read from /Volumes/<base-path>/<sub-dir-to-parquet>/ using format(<your-format>). You must use 'format("delta")' when reading and writing to a delta table."

 

Cause

A Delta table exists at the root level, and the query is trying to read from a subdirectory of that location using a non-Delta format option.

Creating a Delta table generates a transaction log in a _delta_log folder at the table's location. When a query runs, it checks whether the target location belongs to a Delta table. If it does, the query must use the format("delta") option. A Delta table can end up at the root level when a user modifies a path or unintentionally creates a Delta table there. Any query that then reads a subdirectory with a different format fails with the error message shown in the problem statement.
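To find the transaction log that triggers the error, it can help to list every location where a _delta_log folder would sit above the path you are reading. The following is a minimal sketch, not the runtime's own logic: the helper name candidate_delta_log_paths and the example path are hypothetical, and it assumes the check covers each parent directory of the read path up to the root, as the error message suggests.

```python
import posixpath


def candidate_delta_log_paths(read_path):
    """List the _delta_log locations from the read path up to the root.

    Hypothetical helper: it only builds path strings and does not
    touch storage.
    """
    # Drop trailing slashes and redundant separators.
    path = posixpath.normpath(read_path)
    candidates = []
    while True:
        candidates.append(posixpath.join(path, "_delta_log"))
        parent = posixpath.dirname(path)
        # Stop at the root ("/") or once a scheme prefix such as "dbfs:" is used up.
        if parent == path or parent == "":
            break
        path = parent
    return candidates


# Example with a made-up Volumes path.
for candidate in candidate_delta_log_paths("/Volumes/my_base/my_subdir/"):
    print(candidate)
```

In a Databricks notebook, each candidate could then be probed with dbutils.fs.ls inside a try/except block. Any candidate that lists successfully is a transaction log that makes the locations beneath it part of a Delta table.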

 

Solution

Remove the Delta table at the root level by deleting the _delta_log folder. 

 

# Recursively delete the transaction log folder at the root level (True enables recursion).
dbutils.fs.rm("dbfs:/_delta_log/", True)

 

Important

Before making any changes, ensure you have a backup of your data to avoid any data loss.

 

 

If you are unable to delete the _delta_log folder, you can instead move the transaction log to a different folder.
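Moving the transaction log can be sketched as follows. This is an illustration under stated assumptions rather than a prescribed procedure: the function name move_delta_log and the target folder are hypothetical, and the fs argument is assumed to be dbutils.fs, whose mv(from, to, recurse) call performs the move.

```python
def move_delta_log(fs, log_path, target_path):
    """Move a stray transaction log folder instead of deleting it.

    `fs` is assumed to be `dbutils.fs` in a Databricks notebook.
    """
    # _delta_log is a directory, so the move must be recursive (third argument True).
    return fs.mv(log_path, target_path, True)


# Usage in a notebook (the target folder is a placeholder you choose):
# move_delta_log(dbutils.fs, "dbfs:/_delta_log/", "dbfs:/<backup-folder>/_delta_log/")
```

Passing the file system object in as a parameter keeps the function free of notebook globals, so it can be exercised with a stand-in object before it runs against real storage.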

 

To avoid similar issues in the future, follow these best practices:

  • Verify the path and format options used in your queries to avoid conflicts with existing Delta tables.
  • Regularly review your data structure and clean up files and folders that are no longer needed.