Problem
You want to partition your Delta table by a date column. This creates a subfolder for each partition value in the root path of the Delta table. For example, date=2023-01-01, date=2023-01-02, and so on.
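As a minimal sketch, a table partitioned this way can be written with the DataFrame API. The DataFrame contents, table path, and column name below are placeholders for illustration only.

# `spark` is provided automatically in Databricks notebooks.
# Illustrative DataFrame with a `date` column; in practice this is your own data.
df = spark.createDataFrame(
    [("2023-01-01", 1), ("2023-01-02", 2)],
    ["date", "value"],
)

# partitionBy("date") creates date=<value> subfolders under the table's root path.
(
    df.write.format("delta")
    .partitionBy("date")
    .mode("overwrite")
    .save("/tmp/example_delta_table")  # hypothetical path
)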
You enable Delta Lake column mapping, but when you try to list the subfolders, the names are not what you expect (date=2023-01-01) because those date partition folders are no longer available. Instead, you see subfolders with random names that do not look like valid date partitions. For example, you see partitions like date=xx.
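For reference, column mapping is typically enabled through table properties, and the table's root path can then be listed from a notebook. The table name and path below are placeholders; the protocol versions shown are the standard minimums required for column mapping.

# Enable column mapping in 'name' mode; this also upgrades the table protocol.
spark.sql("""
    ALTER TABLE example_delta_table SET TBLPROPERTIES (
        'delta.columnMapping.mode' = 'name',
        'delta.minReaderVersion' = '2',
        'delta.minWriterVersion' = '5'
    )
""")

# List the table's root path to inspect the current folder layout.
display(dbutils.fs.ls("/tmp/example_delta_table"))  # hypothetical path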
Cause
When Delta Lake column mapping is enabled on a table, Delta Lake writes data files with random prefixes, which removes the ability to explore data using Hive-style partitioning.
Solution
This is expected behavior when column mapping is enabled. You can still query your data. In this example, if you want to retrieve data for a single date, apply a filter in your where clause.
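A minimal sketch of such a filtered read, assuming the same illustrative table path and a partition column named date (both are placeholders):

# `spark` and `display` are provided automatically in Databricks notebooks.
# Filter on the partition column instead of browsing the date=... subfolders directly.
df = (
    spark.read.format("delta")
    .load("/tmp/example_delta_table")  # hypothetical path
    .where("date = '2023-01-01'")
)
display(df)

Delta Lake still prunes files based on the partition column, so filtering in the query gives you the same targeted reads that browsing the date=... folders used to provide.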
For more information, please review the Do Delta Lake and Parquet share partitioning strategies? section in the When to partition tables on Databricks documentation (AWS | Azure | GCP).