Problem
When working with tables in Databricks, you want to filter columns based on specific strings in their comments. For example, you might want to classify columns as “internal” and then pull all columns excluding those classified as “internal” into a BI tool using an SQL query.
Cause
There is not a direct SQL method to filter columns based on their comments. However, Databricks provides the DESCRIBE TABLE
command, which you can leverage.
Solution
Use the following SQL code to retrieve column metadata, including comments, and then filter columns programmatically.
%sql
WITH column_info AS (
DESCRIBE TABLE <your-table-name>
),
filtered_columns AS (
SELECT col_name
FROM column_info
WHERE comment NOT LIKE '%internal%'
)
SELECT
(SELECT STRING_AGG(col_name, ', ') FROM filtered_columns)
FROM <your-table-name>;
To verify the solution works, after executing the constructed SQL query check that the expected columns are returned without the ones containing the specific string in their comments.