Problem
Applying column masks directly within nested fields (for example, inside STRUCT
) in Delta Live Tables (DLT) pipelines, streaming tables, or materialized views fails with a parsing error.
org.apache.spark.sql.catalyst.parser.ParseException: [PARSE_SYNTAX_ERROR]
Syntax error at or near 'MASK'. SQLSTATE: 42601
Additional context
The following code creates a table and applies a mask to a top-level column, which works.
%sql
-- Define a simple mask that hides a string from non-admins
CREATE OR REPLACE FUNCTION demo.street_mask(col STRING)
RETURNS STRING
RETURN CASE
WHEN is_member('admin') THEN col
ELSE 'MASKED'
END;
%sql
-- Create a table and attach the mask to the top-level column
CREATE OR REPLACE TABLE customer_addresses (
customer_id STRING,
street STRING MASK demo.street_mask,
city STRING,
updated_at TIMESTAMP
);
The following code then applies the same mask to a nested column, street column, resulting in the PARSE_SYNTAX_ERROR
.
%sql
-- Attempt to use the same mask on a field inside a STRUCT
CREATE OR REFRESH STREAMING LIVE TABLE bad_mask_example (
customer_id STRING,
address STRUCT<
location STRUCT<
street STRING MASK demo.street_mask, -- ← error here
city STRING
>,
updated_at TIMESTAMP
>
);
Cause
Databricks does not support attaching a column mask to an individual field nested inside a STRUCT
.
As the parser treats the entire STRUCT
as one column, trying to mask a single inner field violates the SQL grammar and triggers the PARSE_SYNTAX_ERROR
.
Solution
Apply a mask to the entire STRUCT
column, and have the masking function inspect the field of interest.
1. Define a column mask function to accommodate the STRUCT
type. The mask function inspects the entire STRUCT
column, and within that function, it inspects the fields inside the nested STRUCT
. The function applies a masking logic based on the field.
%sql
-- Returns the original struct for admins; for unauthorized users, masks street
CREATE OR REPLACE FUNCTION demo.address_mask(
addr STRUCT<location: STRUCT<street: STRING, city: STRING>>
)
RETURNS STRUCT<location: STRUCT<street: STRING, city: STRING>>
RETURN
CASE
WHEN is_member('admin') THEN addr
ELSE STRUCT(
STRUCT(
'MASKED' AS street,
addr.location.city AS city
) AS location
)
END;
2. Create or refresh the table with the mask applied to the entire STRUCT
.
%sql
CREATE OR REFRESH STREAMING LIVE TABLE customer_events (
customer_id STRING,
event_time TIMESTAMP,
-- Whole-struct masking
address STRUCT<
location STRUCT<
street STRING,
city STRING
>
> MASK demo.address_mask
USING COLUMNS ('prod'),
_START_AT TIMESTAMP,
_END_AT TIMESTAMP
);