PySpark merge operation with 'withSchemaEvolution' fails on serverless compute

Use job or all-purpose compute on Databricks Runtime 15.4 LTS or above, use SQL to run the MERGE, or use serverless environment version 2 or above.

Written by potnuru.siva

Last published at: July 8th, 2025

Problem

When you run an Apache Spark PySpark job workflow on serverless compute with serverless environment version 1 and try to perform a Delta Lake MERGE operation using withSchemaEvolution, you receive the following error.

Error message: AttributeError: 'DeltaMergeBuilder' object has no attribute 'withSchemaEvolution'


This issue occurs despite using a Databricks Runtime version that supports schema evolution (15.4 LTS and above).
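For illustration, a PySpark merge written as follows fails on serverless environment version 1. This is a minimal sketch; the table names are placeholders.

   from delta.tables import DeltaTable

   # Placeholder table names. On serverless environment version 1 the
   # .withSchemaEvolution() call raises:
   # AttributeError: 'DeltaMergeBuilder' object has no attribute 'withSchemaEvolution'
   (
       DeltaTable.forName(spark, "<target-table-name>").alias("t")
       .merge(spark.table("<source-table-name>").alias("s"), "s.id = t.id")
       .withSchemaEvolution()
       .whenMatchedUpdateAll()
       .whenNotMatchedInsertAll()
       .execute()
   )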


Cause

Certain Apache Spark configurations, including the one required for schema evolution (spark.databricks.delta.schema.autoMerge.enabled), are not supported in serverless environment version 1. As a result, the withSchemaEvolution method, which relies on these configurations, is also not supported.
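For context, on supported compute, automatic schema evolution for merge can also be enabled session-wide with the following setting. Serverless environment version 1 does not accept it.

   # Enables automatic schema evolution for MERGE on job or all-purpose
   # compute. Serverless environment version 1 does not support setting
   # this Spark property.
   spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")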


For more information, refer to the Serverless compute limitations (AWS | Azure | GCP) documentation.


To review the Spark configurations supported in serverless, refer to the “Configure Spark properties for serverless notebooks and jobs” section of the Set Spark configuration properties on Databricks (AWS | Azure | GCP) documentation.


For more information about schema evolution, review the “Schema evolution syntax for merge” section of the Update Delta Lake table schema (AWS | Azure | GCP) documentation.


Solution

Either use job compute or all-purpose compute instead of serverless, use SQL to perform the MERGE operation with schema evolution, or use serverless environment version 2 or above.


Use job compute or all-purpose compute

Instead of using serverless compute, switch to a job cluster or an all-purpose cluster with Databricks Runtime 15.4 LTS or above, where the withSchemaEvolution method is supported.


This involves changing the compute configuration for your Databricks job workflow. Refer to the Compute configuration reference (AWS | Azure | GCP) documentation for more information.
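On such a cluster, the same PySpark merge runs as intended. A minimal sketch, again with placeholder table names:

   from delta.tables import DeltaTable

   target = DeltaTable.forName(spark, "<target-table-name>")
   source = spark.table("<source-table-name>")

   # On Databricks Runtime 15.4 LTS or above (job or all-purpose compute),
   # withSchemaEvolution() is available and new source columns are added to
   # the target table's schema during the merge.
   (
       target.alias("t")
       .merge(source.alias("s"), "s.id = t.id")
       .withSchemaEvolution()
       .whenMatchedUpdateAll()
       .whenNotMatchedInsertAll()
       .execute()
   )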


Use SQL to perform the MERGE operation with schema evolution 

Alternatively, use SQL to perform the MERGE operation with schema evolution. You can execute the operation directly in a SQL cell or in PySpark using spark.sql(<query>).


Direct SQL example

   %sql
   MERGE WITH SCHEMA EVOLUTION INTO <target-table-name> t
   USING <source-table-name> s
   ON s.id = t.id
   WHEN MATCHED THEN
     UPDATE SET *
   WHEN NOT MATCHED THEN
     INSERT *


SQL in PySpark example 

   spark.sql("""
       MERGE WITH SCHEMA EVOLUTION INTO <target-table-name> t
       USING source s
       ON s.id = t.id
       WHEN MATCHED THEN
       UPDATE SET *
       WHEN NOT MATCHED THEN
       INSERT *
   """)


Use serverless environment version 2 or above

For more information, refer to the Serverless environment versions (AWS | Azure | GCP) documentation.
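In a notebook, you can select the environment version in the notebook's Environment side panel. For a serverless job task, the version is set in the job's environment specification. The following is an illustrative sketch only; the field names follow the Jobs API environments spec, so verify them against the documentation above.

   # Illustrative Jobs API fragment (not a complete job definition): the
   # "client" field of a serverless environment spec selects the serverless
   # environment version.
   job_environments = [
       {
           "environment_key": "default",
           "spec": {
               "client": "2",       # serverless environment version 2
               "dependencies": [],  # optional pip dependencies
           },
       }
   ]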