Error when trying to run an Auto Loader job that uses cloudFiles

Set spark.databricks.cloudFiles.checkSourceChanged to false.

Written by sidhant.sahu

Last published at: January 31st, 2025

Problem

While trying to run an Auto Loader job that uses cloudFiles in your Databricks environment, you encounter an error indicating a mismatch between S3 buckets.


```
Error in stream_microbatch_processing: [STREAM_FAILED] Query [id = XXX, runId = XXX] terminated with exception: The bucket in the file event `{"create":{"bucket":"<your-bucket-name>",...` is different from expected by the source: `<your-other-bucket-name>`.
```

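For context, the expected bucket comes from the path the stream was originally started against. The following is a minimal sketch of a typical Auto Loader stream, not your actual job; the paths, file format, and table name are hypothetical placeholders.

```
# Minimal sketch of an Auto Loader stream, for illustration only.
# All paths, the file format, and the table name are hypothetical placeholders.
# `spark` is the SparkSession provided in a Databricks notebook.
source_path = "s3://<your-bucket-name>/landing/"                   # directory Auto Loader watches
checkpoint_path = "s3://<your-bucket-name>/_checkpoints/landing/"  # state that records the expected source

df = (
    spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")                   # adjust to your file format
    .option("cloudFiles.schemaLocation", checkpoint_path)
    .load(source_path)                                     # the bucket in this path is what the source expects
)

# If source_path is later pointed at a different bucket while the same
# checkpoint_path is reused, the stream fails with the bucket-mismatch error above.
(
    df.writeStream
    .option("checkpointLocation", checkpoint_path)
    .toTable("bronze_events")
)
```
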
Cause

The source directory for the streaming job has changed, so incoming file events reference a different S3 bucket than the one the stream's checkpoint expects.


Solution

  1. Navigate to your cluster and open the settings.
  2. Click Advanced options.
  3. Under the Spark tab, in the Spark config box, enter the following configuration.

```
spark.databricks.cloudFiles.checkSourceChanged false
```

  4. Restart the job with a new checkpoint directory when you set this configuration for the first time.
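
If you prefer to keep the change scoped to the notebook that starts the stream, the sketch below sets the same property on the Spark session and restarts the stream against a new checkpoint directory. Setting this property at the session level, as well as every path and name shown, is an assumption for illustration; the cluster-level Spark config above is the approach described in this article.

```
# Sketch only: session-level alternative to the cluster Spark config, plus a
# restart against a new checkpoint directory (step 4). Paths and names are
# hypothetical, and applying this conf at session level is an assumption.
spark.conf.set("spark.databricks.cloudFiles.checkSourceChanged", "false")

new_checkpoint_path = "s3://<your-other-bucket-name>/_checkpoints/landing_v2/"  # fresh checkpoint directory

df = (
    spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")                       # adjust to your file format
    .option("cloudFiles.schemaLocation", new_checkpoint_path)
    .load("s3://<your-other-bucket-name>/landing/")             # the stream's new source directory
)

(
    df.writeStream
    .option("checkpointLocation", new_checkpoint_path)          # new directory, per step 4
    .toTable("bronze_events")
)
```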