Job execution - Databricks

Increase the number of tasks per stage

Learn how to increase the number of tasks per stage when using the spark-xml package with Databricks....

Last updated: May 11th, 2022 by Adam Pavlacka

Members of a Gmail group email not receiving notifications

Allow external entities to email the group inbox. ...

Last updated: September 18th, 2024 by walter.camacho

Apache Spark jobs failing due to stage failure when using spot instances in a cluster

Use on-demand nodes instead of spot instances. ...

Last updated: August 11th, 2025 by Vidhi Khaitan

Broadcast join hash not being used despite hints

Refresh table statistics or use supported joins that allow for broadcast join....

Last updated: January 25th, 2025 by swetha.nandajan

Recurring error “Unable to get field from serde” when trying to perform operations on table

Verify your metadata using the Glue API, then recreate or update the table with the correct metadata. ...

Last updated: January 25th, 2025 by swetha.nandajan

Error compressed buffer size exceeds 2 GB when saving data

Set the Apache Spark configs to increase the frequency of row group size checks....

Last updated: January 29th, 2025 by manikandan.ganesan

Error when trying to use RDD code in shared clusters

Use a single-user cluster, which supports RDD functionality....

Last updated: January 31st, 2025 by mounika.tarigopula

File corruption error on Apache Spark Streaming jobs during file processing in DBFS

Replace dbutils.fs operations with Hadoop filesystem methods. ...

Last updated: January 31st, 2025 by swetha.nandajan

Jobs failing at data shuffle stage with error org.apache.spark.shuffle.FetchFailedException

Analyze the shuffle data distribution across executors and join query strategies....

Last updated: January 31st, 2025 by swetha.nandajan

Apache Spark job output only giving the first JSON object instead of all records

Add appropriate line breaks between each JSON object or use Photon....

Last updated: January 31st, 2025 by swetha.nandajan

DAB job parameters not passing in correctly on the task level

Correct the syntax....

Last updated: February 26th, 2025 by daniel.ruiz

Recurring Apache Spark jobs with same data set size and cluster configuration vary in duration

Build your cluster with sufficient SSD memory, monitor your cluster’s disk usage, and optimize data storage....

Last updated: March 12th, 2025 by John Benninghoff

Resolve invalid cast input error on serverless compute

Set spark.sql.ansi.enabled to false to resolve casting errors on serverless compute....

Last updated: April 16th, 2025 by anudeep.konaboina

Seeing slow-running jobs while adaptive parallelism enabled

Disable adaptive parallelism....

Last updated: July 10th, 2025 by Raghavan Vaidhyaraman

Apache Spark jobs fail or stall when importing .whl from /Workspace on multi-node clusters

Allow all traffic on ports 1017, 1015, and 1021....

Last updated: August 22nd, 2025 by anudeep.konaboina

Delta table deletion operations intermittently return empty results when data exists

Update your cluster configuration to exclude the optimizer rule....

Last updated: October 7th, 2025 by aishwarya.ghosh

Task count performance while using serverless compute not showing consistently

The occasional absence of serverless task count performance information is expected....

Last updated: October 14th, 2025 by Raghavan Vaidhyaraman

FileNotFound Exception when using Apache Spark addFile on multi-node compute

Read directly from Databricks File System (DBFS) for Spark parallelization....

Last updated: October 14th, 2025 by anudeep.konaboina

Permission denied error on jobs integrated with Git repositories triggered by service principals

Set up the correct Git credentials for the service principal. ...

Last updated: October 15th, 2025 by vidya.sagamreddy
