Databricks Knowledge Base

Data management (GCP)

These articles can help you with Datasets, DataFrames, and other ways to structure data using Apache Spark and Databricks.

16 Articles in this category

Append to a DataFrame

To append to a DataFrame, use the union method.

%scala
val firstDF = spark.range(3).toDF("myCol")
val newRow = Seq(20)
val appended = firstDF.union(newRow.toDF())
display(appended)

%python
firstDF = spark.range(3).toDF("myCol")
newRow = spark.createDataFrame([[20]])
appended = firstDF.union(newRow)
display(appended)

...

Last updated: March 4th, 2022 by Adam Pavlacka

How to improve performance with bucketing

Bucketing is an optimization technique in Apache Spark SQL. Data is allocated among a specified number of buckets, according to values derived from one or more bucketing columns. Bucketing improves performance by shuffling and sorting data prior to downstream operations such as table joins. The tradeoff is the initial overhead due to shuffling and s...
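As a rough illustration of the idea (not taken from the truncated article; the table, column names, and bucket count below are assumptions), a table can be written with bucketBy and sortBy so that later joins on the bucketing column can reuse the pre-shuffled layout:

%scala
// Hedged sketch: bucket and sort by "id" at write time so that downstream
// joins on "id" avoid a full shuffle. Names and bucket count are illustrative.
val df = spark.range(1000000).toDF("id")
df.write
  .format("parquet")
  .bucketBy(50, "id")
  .sortBy("id")
  .mode("overwrite")
  .saveAsTable("bucketing_demo")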

Last updated: March 4th, 2022 by Adam Pavlacka

How to handle blob data contained in an XML file

If you log events in XML format, then every XML event is recorded as a base64 string. To run analytics on this data with Apache Spark, you need to use the spark-xml library and the BASE64DECODER API to transform the data for analysis. Problem You need to analyze base64-encoded strings from an XML-formatted log file using Spark. For example...
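As a hedged sketch of the general approach (the row tag, column names, and path are assumptions, and Spark's built-in unbase64 function is used here in place of the decoder API named above; the spark-xml library must be attached to the cluster):

%scala
// Hedged sketch: parse XML events with spark-xml, then decode the
// base64-encoded payload column into readable text. rowTag, column names,
// and the file path are illustrative assumptions.
import org.apache.spark.sql.functions.{col, unbase64}

val events = spark.read
  .format("com.databricks.spark.xml")
  .option("rowTag", "event")
  .load("dbfs:/mnt/logs/events.xml")

val decoded = events.withColumn("payload_text", unbase64(col("payload")).cast("string"))
display(decoded)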

Last updated: March 4th, 2022 by Adam Pavlacka

Simplify chained transformations

Sometimes you may need to perform multiple transformations on your DataFrame: %scala import org.apache.spark.sql.functions._ import org.apache.spark.sql.DataFrame val testDf = (1 to 10).toDF("col") def func0(x: Int => Int, y: Int)(in: DataFrame): DataFrame = {   in.filter('col > x(y)) } def func1(x: Int)(in: DataFrame): DataFrame = {   in.sele...
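The preview is cut off before the simplification itself; a plausible sketch (an assumption, not the article's exact notebook code) is to chain the helper functions with Dataset.transform instead of nesting the calls:

%scala
// Hedged sketch: chain helper functions with transform so the pipeline reads
// top to bottom. The helpers below are simplified stand-ins, not the
// article's originals.
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import spark.implicits._

val testDf = (1 to 10).toDF("col")

def filterGreaterThan(threshold: Int)(in: DataFrame): DataFrame =
  in.filter(col("col") > threshold)

def doubleColumn(in: DataFrame): DataFrame =
  in.withColumn("doubled", col("col") * 2)

val result = testDf
  .transform(filterGreaterThan(3))
  .transform(doubleColumn)
display(result)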

Last updated: May 25th, 2022 by Adam Pavlacka

Hive UDFs

This article shows how to create a Hive UDF, register it in Spark, and use it in a Spark SQL query. Here is a Hive UDF that takes a long as an argument and returns its hexadecimal representation. %scala import org.apache.hadoop.hive.ql.exec.UDF import org.apache.hadoop.io.LongWritable // This UDF takes a long integer and converts it to a hexadecimal...
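The preview stops before registration. As a hedged sketch (assuming the Hive UDF class, here called com.example.ToHex, has been compiled into a JAR and attached to the cluster; both names are hypothetical), it can then be registered and called from Spark SQL:

%scala
// Hedged sketch: register a compiled Hive UDF class with Spark SQL and call
// it in a query. The class name and function name are hypothetical, and the
// class is assumed to be on the cluster classpath.
spark.sql("CREATE TEMPORARY FUNCTION to_hex AS 'com.example.ToHex'")
display(spark.sql("SELECT id, to_hex(id) AS hex_id FROM range(5)"))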

Last updated: May 31st, 2022 by Adam Pavlacka

Prevent duplicated columns when joining two DataFrames

If you perform a join in Spark and don’t specify your join correctly, you’ll end up with duplicate column names. This makes it harder to select those columns. This article and notebook demonstrate how to perform a join so that you don’t have duplicated columns. Join on columns If you join on columns, you get duplicated columns. Scala %scala val llist...
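As a minimal illustration (with made-up data), joining on a sequence of column names keeps a single copy of the join key instead of two ambiguous columns:

%scala
// Minimal sketch with illustrative data: passing the join key as Seq("id")
// avoids the duplicate "id" column produced by an expression-based join.
import spark.implicits._

val left  = Seq((1, "alice"), (2, "bob")).toDF("id", "name")
val right = Seq((1, 100), (2, 200)).toDF("id", "score")

val joined = left.join(right, Seq("id"))   // one "id" column in the result
display(joined)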

Last updated: May 31st, 2022 by Adam Pavlacka

Revoke all user privileges

When user permissions are explicitly granted for individual tables and views, the selected user can access those tables and views even if they don’t have permission to access the underlying database. If you want to revoke a user’s access, you can do so with the REVOKE command. However, the REVOKE command is explicit, and is strictly scoped to the ob...
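As a hedged sketch of the command itself (the principal, database, and table names are placeholders, and this is not necessarily the article's full procedure):

%scala
// Hedged sketch: revoke privileges on a specific table and on its database.
// The user, database, and table names are placeholders.
spark.sql("REVOKE ALL PRIVILEGES ON TABLE my_db.my_table FROM `someone@example.com`")
spark.sql("REVOKE ALL PRIVILEGES ON DATABASE my_db FROM `someone@example.com`")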

Last updated: May 31st, 2022 by pavan.kumarchalamcharla

How to list and delete files faster in Databricks

Scenario Suppose you need to delete a table that is partitioned by year, month, date, region, and service. However, the table is huge, and there will be around 1000 part files per partition. You can list all the files in each partition and then delete them using an Apache Spark job. For example, suppose you have a table that is partitioned by a, b, ...
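For orientation only, here is a simplified, driver-only sketch of listing and removing partition directories with dbutils; the table path and partition value are placeholders, and the faster approach the article describes parallelizes this work with a Spark job:

%scala
// Simplified, driver-only sketch (not the parallel Spark-job approach the
// article describes). The table path and partition value are placeholders.
val tablePath = "dbfs:/mnt/data/events"
dbutils.fs.ls(tablePath).foreach(f => println(f.path))   // top-level partition dirs
dbutils.fs.rm(s"$tablePath/year=2021", true)             // recursively delete one partition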

Last updated: May 31st, 2022 by Adam Pavlacka

How to handle corrupted Parquet files with different schema

Problem Let’s say you have a large list of essentially independent Parquet files, with a variety of different schemas. You want to read only those files that match a specific schema and skip the files that don’t match. One solution could be to read the files in sequence, identify the schema, and union the DataFrames together. However, this approach ...
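A hedged sketch of one way to do this (the schema fields and path are assumptions): read with an explicit schema and let Spark skip files it cannot read as that schema by enabling ignoreCorruptFiles:

%scala
// Hedged sketch: supply the schema you want and skip files that cannot be
// read as that schema. The schema fields and path are illustrative.
import org.apache.spark.sql.types._

val expectedSchema = new StructType()
  .add("id", LongType)
  .add("name", StringType)

spark.conf.set("spark.sql.files.ignoreCorruptFiles", "true")

val df = spark.read.schema(expectedSchema).parquet("dbfs:/mnt/data/mixed-parquet/")
display(df)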

Last updated: May 31st, 2022 by Adam Pavlacka

No USAGE permission on database

Problem You are using a cluster running Databricks Runtime 7.3 LTS and above. You have enabled table access control for your workspace (AWS | Azure | GCP) as the admin user, and granted the SELECT privilege to a standard user-group that needs to access the tables. A user tries to access an object in the database and gets a SecurityException error me...
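The usual fix (hedged; the group and object names below are placeholders) is to grant USAGE on the database in addition to SELECT on the tables:

%scala
// Hedged sketch: with table access control on Databricks Runtime 7.3 LTS and
// above, SELECT alone is not enough; the group also needs USAGE on the
// database. Names are placeholders.
spark.sql("GRANT USAGE ON DATABASE my_db TO `data-users`")
spark.sql("GRANT SELECT ON TABLE my_db.my_table TO `data-users`")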

Last updated: May 31st, 2022 by rakesh.parija

Nulls and empty strings in a partitioned column save as nulls

Problem If you save data containing both empty strings and null values in a column on which the table is partitioned, both values become null after writing and reading the table. To illustrate this, create a simple DataFrame: %scala import org.apache.spark.sql.types._ import org.apache.spark.sql.catalyst.encoders.RowEncoder val data = Seq(Row(1, "")...
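A compact reproduction of the behavior (the data and path are illustrative, not the article's notebook):

%scala
// Minimal repro sketch: after a partitioned write, both the empty string and
// the null in the partition column read back as null. Path is illustrative.
import spark.implicits._

val df = Seq((1, ""), (2, null), (3, "us")).toDF("id", "region")
df.write.mode("overwrite").partitionBy("region").parquet("dbfs:/tmp/partition_null_demo")

val readBack = spark.read.parquet("dbfs:/tmp/partition_null_demo")
display(readBack)   // rows 1 and 2 both show region = null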

Last updated: May 31st, 2022 by Adam Pavlacka

Behavior of the randomSplit method

When using randomSplit on a DataFrame, you may observe inconsistent behavior. Here is an example:

%python
df = spark.read.format('inconsistent_data_source').load()
a, b = df.randomSplit([0.5, 0.5])
a.join(broadcast(b), on='id', how='inner').count()

Typically this query returns 0. However, depending on the underlying data source or input...
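One common mitigation (hedged; not necessarily the article's full recommendation) is to materialize the DataFrame before splitting and pass an explicit seed, sketched here in Scala:

%scala
// Hedged sketch: cache and materialize the DataFrame so both splits see the
// same underlying rows, and fix the seed for reproducibility.
val df = spark.range(1000).toDF("id").cache()
df.count()                                        // force materialization

val Array(a, b) = df.randomSplit(Array(0.5, 0.5), seed = 42)
println(a.intersect(b).count())                   // expect 0 overlapping rows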

Last updated: May 31st, 2022 by Adam Pavlacka

Generate schema from case class

Spark provides an easy way to generate a schema from a Scala case class. For case class A, use the method ScalaReflection.schemaFor[A].dataType.asInstanceOf[StructType]. For example: %scala import org.apache.spark.sql.types.StructType import org.apache.spark.sql.catalyst.ScalaReflection case class A(key: String, time: java.sql.Timestamp, date: java....
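A completed version of the snippet (the fields of case class A here are illustrative, not the article's full example):

%scala
// Hedged sketch: derive a StructType from a case class via ScalaReflection.
// The fields of case class A are illustrative.
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.catalyst.ScalaReflection

case class A(key: String, time: java.sql.Timestamp, date: java.sql.Date)

val schema = ScalaReflection.schemaFor[A].dataType.asInstanceOf[StructType]
println(schema.treeString)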

Last updated: May 31st, 2022 by Adam Pavlacka

How to specify skew hints in Dataset and DataFrame-based join commands

When you perform a join command with DataFrame or Dataset objects, if you find that the query is stuck on finishing a small number of tasks due to data skew, you can specify the skew hint with the hint("skew") method: df.hint("skew"). The skew join optimization (AWS | Azure | GCP) is performed on the DataFrame for which you specify the skew hint. In...
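A minimal sketch of where the hint goes (the DataFrames are illustrative, and which side is skewed is an assumption):

%scala
// Minimal sketch: apply the skew hint to the DataFrame whose join key is
// skewed, then join as usual. The data here is illustrative.
val facts = spark.range(1000000).toDF("id")   // assume this side is skewed
val dims  = spark.range(1000).toDF("id")

val joined = facts.hint("skew").join(dims, Seq("id"))
display(joined)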

Last updated: May 31st, 2022 by Adam Pavlacka

How to update nested columns

Spark doesn’t support adding new columns or dropping existing columns in nested structures. In particular, the withColumn and drop methods of the Dataset class don’t allow you to specify a column name different from any top level columns. For example, suppose you have a dataset with the following schema: %scala val schema = (new StructType)       .a...
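As a hedged illustration of the usual workaround (rebuilding the struct; the column names are made up and this may differ from the article's notebook):

%scala
// Hedged sketch: withColumn cannot target person.city directly, so rebuild
// the whole "person" struct instead. Column names are illustrative.
import org.apache.spark.sql.functions.{col, struct, upper}
import spark.implicits._

val df = Seq(("1", "alice", "nyc")).toDF("id", "name", "city")
  .select(col("id"), struct(col("name"), col("city")).as("person"))

val updated = df.withColumn(
  "person",
  struct(col("person.name").as("name"), upper(col("person.city")).as("city"))
)
display(updated)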

Last updated: May 31st, 2022 by Adam Pavlacka

Incompatible schema in some files

Problem The Spark job fails with an exception like the following while reading Parquet files: Error in SQL statement: SparkException: Job aborted due to stage failure: Task 20 in stage 11227.0 failed 4 times, most recent failure: Lost task 20.3 in stage 11227.0 (TID 868031, 10.111.245.219, executor 31): java.lang.UnsupportedOperationException: org.a...
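As a hedged troubleshooting sketch (not necessarily the article's resolution; the path is a placeholder), inspecting each file's footer schema individually can reveal which files disagree:

%scala
// Hedged troubleshooting sketch: print the schema of each Parquet file under
// a directory to find the ones whose column types differ. Path is a placeholder.
val files = dbutils.fs.ls("dbfs:/mnt/data/events_parquet/")
  .map(_.path)
  .filter(_.endsWith(".parquet"))

files.foreach { f =>
  println(s"$f -> ${spark.read.parquet(f).schema.simpleString}")
}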

Last updated: May 31st, 2022 by Adam Pavlacka

