Databricks Help Center

Data sources

These articles can help you manage your data source integrations.

41 Articles in this category

Create tables on JSON datasets

Create tables on JSON datasets; requires SerDe JAR....

Last updated: May 31st, 2022 by ram.sankarasubramanian

Delete table when underlying S3 bucket is deleted

Do not delete the contents of an S3 bucket before dropping a table that stores data in the bucket....

Last updated: May 31st, 2022 by Jose Gonzalez

Failure when mounting or accessing Azure Blob storage

Learn how to resolve a failure when mounting or accessing Azure Blob storage from Databricks....

Last updated: May 31st, 2022 by Adam Pavlacka

Unable to read files and list directories in a WASB filesystem

Learn how to interpret errors that occur when accessing WASB append blob types in Databricks....

Last updated: June 1st, 2022 by Adam Pavlacka

Optimize read performance from JDBC data sources

Learn how to optimize performance when reading from JDBC data sources in Databricks....

Last updated: June 1st, 2022 by Adam Pavlacka

Troubleshooting JDBC/ODBC access to Azure Data Lake Storage Gen2

Learn how to troubleshoot JDBC and ODBC access to Azure Data Lake Storage Gen2 from Databricks....

Last updated: June 1st, 2022 by Adam Pavlacka

CosmosDB-Spark connector library conflict

Learn how to resolve conflicts that arise when using the CosmosDB-Spark connector library with Databricks....

Last updated: June 1st, 2022 by Adam Pavlacka

Failure to detect encoding in JSON

Learn how to resolve a failure to detect encoding of input JSON files when using BOM with Databricks....

Last updated: June 1st, 2022 by Adam Pavlacka

Inconsistent timestamp results with JDBC applications

Timestamp records are inconsistent with JDBC applications when daylight saving time adjustments are made....

Last updated: June 1st, 2022 by manjunath.swamy

Kafka client terminated with OffsetOutOfRangeException

Kafka client is terminated with `OffsetOutOfRangeException` when trying to fetch messages...

Last updated: June 1st, 2022 by vikas.yadav

Apache Spark JDBC datasource query option doesn’t work for Oracle database

Learn how to resolve an error that occurs when using the Apache Spark JDBC datasource to connect to Oracle Database from Databricks....

Last updated: June 1st, 2022 by Adam Pavlacka

Accessing Redshift fails with NullPointerException

Learn how to resolve a `NullPointerException` error that occurs when you read a Redshift table....

Last updated: June 1st, 2022 by Adam Pavlacka

Redshift JDBC driver conflict issue

Learn how to resolve a Redshift JDBC SQLDriverWrapper driver conflict....

Last updated: June 1st, 2022 by Adam Pavlacka

ABFS client hangs if incorrect client ID or wrong path used

Trying to access an Azure Blob File System (ABFS) path results in a hung command when using Azure Data Lake Storage Gen2 (ADLS)....

Last updated: June 1st, 2022 by Adam Pavlacka

Reading a table fails due to AAD token timeout on ADLS Gen2

Accessing ADLS Gen2 storage fails if the AAD service principal token is expired or invalid....

Last updated: November 30th, 2022 by John.Lourdu

Recursive references in Avro schema are not allowed

Apache Avro data sources cannot have recursive references in the schema when used with Spark....

Last updated: February 19th, 2025 by saikrishna.pujari

Error when reading data from ADLS Gen1 with Sparklyr

Learn how to resolve errors that occur when reading data from Azure Data Lake Storage Gen1 with Sparklyr in Databricks....

Last updated: December 9th, 2022 by Adam Pavlacka

Long jobs fail when accessing ADLS

Long-running jobs that use Azure AD credential passthrough to access ADLS fail after 1 hour....

Last updated: December 9th, 2022 by huaming.liu

ADLS and WASB writes are being throttled

Learn how to resolve a "files and folders are being created at too high a rate" ADLS or WASB storage error....

Last updated: December 9th, 2022 by Adam Pavlacka

Unable to access Azure Data Lake Storage (ADLS) Gen1 when firewall is enabled

Learn how to troubleshoot access issues when connecting to Azure Data Lake Storage Gen1 from Databricks with a firewall enabled....

Last updated: December 9th, 2022 by Adam Pavlacka

SQL access control error when using Snowflake as a data source

Snowflake does not officially support schema as an option; you must use sfschema....

Last updated: January 20th, 2023 by John.Lourdu
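
As a rough illustration of the fix named in the summary above, the sketch below renames a generic `schema` key to the Snowflake-specific one in an options dictionary before it would be handed to a reader. The helper name `normalize_snowflake_options` and the sample values are hypothetical, and the key is spelled `sfSchema` as in the Snowflake connector documentation (the summary writes it in lowercase).

```python
def normalize_snowflake_options(options):
    """Return a copy of `options` with a 'schema' key renamed to 'sfSchema'.

    Hypothetical helper for illustration; not part of any Databricks or
    Snowflake API.
    """
    fixed = dict(options)
    if "schema" in fixed:
        value = fixed.pop("schema")
        # Keep an explicit sfSchema if the caller already supplied one.
        fixed.setdefault("sfSchema", value)
    return fixed


# Sample values are placeholders, not real connection details.
options = normalize_snowflake_options(
    {"sfDatabase": "EXAMPLE_DB", "schema": "PUBLIC"}
)
```

The resulting dictionary could then be passed to a Spark reader with `.options(**options)`; the remaining connection options are omitted here.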

Apache Spark reading .gzip files from S3 instead of decompressed data

Rename the files in S3 from .gzip to .gz....

Last updated: September 12th, 2024 by kuldeep.mishra
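
The rename described in the summary above can be sketched as a small key-mapping step. This is only an assumption-laden illustration: the helper `gz_key` and the sample object keys are hypothetical, and the actual copy or move inside S3 is left out.

```python
def gz_key(key):
    """Map an object key ending in '.gzip' to the same key ending in '.gz'.

    Keys with any other suffix are returned unchanged.
    """
    suffix = ".gzip"
    if key.endswith(suffix):
        return key[: -len(suffix)] + ".gz"
    return key


# Hypothetical object keys used only to show the mapping.
keys = ["logs/part-0001.json.gzip", "logs/part-0002.json.gz", "logs/readme.txt"]
renames = {k: gz_key(k) for k in keys if gz_key(k) != k}
```

Each entry in `renames` pairs an old key with its new name, which a storage client of your choice could then use to perform the rename.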

Column drift when reading multiple delimited files

Ensure that all files being processed together have the same schema....

Last updated: September 23rd, 2024 by lakshay.goel
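
One way to check the condition in the summary above before reading files together is to compare their header rows. The sketch below assumes each file has a header line and is small enough to inspect as text; the function names and sample file contents are hypothetical.

```python
import csv
import io


def header_of(text, delimiter=","):
    """Return the first row of delimited text as a list of column names."""
    return next(csv.reader(io.StringIO(text), delimiter=delimiter), [])


def find_drift(files, delimiter=","):
    """Return names of files whose header differs from the first file's header.

    `files` maps a file name to its text content.
    """
    items = list(files.items())
    if not items:
        return []
    reference = header_of(items[0][1], delimiter)
    return [
        name
        for name, text in items[1:]
        if header_of(text, delimiter) != reference
    ]


# Hypothetical file contents; b.csv has an extra column.
sample = {
    "a.csv": "id,name\n1,x\n",
    "b.csv": "id,name,city\n2,y,z\n",
    "c.csv": "id,name\n3,w\n",
}
drifted = find_drift(sample)
```

Files listed in `drifted` would be candidates to fix or to read separately from the rest.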

NullPointerException when reading shapefiles from cloud storage on a Mosaic and GDAL enabled cluster

Zip the entire shapefile and upload to a Unity Catalog volume or DBFS storage. ...

Last updated: September 12th, 2024 by jessica.santos

MULTIPLE_XML_DATA_SOURCE error while working with XML data

Remove the external XML library from the cluster. ...

Last updated: August 30th, 2024 by kaushal.vachhani

Databricks jobs using AWS Glue Data Catalog failing due to inability to reach cluster drivers

Ensure the Databricks cluster's IAM role has necessary permissions to access AWS Glue Data Catalog, update the IAM policy, and restart the cluster....

Last updated: October 15th, 2024 by raphael.balogo

ALTER TABLE (drop partition) error in Unity Catalog external tables

For CSV, JSON, ORC, or data formats, use partition metadata logging. ...

Last updated: October 15th, 2024 by lakshay.goel

“java.lang.IllegalStateException: Unexpected type: JSON” error when creating an external table from BigQuery

Upgrade your cluster to Databricks Runtime 14.0 or above. ...

Last updated: October 22nd, 2024 by jessica.santos

Schema mismatch issue while reading parquet files

Fix the file schema or read the files separately. ...

Last updated: October 23rd, 2024 by lakshay.goel

Reading a CSV file in DROPMALFORMED still includes malformed rows in the result

...

Last updated: November 7th, 2024 by shubham.bhusate

Cannot see ingested data loaded from an external ORC table

Use the same Hive interface to ingest and read your Delta table. ...

Last updated: November 17th, 2024 by lakshay.goel

Security Bulletin: Databricks JDBC Driver Vulnerability Advisory - [CVE-2024-49194]

Restart any long-running clusters and update your JDBC driver to the latest version....

Last updated: December 11th, 2024 by Adam Pavlacka

Error when creating a Delta table using the UI and external data in Delta format

Create the table using a notebook instead. ...

Last updated: December 13th, 2024 by manikandan.ganesan

Error when trying to access Azure storage account from China region

Rule out common Apache Spark configuration issues and ensure your Spark configuration for your OAuth endpoint is set to the China region....

Last updated: January 22nd, 2025 by saikumar.divvela

KeyProviderException error when trying to create an external table on an external schema with authentication at the notebook level

Set up authorization at the cluster configuration level instead....

Last updated: January 31st, 2025 by Ernesto Calderón

Table or view not found error when trying to query a federated table using SQL serverless compute

Explicitly set up private connectivity for serverless....

Last updated: February 4th, 2025 by alberto.umana

Oracle Federation failing to find a data source

Upgrade compute runtime to 16.1 or use Pro SQL warehouse 2024.50...

Last updated: March 12th, 2025 by alberto.umana

Using LIKE statement causing slower performance in Lakehouse Federation query

Replace the LIKE statement in your query with filter options that can be passed as pushdown filters....

Last updated: March 19th, 2025 by allan.soares

Total size of serialized results of tasks is larger than spark.driver.maxResultSize when using ODBC connection

Enable Cloud Fetch to retrieve larger datasets....

Last updated: March 19th, 2025 by Lucas Ribeiro

Multiple identical files being written to badRecordsPath instead of just one file when writing code to read a CSV file as a DataFrame

Use .option("mode", "PERMISSIVE") instead....

Last updated: March 27th, 2025 by Vidhi Khaitan

Getting error when trying to connect to SFTP server from Databricks using passwordless authentication

Create an RSA authentication key to access a remote site from your Databricks account and preserve the private key....

Last updated: April 28th, 2025 by sravya.tanguturi

© Databricks 2022-2025. All rights reserved. Apache, Apache Spark, Spark, and the Spark logo are trademarks of the Apache Software Foundation.
