Databricks Knowledge Base

SQL with Apache Spark (GCP)

These articles can help you use SQL with Apache Spark.

20 Articles in this category

Date functions only accept int values in Apache Spark 3.0

Date functions only accept int values in Apache Spark 3.0; fractional and string values return an AnalysisException error....

Last updated: February 28th, 2023 by Adam Pavlacka

Duplicate columns in the metadata error

A Spark job fails while processing a Delta table with an org.apache.spark.sql.AnalysisException: Found duplicate column(s) in the metadata error....

Last updated: May 23rd, 2022 by vikas.yadav

Generate unique increasing numeric values

Use Apache Spark functions to generate unique, increasing numbers in a column of a table, file, or DataFrame....

Last updated: May 23rd, 2022 by ram.sankarasubramanian

Error in SQL statement: AnalysisException: Table or view not found

Learn how to resolve the AnalysisException SQL error "Table or view not found"....

Last updated: May 23rd, 2022 by Adam Pavlacka

Error when downloading full results after join

If you have duplicate columns after a join, you will get an error when trying to download the full results....

Last updated: May 23rd, 2022 by manjunath.swamy

Error when running MSCK REPAIR TABLE in parallel

Do not run `MSCK REPAIR` commands in parallel. Doing so results in a read timed out or out of memory error message....

Last updated: May 23rd, 2022 by ashritha.laxminarayana

Find the size of a table

How to find the size of a table....

Last updated: May 23rd, 2022 by mathan.pillai

Inner join drops records in result

Avoid dropped records when performing an inner join....

Last updated: May 23rd, 2022 by siddharth.panchal

Data is incorrect when read from Snowflake

Data read from Snowflake is incorrect when the time zone value is not set correctly....

Last updated: May 24th, 2022 by DD Sharma

JDBC write fails with a PrimaryKeyViolation error

JDBC write to a SQL database fails with a `PrimaryKeyViolation` error or results in duplicate data....

Last updated: May 24th, 2022 by harikrishnan.kunhumveettil

Query does not skip header row on external table

External Hive tables do not skip the header row when queried from Spark SQL....

Last updated: May 24th, 2022 by manisha.jena

SHOW DATABASES command returns unexpected column name

Running the `SHOW DATABASES` command returns an unexpected column name....

Last updated: May 24th, 2022 by Jose Gonzalez

Cannot view table SerDe properties

SHOW CREATE TABLE only returns the Apache Spark DDL. It does not show the SerDe properties....

Last updated: July 1st, 2022 by saritha.shivakumar

Parsing post meridiem time (PM) with to_timestamp() returns null

When converting 12-hour time to 24-hour time with to_timestamp(), the hours pattern must be lowercase (hh, not HH)....

Last updated: July 22nd, 2022 by chetan.kardekar

to_json() results in Cannot use null as map key error

You must filter or replace null values in your input data before using to_json()....

Last updated: July 22nd, 2022 by gopal.goel

Set nullability when using SaveAsTable with Delta tables

Learn how to create a Delta table with the nullability of columns set to false....

Last updated: October 14th, 2022 by anshuman.sahu

Ensure consistency in statistics functions between Spark 3.0 and Spark 3.1 and above

Statistics functions in Databricks Runtime 7.3 LTS and below return NaN when a divide by zero occurs. Set a Spark config to return null instead....

Last updated: October 14th, 2022 by chetan.kardekar

Using datetime values in Spark 3.0 and above

How to correctly use datetime functions in Spark SQL with Databricks Runtime 7.3 LTS and above....

Last updated: October 26th, 2022 by deepak.bhutada

ANSI compliant DECIMAL precision and scale

Learn how to enable ANSI compliant error messages when incorrect values are used for DECIMAL precision and scale....

Last updated: October 29th, 2022 by saritha.shivakumar

Recreate LISTAGG functionality with Spark SQL

Use collect_list and concat_ws in Spark SQL to achieve the same functionality as LISTAGG on other platforms....

Last updated: February 24th, 2023 by manjunath.swamy


© Databricks 2022-2023. All rights reserved. Apache, Apache Spark, Spark, and the Spark logo are trademarks of the Apache Software Foundation.
