Problem
When performing MERGE or upsert operations, you observe that large numeric values stored in DOUBLE type columns are implicitly converted to scientific (exponential) notation when merged into target columns of type STRING. This conversion results in data inconsistencies in the target tables, especially when the original precision is critical (for example, the value 731220381673091 appears as 7.31220381673091E14 in the resulting table).
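The following minimal sketch reproduces the symptom described above. The table and column names are placeholders for illustration only.

-- Hypothetical demo tables; names are placeholders.
CREATE OR REPLACE TABLE target_demo (id INT, amount_str STRING);
CREATE OR REPLACE TABLE source_demo (id INT, amount_dbl DOUBLE);

INSERT INTO target_demo VALUES (1, NULL);
INSERT INTO source_demo VALUES (1, 731220381673091);

-- The DOUBLE source value is implicitly converted to STRING during the MERGE.
MERGE INTO target_demo t
USING source_demo s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET t.amount_str = s.amount_dbl;

-- Shows '7.31220381673091E14' instead of '731220381673091'.
SELECT amount_str FROM target_demo;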
Cause
Casting a DOUBLE to a STRING causes the JVM to invoke the Double.toString() method. This method automatically converts large values to scientific notation for readability and efficiency, unless explicitly formatted otherwise.
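You can observe the same conversion outside of a MERGE by casting a large DOUBLE value directly to STRING. The literal below is for illustration only.

-- Returns '7.31220381673091E14' rather than the plain digits.
SELECT CAST(CAST(731220381673091 AS DOUBLE) AS STRING);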
Solution
Explicitly cast or format the DOUBLE value to STRING before executing any MERGE or upsert operations. You can use either the format_number() function or the DECIMAL type.
Use format_number()
The format_number() function rounds the value to the number of decimal places specified by the second parameter (in this example, 0). Run the following query in a notebook or in the SQL editor to format the DOUBLE value as a STRING.
SELECT format_number(<column-name>, 0) AS <column-name> FROM <source-table>
Note
Use this option only if you do not need to preserve fractional precision.
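As a quick check (the literal value is for illustration only), applying format_number() to the value from the Problem statement returns all of the digits. Keep in mind that format_number() also inserts grouping separators (commas) into the result, so confirm that this format is acceptable for your target column.

-- Typically returns '731,220,381,673,091' (no exponent, but with grouping commas).
SELECT format_number(CAST(731220381673091 AS DOUBLE), 0);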
Use DECIMAL type
Define the source and target columns as DECIMAL instead of DOUBLE or STRING. Decimal types retain full precision.
Run the following SQL command either in a notebook or in the Databricks SQL editor.
ALTER TABLE <target-table> CHANGE COLUMN <column-name> TYPE DECIMAL(20, 0);
Using the DECIMAL type ensures that large numeric values are stored and displayed in their full form without loss of precision.
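As a quick illustration (literal value only), routing the DOUBLE value through DECIMAL(20, 0) before it is rendered as a STRING preserves the plain notation.

-- Returns '731220381673091' with no exponent.
SELECT CAST(CAST(CAST(731220381673091 AS DOUBLE) AS DECIMAL(20, 0)) AS STRING);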