to_json() results in Cannot use null as map key error

You must filter or replace null values in your input data before using to_json().

Written by gopal.goel

Last published at: July 22nd, 2022

Problem

You are using to_json() to convert data to JSON and you get a Cannot use null as map key error:

RuntimeException: Cannot use null as map key. 

Cause

to_json() function does not support using null values as the input map keys.

This example code causes the Cannot use null as map key error when run, because of the null value used as a map key in the fourth line.

%sql

select
  to_json(
    map(
      1, 'Databricks',
      2, 'Map',
      3, 'Error',
      null, 'Data'
    )
  ) as json;

Solution 

You should filter out any null values present in the input data before running to_json(), or use nvl() to replace all of the null values with non-null values.

Filter null values

Consider this example DataFrame:

+---+----------+-------+
| Id|     Value|address|
+---+----------+-------+
|  1|Databricks|   null|
|  2|       Map|   null|
|  3|     Error|    xyz|
+---+----------+-------+

There are two null values in the example.

Attempting to use to_json() on this DataFrame will return an error.

We can filter the null data by showing only the rows that have non-null values.

For example, filtering with df.filter("address is not null").show() returns:

+---+-----+-------+
| Id|Value|address|
+---+-----+-------+
|  3|Error|    xyz|
+---+-----+-------+

This filtered DataFrame does not contain any null values, so it can now be used as an input with to_json().

Replace null values with replacements

If you cannot filter out the null values, you can use nvl() to replace the null values with non-null values. 

The sample code originally had a null value as the map key for the fourth line. Since that results in an error, nvl() is used in this updated example to substitute 4 for the null value.

%sql

select
  to_json(
    map(
      1, 'Databricks',
      2, 'Map',
      3, 'Error',
 nvl(null, 4), 'Data'
    )
  ) as JSON;