Unresolved column error when using Apache Spark Connect to run a query to create a temporary view

Use unique names for each temporary view.

Written by raul.goncalves

Last published at: March 19th, 2025

Problem

When you create a new temporary view using Apache Spark Connect, you encounter the following error.

[UNRESOLVED_COLUMN.WITH_SUGGESTION] A column, variable, or function parameter with name `col1` cannot be resolved. Did you mean one of the following? [`test_col`]. SQLSTATE: 42703

 

This error happens even when you know that the column exists and can be resolved.

 

Example code

In the following code, the view is defined and then redefined under the same name, using a query that reads from the view itself.

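# Assumes `spark` is an active Spark Connect session
df = spark.sql("select 'test' as col1")

# Create the temporary view
df.createOrReplaceTempView('temp_view')

# Query the temporary view, then save the result under the same name
df = spark.sql("select col1 as test_col from temp_view")
df.createOrReplaceTempView('temp_view')

# The action triggers analysis and raises the UNRESOLVED_COLUMN error
df.count()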

 

Cause

Temporary views in Spark Connect are lazily analyzed. If a temporary view changes, the change is not validated until the temporary view is called.

 

When the temporary view is called, it is evaluated and updated. Because the temporary view was recreated, it has no reference to previous versions of itself, including the columns they defined. In the example, the recreated temp_view exposes only test_col, so the query that selects col1 from temp_view fails with the unresolved column error.
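
The sequence of events can be illustrated with a toy model in plain Python. This is only a sketch of name-based resolution, not Spark's internals: the views dictionary, the plan dictionary, and the analyze function are hypothetical stand-ins, and none of them are Spark APIs.

```python
# Toy model only: plain Python standing in for name-based view resolution.
# None of these names are Spark APIs.
views = {}  # view name -> columns the view currently exposes


def analyze(plan):
    """Resolve the plan's column against the view's current definition."""
    available = views[plan["view"]]
    if plan["column"] not in available:
        raise LookupError(
            f"{plan['column']} cannot be resolved. Did you mean one of {available}?"
        )
    return [plan["alias"]]  # columns produced by the plan


views["temp_view"] = ["col1"]

# Stands in for: select col1 as test_col from temp_view
plan = {"view": "temp_view", "column": "col1", "alias": "test_col"}

# Re-registering under the same name replaces the earlier definition
views["temp_view"] = analyze(plan)

# A later action analyzes the plan again, now against the new definition
try:
    analyze(plan)
    error = None
except LookupError as exc:
    error = str(exc)

print(error)
```

In this model the second analysis fails because the name temp_view now points at a definition that only exposes test_col, which mirrors the suggestion shown in the error message.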

 

Solution

When working with temporary views, use unique names for each temporary view. 
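
As a minimal sketch of this approach, the example can be rewritten so that each view gets its own name. The helper unique_view_name and the function build_renamed_view are hypothetical names introduced for illustration, and spark is assumed to be an active Spark Connect session. Fixed distinct names such as temp_view_1 and temp_view_2 would work equally well.

```python
import uuid


def unique_view_name(prefix):
    """Return a view name that is unlikely to collide with an earlier one."""
    # uuid4().hex contains only 0-9 and a-f, so the name needs no quoting
    return f"{prefix}_{uuid.uuid4().hex}"


def build_renamed_view(spark):
    """Rebuild the example with a distinct name for each temporary view.

    `spark` is assumed to be an active Spark Connect session.
    """
    df = spark.sql("select 'test' as col1")
    first_view = unique_view_name("temp_view")
    df.createOrReplaceTempView(first_view)

    # Read from the first view and register the result under a new name,
    # so the first view and its col1 column stay available
    df = spark.sql(f"select col1 as test_col from {first_view}")
    second_view = unique_view_name("temp_view")
    df.createOrReplaceTempView(second_view)
    return df, second_view
```

Because the first view is never replaced, a later action such as df.count() still finds col1 when the query is analyzed.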

 

If possible, consider using DataFrames instead of temporary views. For more information, refer to the Tutorial: Load and transform data using Apache Spark DataFrames (AWS | Azure | GCP) documentation.
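
For the example in this article, a DataFrame-only version might look like the following sketch. The function name rename_without_view is hypothetical, and spark is assumed to be an active Spark Connect session.

```python
def rename_without_view(spark):
    """Rename col1 to test_col directly on the DataFrame.

    `spark` is assumed to be an active Spark Connect session.
    """
    df = spark.sql("select 'test' as col1")
    # withColumnRenamed returns a new DataFrame; no temporary view is involved
    return df.withColumnRenamed("col1", "test_col")
```

Calling rename_without_view(spark).count() then runs without looking up any view by name.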