Problem
Databricks recognizes a set of ANSI reserved words as keywords. When you use reserved words as column names in your tables, queries can behave unexpectedly depending on whether or not a column with that name is present.
For example, you attempt to select a column 'user' from a table after creating the table and inserting values, using the following code.
```sql
CREATE OR REPLACE TABLE <database-name>.<schema-name>.<table-name> (
user STRING,
column2 STRING
);
INSERT INTO <database-name>.<schema-name>.<table-name> (user, column2) VALUES
('dummy1', 'dummy2'),
('dummy3', 'dummy4');
SELECT user FROM <database-name>.<schema-name>.<table-name>;
```
Because the table has a 'user' column, the query returns the values in this column: 'dummy1' and 'dummy3'. If the table in the previous code did not have a 'user' column, running SELECT user would instead return the current user's username.
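The following is a minimal sketch of that fallback behavior, using a hypothetical table that has no 'user' column. With no column to resolve the name against, user is interpreted as the built-in function that returns the current session user, equivalent to current_user().
```sql
-- Hypothetical table with no 'user' column.
CREATE OR REPLACE TABLE <database-name>.<schema-name>.<table-without-user> (
  column1 STRING
);

INSERT INTO <database-name>.<schema-name>.<table-without-user> (column1) VALUES ('dummy1');

-- 'user' no longer matches a column, so it resolves to the current session user,
-- the same value returned by current_user().
SELECT user, current_user() FROM <database-name>.<schema-name>.<table-without-user>;
```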
Cause
Databricks does not specifically prohibit the use of reserved words. For more information and a list of reserved words, review the Reserved words and schemas (AWS | Azure | GCP) documentation.
Solution
Set the following configurations at the session or compute level to enforce the use of ANSI SQL syntax, which prohibits the use of reserved words as column names.
```sql
SET spark.sql.ansi.enforceReservedKeywords = true;
SET spark.sql.ansi.enabled = true;
```
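To confirm that the settings are active in your current session, you can read them back with SET. This is just a quick sanity check, not a required step.
```sql
-- Returns the current value of each configuration key.
SET spark.sql.ansi.enforceReservedKeywords;
SET spark.sql.ansi.enabled;
```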
Any attempt to use a reserved word as an identifier is then blocked and returns the following error.
ExecutionException: org.apache.spark.sql.AnalysisException: [PARSE_SYNTAX_ERROR] Syntax error at or near '<reserved-word>'. SQLSTATE: 42601
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
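If you still need a column whose name matches a reserved word after enabling these settings, the usual approach is to quote the identifier with backticks so the parser treats it as a regular column name rather than a keyword. The sketch below reuses the same placeholder table names as earlier.
```sql
-- With ANSI enforcement on, an unquoted reserved word fails to parse:
-- CREATE OR REPLACE TABLE <database-name>.<schema-name>.<table-name> (user STRING);

-- Backtick-quoting the identifier lets you keep the column name.
CREATE OR REPLACE TABLE <database-name>.<schema-name>.<table-name> (
  `user` STRING,
  column2 STRING
);

SELECT `user` FROM <database-name>.<schema-name>.<table-name>;
```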
For more information, refer to the ANSI compliance in Databricks Runtime (AWS | Azure | GCP) documentation.