Cannot delete data using JDBC in Eclipse

Delete operations on a Delta table fail with a SparkJDBCDriver error when using JDBC in an Eclipse local environment.

Written by annapurna.hiriyur

Last published at: May 10th, 2022

Problem

You cannot delete data from a Delta table using JDBC from your local Eclipse environment. The same delete operation works when run in a notebook.

You can connect to Databricks from your local environment.

A select query works:

Integer count = namedlocalJdbcTemplate.queryForObject("SELECT COUNT(*) FROM <table-name>", new MapSqlParameterSource(), Integer.class);

When you attempt a delete operation, it fails:

namedlocalJdbcTemplate.update("DELETE FROM <table-name> WHERE id = :Id", parameters);

where parameters is defined as:

MapSqlParameterSource parameters = new MapSqlParameterSource();
parameters.addValue("Id", "001");

You receive a SparkJDBCDriver error message:

org.springframework.jdbc.UncategorizedSQLException: PreparedStatementCallback; uncategorized SQLException for SQL [DELETE FROM <table-name> WHERE Id = ?]; SQL state [HY000]; error code [500051]; [Simba][SparkJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: Error running query: org.apache.spark.sql.AnalysisException: cannot resolve '`<table-name>.Id`' given input columns: []; line 1 pos 41;
'DeleteCommandEdge Delta[version=0, s3://<path-to-database>], ('<table-name>.Id = 001)

Cause

This error can occur when UseNativeQuery has a value of 0.

When UseNativeQuery has a value of 0, the JDBC driver parses the submitted query and rewrites it. Some tools, such as Power BI, depend on this translation. For example, Power BI generates standard ANSI SQL, which must be translated into Spark SQL. However, the rewriting adds unnecessary overhead if the application already generates Spark SQL.

Solution

Set UseNativeQuery to a value of 1 in the JDBC connection string in your IDE. This forces the JDBC driver to pass the submitted query to Spark without rewriting it.


Info

UseNativeQuery=0 treats the submitted query as ANSI SQL-92 and rewrites it into Spark SQL. UseNativeQuery=1 submits the query as-is, with no rewriting.

String CONNECTION_URL = "jdbc:spark://<server-hostname>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/xxxx-xxxxxx-xxxxxxxx;AuthMech=3;UID=token;PWD=<personal-access-token>;UseNativeQuery=1";
Connection connection = DriverManager.getConnection(CONNECTION_URL);
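To connect the fix back to the original problem, here is a minimal sketch of the failing delete rewritten against the updated connection string. It reuses the CONNECTION_URL defined above; DriverManagerDataSource and NamedParameterJdbcTemplate are standard spring-jdbc classes, and the table name remains a placeholder.

import org.springframework.jdbc.core.namedparam.MapSqlParameterSource;
import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate;
import org.springframework.jdbc.datasource.DriverManagerDataSource;

// Build the template on a DataSource that uses the UseNativeQuery=1 URL.
DriverManagerDataSource dataSource = new DriverManagerDataSource(CONNECTION_URL);
NamedParameterJdbcTemplate namedlocalJdbcTemplate = new NamedParameterJdbcTemplate(dataSource);

// The delete that previously failed is now passed to Spark verbatim.
MapSqlParameterSource parameters = new MapSqlParameterSource();
parameters.addValue("Id", "001");
int rowsDeleted = namedlocalJdbcTemplate.update("DELETE FROM <table-name> WHERE id = :Id", parameters);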

Review the Initializing the Driver Class section of the Simba JDBC driver documentation for more information on establishing a JDBC connection.
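If your environment does not register the driver automatically, you may also need an explicit class load before requesting the connection. The class name below is the one commonly documented for the Simba Spark JDBC driver; verify it against your driver version.

// Load the Simba Spark JDBC driver class (only needed if it is not
// auto-registered via the JDBC 4.0+ service loader mechanism).
Class.forName("com.simba.spark.jdbc.Driver");
Connection connection = DriverManager.getConnection(CONNECTION_URL);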

After you set UseNativeQuery to a value of 1, you must rewrite your insert queries so that they do not specify column names, because Spark SQL does not natively support column names in the INSERT command.

For example, INSERT INTO <table-name> VALUES (<?>,<?>,<?>,<?>,<?>,<?>,<?>,<?>).
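As an illustrative sketch only, a parameterized insert into a hypothetical three-column table would omit the column list entirely. The connection variable is the java.sql.Connection from the earlier snippet, and the table name and values are placeholders.

// No column list after the table name; the values must match the table's
// column order and count.
String insertSql = "INSERT INTO <table-name> VALUES (?, ?, ?)";
try (PreparedStatement stmt = connection.prepareStatement(insertSql)) {
    stmt.setString(1, "001");
    stmt.setString(2, "example");
    stmt.setInt(3, 42);
    stmt.executeUpdate();
}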