Redshift JDBC driver conflict issue

Problem

If you attach multiple Redshift JDBC drivers to a cluster and use the Redshift connector, the notebook REPL might hang or crash with a SQLDriverWrapper error message similar to the following:

19/11/14 01:01:44 ERROR SQLDriverWrapper: Fatal non-user error thrown in ReplId-9d455-9b970-b2042
java.lang.NoSuchFieldError: PG_SUBPROTOCOL_NAMES
        at com.amazon.redshift.jdbc.Driver.getSubProtocols(Unknown Source)
        at com.amazon.redshift.jdbc.Driver.acceptsSubProtocol(Unknown Source)
        at com.amazon.jdbc.common.BaseConnectionFactory.acceptsURL(Unknown Source)
        at com.amazon.jdbc.common.AbstractDriver.connect(Unknown Source)
        at org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper.connect(DriverWrapper.scala:45)
        at com.databricks.spark.redshift.JDBCWrapper.getConnector(RedshiftJDBCWrapper.scala:355)
        at com.databricks.spark.redshift.JDBCWrapper.getConnector(RedshiftJDBCWrapper.scala:376)
        at com.databricks.spark.redshift.RedshiftRelation$$anonfun$schema$1.apply(RedshiftRelation.scala:75)
        at com.databricks.spark.redshift.RedshiftRelation$$anonfun$schema$1.apply(RedshiftRelation.scala:72)

Cause

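Databricks Runtime does not include a Redshift JDBC driver. If you are using Redshift, you must attach the correct driver to your cluster. If you attach more than one Redshift JDBC driver to the same cluster, the drivers may be incompatible with each other, which can cause the notebook REPL to hang or crash. A java.lang.NoSuchFieldError, such as the one in the stack trace above, typically indicates that classes compiled against different versions of a library were loaded together.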

For example, the following Redshift JDBC jars are incompatible:

  • RedshiftJDBC41-1.1.7.1007.jar
  • RedshiftJDBC42-no-awssdk-1.2.20.1043.jar

If you attach both of these to the same cluster, the SQLDriverWrapper error message will appear when you try to access Redshift.
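To see at a glance how the two jars differ, the file names can be split into their parts. The following is a minimal sketch, not an official tool: the regular expression is inferred only from the two file names listed above, so other driver releases may be named differently, and the names parse_redshift_jar and RedshiftJar are hypothetical.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Pattern inferred from the two example file names only (an assumption):
#   RedshiftJDBC<api>[-<variant>]-<dotted version>.jar
JAR_PATTERN = re.compile(r"^RedshiftJDBC(\d+)(?:-([a-z-]+?))?-(\d+(?:\.\d+)+)\.jar$")


class RedshiftJar(NamedTuple):
    jdbc_api: str             # digits after "RedshiftJDBC", e.g. "41" or "42"
    variant: Optional[str]    # e.g. "no-awssdk", or None if absent
    version: Tuple[int, ...]  # driver version as integers, e.g. (1, 1, 7, 1007)


def parse_redshift_jar(file_name: str) -> Optional[RedshiftJar]:
    """Return the parsed parts of a Redshift JDBC jar name, or None if it does not match."""
    match = JAR_PATTERN.match(file_name)
    if match is None:
        return None
    api, variant, version = match.groups()
    return RedshiftJar(api, variant, tuple(int(part) for part in version.split(".")))


if __name__ == "__main__":
    for name in ("RedshiftJDBC41-1.1.7.1007.jar",
                 "RedshiftJDBC42-no-awssdk-1.2.20.1043.jar"):
        print(name, "->", parse_redshift_jar(name))
```

Parsing the names only makes the differences between the jars visible. It does not prove that a given pair is compatible or incompatible.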

Solution

Attach only one Redshift JDBC driver to a cluster. If more than one is attached, detach all but one. Review the Redshift JDBC Driver documentation to choose the correct driver for your cluster.
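As a quick sanity check, you can scan the list of libraries attached to a cluster for more than one Redshift driver. The sketch below assumes you have already collected the library file names by hand. It identifies drivers purely by the "RedshiftJDBC" file-name prefix seen in the examples above, which is an assumption and will miss renamed jars. The function names find_redshift_drivers and has_driver_conflict are hypothetical.

```python
from typing import Iterable, List


def find_redshift_drivers(library_names: Iterable[str]) -> List[str]:
    """Return the file names that look like Redshift JDBC driver jars.

    Assumption: driver jars start with "RedshiftJDBC" and end with ".jar",
    as in the two example file names in this article.
    """
    drivers = []
    for name in library_names:
        # Keep only the last path component in case a full path was supplied.
        file_name = name.rsplit("/", 1)[-1]
        lowered = file_name.lower()
        if lowered.startswith("redshiftjdbc") and lowered.endswith(".jar"):
            drivers.append(file_name)
    return drivers


def has_driver_conflict(library_names: Iterable[str]) -> bool:
    """True if more than one Redshift JDBC driver jar appears in the list."""
    return len(find_redshift_drivers(library_names)) > 1


if __name__ == "__main__":
    attached = [
        "RedshiftJDBC41-1.1.7.1007.jar",
        "RedshiftJDBC42-no-awssdk-1.2.20.1043.jar",
        "some-other-library.jar",  # hypothetical unrelated library
    ]
    if has_driver_conflict(attached):
        print("Multiple Redshift JDBC drivers attached:")
        for jar in find_redshift_drivers(attached):
            print("  " + jar)
```

If the check reports more than one driver, keep the one that matches your cluster and detach the rest.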