GeoSpark undefined function error with DBConnect


You are trying to use the GeoSpark function st_geofromwkt with DBConnect and you get an Apache Spark error message.

Error: org.apache.spark.sql.AnalysisException: Undefined function: 'st_geomfromwkt'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.;

This example code fails with the error when used with DBConnect.

val sc = spark.sparkContext

val sqlContext = spark.sqlContext


spark.sql("select ST_GeomFromWKT(area) AS geometry from polygon").show()


DBConnect does not support auto-sync of client side UDFs to the server.


You can use a custom utility jar with code that registers the UDF on the cluster using the SparkSessionExtensions class.

  1. Create a utility jar that registers GeoSpark functions using SparkSessionExtensions. This utility class definition can be built into a utility jar.

    package com.databricks.spark.utils
    import org.apache.spark.sql.SparkSessionExtensions
    import org.datasyslab.geosparksql.utils.GeoSparkSQLRegistrator
    class GeoSparkUdfExtension extends (SparkSessionExtensions => Unit) {
      def apply(e: SparkSessionExtensions): Unit = {
        e.injectCheckRule(spark => {
          println("INJECTING UDF")
          _ => Unit
  2. Copy the GeoSpark jars and your utility jar to DBFS at dbfs:/databricks/geospark-extension-jars/.

  3. Create an init script ( that copies the jars from the DBFS location to the Spark class path and sets the spark.sql.extensions to the utility class.

          |sleep 10s
          |# Copy the extension and GeoSpark dependency jars to /databricks/jars.
          |cp -v /dbfs/databricks/geospark-extension-jars/{spark_geospark_extension_2_11_0_1.jar,geospark_sql_2_3_1_2_0.jar,geospark_1_2_0.jar} /databricks/jars/
          |# Set the extension.
          |cat << 'EOF' > /databricks/driver/conf/00-custom-spark.conf
          |[driver] {
          |    "spark.sql.extensions" = "com.databricks.spark.utils.GeoSparkUdfExtension"
        overwrite = true
  4. Install the init script as a cluster-scoped init script. You will need the full path to the location of the script (dbfs:/databricks/<init-script-folder>/

  5. Reboot your cluster.

  6. You can now use GeoSpark code with DBConnect.