Databricks includes a number of default Java and Scala libraries. You can replace any of these libraries with another version by using a cluster-scoped init script to remove the default library jar and then install the version you require.
Removing default libraries and installing new versions may cause instability or completely break your Databricks cluster. You should thoroughly test any new library version in your environment before running production jobs.
To identify the name of the jar file you want to remove:
- Click the Databricks Runtime version you are using from the list of supported releases.
- Navigate to the Java and Scala libraries section.
- Identify the Artifact ID for the library you want to remove.
Use the `%sh ls -l` command in a notebook to find the jar that contains the artifact ID. For example, to find the jar filename for the spark-snowflake_2.12 artifact ID in Databricks Runtime 7.0, you can use the following code:
%sh ls -l /databricks/jars/*spark-snowflake_2.12*
This returns the jar filename.
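The lookup above can be sketched as a small helper function. This is a sketch, not part of the documented procedure; parameterizing the jar directory is an assumption so the lookup can be exercised outside a cluster, where the default location is `/databricks/jars`.

```shell
#!/bin/bash
# Sketch: list the default jar(s) whose filename contains an artifact ID.
# jar_dir is /databricks/jars on a cluster; passing it as a parameter is
# an assumption for illustration.
find_artifact_jar() {
  local jar_dir="$1" artifact_id="$2"
  ls "${jar_dir}"/*"${artifact_id}"* 2>/dev/null
}

# Hypothetical usage on a cluster:
# find_artifact_jar /databricks/jars spark-snowflake_2.12
```

If the pattern matches more than one file, inspect the output and pick the jar whose name matches the Artifact ID exactly.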
Use the following template to create a cluster-scoped init script:
#!/bin/bash
rm -rf /databricks/jars/<jar_filename_to_remove>.jar
cp /dbfs/<path_to_replacement_jar>/<replacement_jar_filename>.jar /databricks/jars/
Using the spark-snowflake_2.12 example from the prior step results in an init script similar to the following:
#!/bin/bash
rm -rf /databricks/jars/----workspace_spark_3_0--maven-trees--hive-2.3__hadoop-2.7--net.snowflake--spark-snowflake_2.12--net.snowflake__spark-snowflake_2.12__2.5.9-spark_2.4.jar
cp /dbfs/FileStore/jars/e43fe9db_c48d_412b_b142_cdde10250800-spark_snowflake_2_11_2_7_1_spark_2_4-b2adc.jar /databricks/jars/
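Because the template removes the default jar unconditionally, a cluster can be left without the library if the replacement path is wrong. The following is a more defensive sketch of the same logic, not the documented template; the function name and parameterization are assumptions for illustration.

```shell
#!/bin/bash
# Sketch (not the documented template): verify the replacement jar exists
# before removing the default jar, so a bad path aborts the swap instead
# of leaving the cluster without the library.
replace_jar() {
  local old_jar="$1" new_jar="$2" jar_dir="$3"
  if [ ! -f "${new_jar}" ]; then
    echo "Replacement jar not found: ${new_jar}" >&2
    return 1
  fi
  rm -f "${old_jar}"
  cp "${new_jar}" "${jar_dir}/"
}

# Hypothetical usage with the placeholder paths from the template:
# replace_jar /databricks/jars/<jar_filename_to_remove>.jar \
#             /dbfs/<path_to_replacement_jar>/<replacement_jar_filename>.jar \
#             /databricks/jars
```

A nonzero exit from an init script surfaces as a cluster start failure, which is easier to diagnose than a missing library at job runtime.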
- Install the cluster-scoped init script on the cluster, following the instructions in Configure a cluster-scoped init script.
- Restart the cluster.
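After the restart, you can confirm the swap from a notebook with a `%sh` check like the `ls` command used earlier. The helper below is a sketch of that check, not part of the documented procedure; the function name and filenames are hypothetical.

```shell
#!/bin/bash
# Sketch: succeed only when the removed default jar is gone and the
# replacement jar is present. On a cluster, jar_dir is /databricks/jars;
# the filename arguments are hypothetical placeholders.
jar_swapped() {
  local jar_dir="$1" removed_jar="$2" replacement_jar="$3"
  [ ! -f "${jar_dir}/${removed_jar}" ] && [ -f "${jar_dir}/${replacement_jar}" ]
}
```

Equivalently, rerunning `%sh ls -l /databricks/jars/*spark-snowflake_2.12*` should now list only the replacement jar.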