Replace a default library jar
Databricks includes a number of default Java and Scala libraries. You can replace any of these libraries with another version by using a cluster-scoped init script to remove the default library jar and then install the version you require.
Important
Removing default libraries and installing new versions may cause instability or completely break your Databricks cluster. You should thoroughly test any new library version in your environment before running production jobs.
Identify the artifact ID
To identify the name of the jar file you want to remove:
- Click the Databricks Runtime version you are using in the list of supported releases.
- Navigate to the Java and Scala libraries section.
- Identify the Artifact ID for the library you want to remove.
Use the artifact ID to find the jar filename
Use the `ls -l` command in a notebook to find the jar that contains the artifact ID. For example, to find the jar filename for the `spark-snowflake_2.12` artifact ID in Databricks Runtime 7.0, use the following code:
%sh
ls -l /databricks/jars/*spark-snowflake_2.12*
This returns the jar filename
`----workspace_spark_3_0--maven-trees--hive-2.3__hadoop-2.7--net.snowflake--spark-snowflake_2.12--net.snowflake__spark-snowflake_2.12__2.5.9-spark_2.4.jar`.
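The exact filename varies between Databricks Runtime versions, so before writing the init script you can confirm that the pattern matches exactly one jar (a minimal check, reusing the same glob pattern as above):
%sh
# Expect a count of exactly 1; a count of 0 or more than 1 means the
# glob pattern needs to be narrowed before using it in an init script
ls /databricks/jars/*spark-snowflake_2.12* | wc -l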
Create the init script
Use the following template to create a cluster-scoped init script.
#!/bin/bash
rm -rf /databricks/jars/<jar_filename_to_remove>.jar
cp /dbfs/<path_to_replacement_jar>/<replacement_jar_filename>.jar /databricks/jars/
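If you prefer a script that fails fast rather than risk leaving the cluster without the library, a more defensive variant is sketched below. The fail-fast check and the artifact-ID glob are assumptions added here, not part of the original template:
#!/bin/bash
set -e

REPLACEMENT=/dbfs/<path_to_replacement_jar>/<replacement_jar_filename>.jar

# Abort before removing anything if the replacement jar is missing
if [ ! -f "$REPLACEMENT" ]; then
  echo "Replacement jar not found: $REPLACEMENT" >&2
  exit 1
fi

# Remove the default jar by artifact-ID glob, so the script keeps
# working if the full filename changes between runtime versions
rm -f /databricks/jars/*<artifact_id>*.jar

cp "$REPLACEMENT" /databricks/jars/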
Using the `spark-snowflake_2.12` example from the prior step results in an init script similar to the following:
#!/bin/bash
rm -rf /databricks/jars/----workspace_spark_3_0--maven-trees--hive-2.3__hadoop-2.7--net.snowflake--spark-snowflake_2.12--net.snowflake__spark-snowflake_2.12__2.5.9-spark_2.4.jar
cp /dbfs/FileStore/jars/e43fe9db_c48d_412b_b142_cdde10250800-spark_snowflake_2_11_2_7_1_spark_2_4-b2adc.jar /databricks/jars/
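The init script file must be stored where the cluster can read it. One way to create it from a notebook is to write it through the DBFS FUSE mount (a sketch; the /databricks/scripts destination path is an arbitrary choice, not a required location):
%sh
# Write the init script to DBFS through the local FUSE mount
mkdir -p /dbfs/databricks/scripts
cat > /dbfs/databricks/scripts/replace-spark-snowflake.sh <<'EOF'
#!/bin/bash
rm -rf /databricks/jars/----workspace_spark_3_0--maven-trees--hive-2.3__hadoop-2.7--net.snowflake--spark-snowflake_2.12--net.snowflake__spark-snowflake_2.12__2.5.9-spark_2.4.jar
cp /dbfs/FileStore/jars/e43fe9db_c48d_412b_b142_cdde10250800-spark_snowflake_2_11_2_7_1_spark_2_4-b2adc.jar /databricks/jars/
EOF
# Print the script back to confirm it was written correctly
cat /dbfs/databricks/scripts/replace-spark-snowflake.sh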
Install the init script and restart
- Install the cluster-scoped init script on the cluster, following the instructions in Configure a cluster-scoped init script.
- Restart the cluster.
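Once the cluster is back up, it is worth confirming that the swap took effect (a quick check reusing the earlier glob; adjust the pattern to your artifact ID):
%sh
# Only the replacement jar should be listed; the default jar's
# filename should no longer appear
ls -l /databricks/jars/*spark-snowflake*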