How to import a custom CA certificate

Learn how to import a custom CA certificate into your Databricks cluster for Python use.

Written by arjun.kaimaparambilrajan

Last published at: August 11th, 2023

When working with Python, you may want to import a custom CA certificate to avoid connection errors to your endpoints.

ConnectionError: HTTPSConnectionPool(host='my_server_endpoint', port=443): Max retries exceeded with url: /endpoint (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fb73dc3b3d0>: Failed to establish a new connection: [Errno 110] Connection timed out',))

To import one or more custom CA certificates to your Databricks cluster:

  1. Create an init script that adds the entire CA chain and sets the REQUESTS_CA_BUNDLE property.

    In this example, PEM format CA certificates are added to the file myca.crt, which is located at /usr/local/share/ca-certificates/. This file is referenced in the custom-cert.sh init script.
  2. Once you have created the init script, you can delete the certificates from that location.

    dbutils.fs.put("/databricks/init-scripts/custom-cert.sh", """#!/bin/bash
    
    cat << 'EOF' > /usr/local/share/ca-certificates/myca.crt
    -----BEGIN CERTIFICATE-----
    <CA CHAIN 1 CERTIFICATE CONTENT>
    -----END CERTIFICATE-----
    -----BEGIN CERTIFICATE-----
    <CA CHAIN 2 CERTIFICATE CONTENT>
    -----END CERTIFICATE-----
    EOF
    
    update-ca-certificates
    
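    # update-ca-certificates links myca.crt into /etc/ssl/certs as myca.pem.
    PEM_FILE="/etc/ssl/certs/myca.pem"
    # Password of the Java keystore that receives the certificates.
    PASSWORD="<password>"
    JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")
    KEYSTORE="$JAVA_HOME/lib/security/cacerts"
    
    # Count the certificates in the PEM file by counting END markers.
    CERTS=$(grep -c 'END CERTIFICATE' "$PEM_FILE")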
    
    # To process multiple certs with keytool, you need to extract
    # each one from the PEM file and import it into the Java KeyStore.
    
    for N in $(seq 0 $(($CERTS - 1))); do
      ALIAS="$(basename $PEM_FILE)-$N"
      echo "Adding to keystore with alias:$ALIAS"
      cat $PEM_FILE |
        awk "n==$N { print }; /END CERTIFICATE/ { n++ }" |
        keytool -noprompt -import -trustcacerts \
                -alias $ALIAS -keystore $KEYSTORE -storepass $PASSWORD
    done
    
    echo "export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt" >> /databricks/spark/conf/spark-env.sh
    """)

    To use your custom CA certificates with DBFS FUSE (AWS | Azure | GCP), add this line to the bottom of your init script:

    /databricks/spark/scripts/restart_dbfs_fuse_daemon.sh
  3. Attach the init script to the cluster as a cluster-scoped init script (AWS | Azure | GCP).
  4. Restart the cluster.
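
After the restart, you may want to confirm from a notebook that the bundle is visible to Python. The following is a minimal sketch, not part of the documented procedure: the helper names split_pem_bundle and describe_ca_bundle are hypothetical, and it assumes the notebook's Python process inherits the REQUESTS_CA_BUNDLE variable exported by the init script. The first helper splits a PEM chain into individual certificates, much as the awk loop in the init script does before calling keytool.

```python
import os
import re

# Matches one PEM certificate block, including its BEGIN/END markers.
PEM_CERT_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def split_pem_bundle(pem_text):
    """Return each certificate block found in a PEM bundle, in order."""
    return PEM_CERT_RE.findall(pem_text)


def describe_ca_bundle(env=None):
    """Report where REQUESTS_CA_BUNDLE points and how many certificates it holds.

    `env` is a hypothetical parameter for testing; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    path = env.get("REQUESTS_CA_BUNDLE")
    if not path:
        return "REQUESTS_CA_BUNDLE is not set"
    if not os.path.isfile(path):
        return f"REQUESTS_CA_BUNDLE points to a missing file: {path}"
    with open(path, encoding="utf-8", errors="replace") as handle:
        count = len(split_pem_bundle(handle.read()))
    return f"{path} contains {count} certificate(s)"
```

Running print(describe_ca_bundle()) in a notebook cell shows whether the variable is set and how many certificates the referenced file contains. It does not check that any particular endpoint trusts the chain.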