Error when trying to connect to an SFTP server from Databricks using passwordless authentication

Create an RSA authentication key to access a remote site from your Databricks account and preserve the private key.

Written by sravya.tanguturi

Last published at: April 28th, 2025

Problem

When you try to connect to an SFTP server from Databricks using passwordless authentication, you receive the following error.

Error message: Host key verification failed

Cause

When a cluster restarts, the data stored on its local disk is deleted. The private key stored there is not preserved, resulting in host key verification failures.


Solution

Create an RSA authentication key to access a remote site from your Databricks account and preserve the private key.


Generate an SSH key pair

First, create an SSH key pair, either inside or outside Databricks. To create an RSA key pair in Databricks, run the following command in a Databricks notebook.

%sh
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa


This command creates an RSA key pair without a passphrase at the following locations.

  • Private key: ~/.ssh/id_rsa
  • Public key: ~/.ssh/id_rsa.pub
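
To confirm the key pair was created, you can list the generated files. This is an optional check, not part of the procedure itself.

%sh
# Verify that both key files exist with the expected permissions
ls -l ~/.ssh/id_rsa ~/.ssh/id_rsa.pub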


Preserve the public and private keys

  1. Copy or upload the generated public and private keys to a secure location, such as workspace files, cloud storage, or a volume (see the sketch after the init script below).
  2. Create a cluster init script and attach it to the cluster to automate restoring the SSH keys during startup. The init script copies the SSH keys from your secure location to the appropriate location on the cluster.
#!/bin/bash
sleep 5
# Nodes don't have a .ssh directory by default
mkdir -p /root/.ssh/
# Copy the private key to .ssh
cp <source-path-for-private-key> /root/.ssh/id_rsa
# Restrict the permissions of the private key file
chmod 400 /root/.ssh/id_rsa
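
As one way to handle step 1, the following sketch copies the generated keys from the driver's local disk to durable storage. The volume path /Volumes/main/default/ssh_keys is a hypothetical example; substitute your own secure location.

%sh
# Persist the key pair outside the cluster's ephemeral local disk.
# /Volumes/main/default/ssh_keys is a hypothetical volume path.
cp ~/.ssh/id_rsa /Volumes/main/default/ssh_keys/id_rsa
cp ~/.ssh/id_rsa.pub /Volumes/main/default/ssh_keys/id_rsa.pub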


For more information on creating an init script, refer to the What are init scripts? (AWS | Azure | GCP) documentation.


Copy the public key to the remote server

Copy the public key (id_rsa.pub) to the remote server's ~/.ssh/authorized_keys file. Ensure that the permissions on the remote server are set correctly to avoid access issues: typically 700 for the ~/.ssh directory and 600 for the authorized_keys file.
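
For reference, here is a minimal sketch of the commands you might run on the remote server, assuming the public key has been staged there as ~/id_rsa.pub (a hypothetical path):

# Run on the remote server, not in Databricks.
mkdir -p ~/.ssh
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
# OpenSSH requires restrictive permissions on these paths.
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys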


Test the connection from a Databricks notebook

%sh
# Run a simple remote command to confirm passwordless SSH works
ssh user@remote_host "echo Connection successful"


Or

%sh
sftp user@remote_host
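
Both commands above open an interactive session, which can hang in a notebook. As a non-interactive alternative, the following sketch runs SFTP in batch mode and simply lists the remote directory; it assumes the key restore from the init script is in place.

%sh
# -b - reads batch commands from stdin; the session exits after running them.
sftp -b - user@remote_host <<'EOF'
ls
EOF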