Access S3 with temporary session credentials

Extract IAM session credentials and use them to access S3 storage via S3A URI. Requires Databricks Runtime 8.3 and above.

Written by Gobinath.Viswanathan

Last published at: May 16th, 2022

You can use IAM session tokens with Hadoop config support to access S3 storage in Databricks Runtime 8.3 and above.



You cannot mount the S3 path as a DBFS mount when using session credentials. You must use the S3A URI.

Extract the session credentials from your cluster

You will need the Instance Profile from your cluster. This can be found under Advanced Options in the cluster configuration.

Use curl to display the AccessKeyId, SecretAccessKey, and Token.
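A minimal sketch of the curl call, assuming the standard EC2 instance metadata endpoint (169.254.169.254) with your cluster's instance profile name substituted for the <instance-profile> placeholder. It only works when run from the cluster itself, for example via %sh in a notebook:

```shell
# Query the EC2 instance metadata service for the session credentials.
# Replace <instance-profile> with the instance profile name shown under
# Advanced Options in the cluster configuration. The metadata address
# only resolves from within the instance.
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/<instance-profile>
```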



Alternatively, you can use a Python script.


import requests

# 169.254.169.254 is the EC2 instance metadata service. Replace
# <instance-profile> with the instance profile name from your cluster.
response = requests.get("http://169.254.169.254/latest/meta-data/iam/security-credentials/<instance-profile>")
credentials = response.json()

The IP address (169.254.169.254) should not be modified. It is a link-local address and is only valid from within the instance.
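The metadata response is JSON. As a sketch of pulling out the three fields the later steps need, here the field names (AccessKeyId, SecretAccessKey, Token) are the ones the metadata service returns, while the values are placeholders standing in for a real response:

```python
import json

# Placeholder sample of the credentials JSON returned by the metadata
# endpoint; a real response also carries Code, LastUpdated, and Expiration.
sample = json.loads("""
{
  "AccessKeyId": "ASIAEXAMPLE",
  "SecretAccessKey": "secret-example",
  "Token": "token-example",
  "Expiration": "2022-05-16T00:00:00Z"
}
""")

# These three values are what you paste into the notebook or Spark config.
access_key = sample["AccessKeyId"]
secret = sample["SecretAccessKey"]
token = sample["Token"]
```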



You can only extract a session token from a standard cluster. This will not work on a high concurrency cluster.

Use session credentials in a notebook

You can use the session credentials by entering them into a notebook.


AccessKey = "<AccessKeyId>"
Secret = "<SecretAccessKey>"
Token = "<Token>"
sc._jsc.hadoopConfiguration().set("fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider")
sc._jsc.hadoopConfiguration().set("fs.s3a.access.key", AccessKey)
sc._jsc.hadoopConfiguration().set("fs.s3a.secret.key", Secret)
sc._jsc.hadoopConfiguration().set("fs.s3a.session.token", Token)

Once the session credentials are loaded in the notebook, you can access files in the S3 bucket with an S3A URI.
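For example, assuming a hypothetical bucket and prefix in place of the placeholders below, either a Spark read or a dbutils listing works once the credentials are set (spark and dbutils are predefined in Databricks notebooks):

```python
# <bucket> and <path> are placeholders for your own S3 bucket and prefix.
df = spark.read.parquet("s3a://<bucket>/<path>")
display(df)

# Or list the bucket contents directly:
dbutils.fs.ls("s3a://<bucket>/")
```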


Use session credentials in the cluster config

You can add the session credentials to the cluster Spark config. This makes them accessible to all notebooks on the cluster.

fs.s3a.aws.credentials.provider org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
fs.s3a.access.key <AccessKeyId>
fs.s3a.secret.key <SecretAccessKey>
fs.s3a.session.token <Token>