Unexpected increase in S3 billing costs

Enable S3 debug logs to capture API calls made to the S3 bucket.

Written by ajay.ap

Last published at: March 3rd, 2025

Problem

You see an unexpected increase in S3 costs. How can you track down the root cause of the issue?

 

Cause

An increase in S3 costs is usually due to an increase in the rate of API calls made to your S3 bucket.

 

Solution

Check the AWS CloudTrail logs to identify the specific API call that is responsible for increased usage. 
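One way to make this check concrete is to tally events by name across CloudTrail log files. The sketch below assumes the trail records S3 data events (object-level calls are not logged unless data events are enabled) and that the gzip-compressed JSON log files have already been downloaded to a local directory. The function name and directory are hypothetical, and the script matches the eventName field with plain text tools instead of a JSON parser, so treat it as a rough first pass.

```shell
#!/bin/bash
# Count API calls by eventName across downloaded CloudTrail log files.
# Assumption: log files are gzip-compressed JSON named *.json.gz.
count_s3_events() {
  local log_dir="$1"
  find "$log_dir" -type f -name '*.json.gz' -exec gzip -dc {} + \
    | grep -o '"eventName"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | cut -d'"' -f4 \
    | sort | uniq -c | sort -rn
}
```

Calling count_s3_events with the download directory prints one line per event name, most frequent first. The count covers every event in the files, so it is most useful when the trail is scoped to the bucket under investigation.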

To determine which action led to the increase in API calls, enable S3 debug logging at the cluster level. You can do this by attaching a cluster-scoped init script to the clusters that run your long-running jobs.

S3 debug log init script 

When enabled, this script sets the log level of the com.amazonaws logger to DEBUG in the driver and executor log4j2.xml files, which captures all the calls made to S3. These logs can help you understand what caused the sudden increase in the call rate.

Info

This generates extensive logging and may affect job performance. You should only use this init script for diagnostic and testing purposes. 
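Before attaching the init script to a cluster, the edit it performs can be rehearsed on a scratch file. The sketch below applies the same sed technique (read in a snippet, re-append the closing tag, delete the original line) to whatever file path it is given. The function name is hypothetical, and it assumes GNU sed, which is what the -i flag in the init script already relies on.

```shell
#!/bin/bash
# Insert the com.amazonaws DEBUG logger before </Loggers> in a scratch file.
# Hypothetical helper for a dry run; it is not part of the init script itself.
add_aws_debug_logger() {
  local config_file="$1"
  local snippet
  snippet="$(mktemp)"
  echo '  <Logger name="com.amazonaws" level="DEBUG"/>' > "$snippet"
  # On the </Loggers> line: queue the snippet, queue a new closing tag,
  # then delete the original line so the tag appears only once.
  sed -i '/<\/Loggers>/{
    r '"$snippet"'
    a \</Loggers>
    d
  }' "$config_file"
  rm -f "$snippet"
}
```

Running it against a copy of a log4j2.xml file and inspecting the result shows the new Logger element sitting directly above the closing Loggers tag.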

 

 

#!/bin/bash

# DB_IS_DRIVER is "TRUE" on the driver node; pick the matching log4j2 config.
if [[ "$DB_IS_DRIVER" = "TRUE" ]]; then
  LOG4J_PATH="/home/ubuntu/databricks/spark/dbconf/log4j/driver/log4j2.xml"
else
  LOG4J_PATH="/home/ubuntu/databricks/spark/dbconf/log4j/executor/log4j2.xml"
fi

# Logger element that raises the AWS SDK log level to DEBUG.
cat << 'EOF' > /tmp/logger_config.xml
  <Logger name="com.amazonaws" level="DEBUG"/>
EOF

echo "Adjusting log4j2.xml here: $LOG4J_PATH"

# Replace the closing </Loggers> tag with the new Logger element followed by </Loggers>.
sed -i '/<\/Loggers>/{
   r /tmp/logger_config.xml
   a \</Loggers>
   d
}' "$LOG4J_PATH"
echo "Completed log4j2 config changes at $(date)"
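Once a job has run with DEBUG logging enabled, the captured request lines can be tallied to see which calls dominate. The sketch below assumes the log contains lines of the form "Sending Request: METHOD endpoint ...", which is how the AWS SDK for Java 1.x logs outgoing requests at DEBUG level; the exact format may differ by runtime version, so check a few lines of your own log first. The function name and log file path are hypothetical.

```shell
#!/bin/bash
# Tally outgoing AWS requests by HTTP method and endpoint in a downloaded log file.
# Assumption: request lines contain "Sending Request: <METHOD> <endpoint> ...".
summarize_s3_requests() {
  local log_file="$1"
  grep -o 'Sending Request: [A-Z]* [^ ]*' "$log_file" \
    | awk '{print $3, $4}' \
    | sort | uniq -c | sort -rn
}
```

Calling summarize_s3_requests with the path to a driver or executor log prints a count per method and endpoint, most frequent first. A sudden dominance of one method against one bucket is a reasonable place to start looking.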

 

For more information, review the Logging Amazon S3 API calls using AWS CloudTrail documentation.