How to create an init script to collect tcp_dumps

Edit the cluster configuration to attach the tcp dump init script and collect network packet traces.

How to collect tcp_dumps

Create a path and the init script, add the init script to an allowlist, configure the init script, then locate and download the pcap files. 

 

Create a volume or workspace path and the tcp_dumps init script

  1. Create a volume or workspace folder and note the path where you want to store the init script.
  2. Run the following code in a notebook to create the tcp_dumps init script. The init script uses curl to make DBFS PUT API calls that upload the pcap files to your workspace. Replace the token and workspace host placeholders in the script; the token must be an existing, valid personal access token (PAT) with permission to use the DBFS PUT API.

If you want to filter the tcp_dumps by host and port, uncomment the TCPDUMP_FILTER line in the script and add the required host and port (see the example after the script). Alternatively, you can pass the host and port separately, depending on your requirements.

 

The following code references a volume path. For workspace paths, change Volumes to Workspace.

dbutils.fs.put("/Volumes/<path-to-init-script>/tcp_dumps.sh", """
#!/bin/bash

set -euxo pipefail

MYIP=${HOSTNAME}
TMP_DIR="/local_disk0/tmp/tcpdump"

[[ ! -d ${TMP_DIR} ]] && mkdir -p ${TMP_DIR}
# Rotate the capture file every 900 seconds (-G 900), keep at most 1000 files (-W 1000),
# keep the capture running as root (-Z root), write packets as they arrive (-U), capture 256 bytes per packet (-s256).
TCPDUMP_WRITER="-w ${TMP_DIR}/trace_%Y%m%d_%H%M%S_${DB_CLUSTER_ID}_${MYIP}.pcap -W 1000 -G 900 -Z root -U -s256"
TCPDUMP_PARAMS="-nvv -K"   # no name resolution, very verbose, skip checksum verification
#TCPDUMP_FILTER="host xxxxxxxxx.dfs.core.windows.net and port 443" ## add a host/port filter here based on your requirement

# ${TCPDUMP_FILTER:-} expands to an empty string when the filter line is left commented out, which keeps set -u from failing.
sudo tcpdump ${TCPDUMP_WRITER} ${TCPDUMP_PARAMS} ${TCPDUMP_FILTER:-} &
echo "Started tcpdump ${TCPDUMP_WRITER} ${TCPDUMP_PARAMS} ${TCPDUMP_FILTER:-}"

cat > /tmp/copy_stats.sh << 'EOF'
#!/bin/bash

TMP_DIR=$1
DB_CLUSTER_ID=$2
COPY_INTERVAL_IN_SEC=45
MYIP=${HOSTNAME}
echo "Starting copy script at `date`"

# Optional: copy the pcap files to a volume instead of uploading to DBFS (uncomment the cp line in the loop below).
DEST_DIR="/Volumes/main/default/jar/"
#mkdir -p ${DEST_DIR}

log_file="/tmp/copy_stats.log"
touch $log_file

declare -gA file_sizes  # persists the last observed size of each rotated capture file

## Copy a file only when its size has changed since the last check.

while true; do
 sleep ${COPY_INTERVAL_IN_SEC}
 #ls -ltr ${DEST_DIR} >  $log_file
 for file in $(find "$TMP_DIR" -type f -mmin -3 ); do   # only files modified in the last 3 minutes
   current_size=$(stat -c "%s" "$file")
   file_name=$(basename "$file")
   last_size=${file_sizes["$file_name"]}
   if [ "$current_size" != "$last_size" ]; then
       echo "Copying $file with current size: $current_size and last size: $last_size at `date`" | tee -a $log_file
       DBFS_PATH="dbfs:/FileStore/tcpdumpfolder/${DB_CLUSTER_ID}/trace_$(date +"%Y-%m-%d--%H-%M-%S")_${DB_CLUSTER_ID}_${MYIP}.pcap"
       curl -vvv -F contents=@$file -F path="$DBFS_PATH" -H "Authorization: Bearer <your-dapi-token>" https://<your-databricks-workspace-url>/api/2.0/dbfs/put  2>&1 | tee -a $log_file
       #cp --verbose "$file" "$DEST_DIR" | tee -a $log_file
       echo "done Copying $file with current size: $current_size at `date`" | tee -a $log_file

       file_sizes[$file_name]=$current_size
   else
       echo "Skip Copying $file with current size: $current_size and last size: $last_size at `date`" | tee -a $log_file
   fi
 done
 done

EOF

chmod a+x /tmp/copy_stats.sh
/tmp/copy_stats.sh $TMP_DIR $DB_CLUSTER_ID & disown              
""", True)

 

Note the path to the init script. You will need it when configuring your cluster.

 

Add the init script to the allowlist

Follow the instructions to add the init script to the allowlist in the Allowlist libraries and init scripts on compute with standard access mode (formerly shared access mode) (AWS, Azure, GCP) documentation.
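If you prefer to script this step, the following curl call is a minimal sketch that adds the volume prefix to the init script allowlist through the Unity Catalog artifact allowlists REST API. The endpoint, payload shape, and PREFIX_MATCH value are assumptions based on the public REST API reference, and the token and workspace URL are the same placeholders used in the init script; verify the details against the documentation for your cloud.

# Sketch only: allow any init script stored under the given volume prefix.
curl -X PUT "https://<your-databricks-workspace-url>/api/2.1/unity-catalog/artifact-allowlists/INIT_SCRIPT" \
  -H "Authorization: Bearer <your-dapi-token>" \
  -H "Content-Type: application/json" \
  -d '{"artifact_matchers": [{"artifact": "/Volumes/<path-to-init-script>", "match_type": "PREFIX_MATCH"}]}'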

 

Configure the init script

  1. Follow the instructions to configure a cluster-scoped init script in the Cluster-scoped init scripts (AWS, Azure, GCP) documentation. (For a scripted alternative, see the sketch after this list.)
  2. Specify the path to the init script. Use the same path that you used in the preceding script. (/Volumes/<path-to-init-script>/tcp_dumps.sh or /Workspace/<path-to-init-script>/tcp_dumps.sh)
  3. After configuring the init script, restart the cluster.
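
As a scripted alternative to the UI, the following curl call is a minimal sketch of attaching the init script with the Clusters API. The cluster ID, name, Spark version, node type, and worker count are placeholders; clusters/edit replaces the entire cluster specification, so include all of your cluster's existing settings, not just init_scripts.

# Sketch only: attach the init script by editing the cluster specification.
curl -X POST "https://<your-databricks-workspace-url>/api/2.0/clusters/edit" \
  -H "Authorization: Bearer <your-dapi-token>" \
  -H "Content-Type: application/json" \
  -d '{
        "cluster_id": "<your-cluster-id>",
        "cluster_name": "<your-cluster-name>",
        "spark_version": "<your-spark-version>",
        "node_type_id": "<your-node-type>",
        "num_workers": 2,
        "init_scripts": [
          {"volumes": {"destination": "/Volumes/<path-to-init-script>/tcp_dumps.sh"}}
        ]
      }'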

 

Locate the pcap files

Once the cluster has started, the init script automatically begins creating pcap files containing the recorded network traffic and uploading them. Locate the pcap files in the folder dbfs:/FileStore/tcpdumpfolder/${DB_CLUSTER_ID}.
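
To confirm that the capture files are arriving, you can list the folder with the DBFS API, reusing the same token and workspace placeholders as the init script. Replace <cluster-id> with the actual cluster ID.

# List the uploaded pcap files for a given cluster.
curl -X GET "https://<your-databricks-workspace-url>/api/2.0/dbfs/list?path=/FileStore/tcpdumpfolder/<cluster-id>" \
  -H "Authorization: Bearer <your-dapi-token>"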

 

Download the pcap files

Download the pcap files from the DBFS path to your local host for analysis. There are multiple ways to download files to your local machine; one option is the Databricks CLI. For more information, review the What is the Databricks CLI? (AWS, Azure, GCP) documentation.
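
For example, with the Databricks CLI installed and authenticated to your workspace, the following command is one way to copy the capture folder to a local directory for analysis in a tool such as Wireshark. The local destination path is a placeholder.

# Copy all pcap files for the cluster from DBFS to a local folder.
databricks fs cp --recursive dbfs:/FileStore/tcpdumpfolder/<cluster-id>/ ./tcpdumps/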