Written by mathan.pillai

Last published at: May 23rd, 2022

This article explains how to find the size of a table.

The command to use depends on whether you are trying to find the size of a Delta table or a non-Delta table.

Size of a Delta table

To find the size of a Delta table, you can use the following Apache Spark Scala code, which reads the size from the table's current snapshot.


import com.databricks.sql.transaction.tahoe._

// Load the transaction log for the Delta table stored at the given path.
val deltaLog = DeltaLog.forTable(spark, "dbfs:/<path-to-delta-table>")
val snapshot = deltaLog.snapshot               // the current Delta table snapshot
println(s"Total file size (bytes): ${snapshot.sizeInBytes}")

Size of a non-Delta table

You can determine the size of a non-Delta table by summing the sizes of the individual files within the table's underlying directory.
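One way to compute that sum is with the Hadoop FileSystem API, which is available on a Spark cluster. The following is a minimal sketch rather than the article's own method: the helper name directorySizeInBytes and the path placeholder are hypothetical, and the total includes every file under the directory, including any metadata or marker files stored alongside the data.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Hypothetical helper (not from the original article): returns the total size,
// in bytes, of all files under `dir`, including files in subdirectories.
def directorySizeInBytes(dir: String, conf: Configuration): Long = {
  val path = new Path(dir)
  // Resolve the file system (DBFS, S3, local, ...) from the path's scheme.
  val fs = path.getFileSystem(conf)
  // getContentSummary walks the directory tree; getLength is the summed file size.
  fs.getContentSummary(path).getLength
}

// Example call in a notebook, where `spark` is predefined and the path is a placeholder:
// println(directorySizeInBytes("dbfs:/<path-to-non-delta-table>", spark.sparkContext.hadoopConfiguration))
```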

You can also use queryExecution.analyzed.stats on a DataFrame that reads the table to return the size that Spark estimates for it.
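As a short sketch of that call, the helper below reads a table by name and returns the sizeInBytes field of the analyzed plan's statistics. The helper name estimatedSizeInBytes and the table-name placeholder are assumptions made for illustration, and the value is Spark's estimate, so it may differ from the exact on-disk total.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper (not from the original article): returns Spark's size
// estimate, in bytes, for the analyzed logical plan of the named table.
def estimatedSizeInBytes(spark: SparkSession, tableName: String): BigInt =
  spark.read.table(tableName).queryExecution.analyzed.stats.sizeInBytes

// Example call in a notebook, where `spark` is predefined and the name is a placeholder:
// println(estimatedSizeInBytes(spark, "<non-delta-table-name>"))
```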


