Listing Table Names

This article explains why spark.catalog.listTables() and %sql show tables have different performance characteristics.

Problem

To fetch all the table names from the metastore, you can use either spark.catalog.listTables() or %sql show tables. If you time the two calls, spark.catalog.listTables() usually takes longer than %sql show tables.
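One way to observe the difference is to time both calls from a Python notebook cell. The following is a minimal sketch, not a benchmark: it assumes `spark` is an active SparkSession (as it is in a notebook), and the helper names `time_call` and `compare_listing` and the `default` database are hypothetical choices made for illustration.

```python
import time


def time_call(label, fn):
    """Run fn() once, print the elapsed wall-clock time, and return (result, seconds)."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f}s")
    return result, elapsed


def compare_listing(spark, database="default"):
    """Time both ways of listing tables in `database` (assumed to exist)."""
    # Catalog API: returns a list of table objects.
    _, catalog_secs = time_call(
        "spark.catalog.listTables",
        lambda: spark.catalog.listTables(database),
    )
    # SQL command: collect() forces the command to run and returns the rows.
    _, show_secs = time_call(
        "show tables",
        lambda: spark.sql(f"SHOW TABLES IN {database}").collect(),
    )
    return catalog_secs, show_secs
```

In a notebook, calling `compare_listing(spark, "default")` prints one timing per approach. A single run can be skewed by caching, so repeating the call a few times is likely to give a more reliable picture.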

Cause

spark.catalog.listTables() fetches the metadata for every table first and only then returns the requested table names. This process is slow when dealing with complex schemas and large numbers of tables.

Solution

To get only the table names, use %sql show tables. Internally, it invokes SessionCatalog.listTables, which fetches only the table names.
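If the names are needed in Python rather than in a %sql cell, the same command can be issued through spark.sql. This is a minimal sketch under a few assumptions: `spark` is an active SparkSession, `table_names` is a hypothetical helper name, and `database` is a trusted identifier, because it is interpolated directly into the SQL text.

```python
def table_names(spark, database="default"):
    """Return the table names in `database` using SHOW TABLES."""
    rows = spark.sql(f"SHOW TABLES IN {database}").collect()
    # Each result row exposes the table name in the `tableName` column.
    return [row["tableName"] for row in rows]
```

For example, `table_names(spark, "default")` returns a plain Python list of strings that can be filtered or iterated over without touching each table's full metadata through the catalog API.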