How to Create Table DDLs to Import into an External Metastore

Databricks supports using external metastores instead of the default Hive metastore. You can export all table metadata from Hive to the external metastore as follows.

  1. Use the Apache Spark Catalog API to list the tables in the databases contained in the metastore.
  2. Use the SHOW CREATE TABLE statement to generate the DDLs and store them in a file.
  3. Use the file to import the table DDLs into the external metastore.

The following code accomplishes the first two steps.

dbs = spark.catalog.listDatabases()
for db in dbs:
  # listTables and the file name both take the database name string,
  # not the Database object returned by listDatabases
  tables = spark.catalog.listTables(db.name)
  with open("your_file_name_{}.ddl".format(db.name), "w") as f:
    for t in tables:
      # SHOW CREATE TABLE returns a one-row DataFrame; extract the DDL string
      DDL = spark.sql("SHOW CREATE TABLE {}.{}".format(t.database, t.name))
      f.write(DDL.first()[0])
      f.write(";\n")  # terminate each statement so the file can be split on import

Then, you can use the resulting files (one per database) to import the table DDLs into the external metastore.
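The import step can be sketched as follows. This is a minimal sketch, not part of the original article: it assumes the cluster you run it on is already configured to use the external metastore, and that each statement in the `.ddl` file is terminated by a semicolon, as in the export code above. The function names `split_ddl_statements` and `import_ddls` are illustrative, not a Databricks API.

```python
# Sketch of step 3: replay the exported DDL statements through spark.sql()
# on a cluster configured to use the external metastore.
# Assumes statements in the .ddl file are separated by semicolons.

def split_ddl_statements(text):
    """Split a DDL dump into individual non-empty statements."""
    return [s.strip() for s in text.split(";") if s.strip()]

def import_ddls(spark, path):
    """Run every DDL statement in the file against the current metastore."""
    with open(path) as f:
        for stmt in split_ddl_statements(f.read()):
            spark.sql(stmt)

# Example (hypothetical file name from the export step):
# import_ddls(spark, "your_file_name_default.ddl")
```

Note that splitting on semicolons is only safe if no DDL statement contains a literal semicolon (for example, in a table comment); for such cases you would need a more robust statement delimiter.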