How to create table DDLs to import into an external metastore

Learn how to export all table metadata from the Hive metastore in Databricks to an external metastore.

Written by Adam Pavlacka

Last published at: May 16th, 2022

Databricks supports using external metastores instead of the default Hive metastore.

You can export all table metadata from Hive to the external metastore.

  1. Use the Apache Spark Catalog API to list the tables in the databases contained in the metastore.
  2. Use the SHOW CREATE TABLE statement to generate the DDLs and store them in a file.
  3. Use the file to import the table DDLs into the external metastore.

The following code accomplishes the first two steps.

%python

# List every database in the metastore, then write the CREATE TABLE
# statement for each of its tables to a per-database .ddl file.
dbs = spark.catalog.listDatabases()
for db in dbs:
  with open("your_file_name_{}.ddl".format(db.name), "w") as f:
    tables = spark.catalog.listTables(db.name)
    for t in tables:
      DDL = spark.sql("SHOW CREATE TABLE {}.{}".format(db.name, t.name))
      f.write(DDL.first()[0])
      f.write("\n")

You can use the resulting files (one per database) to import the table DDLs into the external metastore.
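
The import itself can be as simple as replaying the generated statements on a cluster that is already configured to use the external metastore. The following is a minimal sketch, not an official import procedure: it assumes each statement in the .ddl files is terminated with a semicolon (you would need to append a ";" separator in the export loop above), and it reuses the your_file_name_*.ddl naming convention from the export code.

%python

# Minimal sketch: run on a cluster configured to use the external metastore.
# Assumes each DDL statement in the files ends with a semicolon.
import glob

for path in glob.glob("your_file_name_*.ddl"):
  with open(path) as ddl_file:
    # Split the file into individual statements and drop empty fragments.
    statements = [s.strip() for s in ddl_file.read().split(";") if s.strip()]
  for stmt in statements:
    # Re-create each table definition in the external metastore.
    spark.sql(stmt)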