Problem
When using Lakehouse Federation to query a dataset in BigQuery, you want to retrieve metadata from BigQuery INFORMATION_SCHEMA. When you try appending INFORMATION_SCHEMA to the namespace while querying, you encounter the following error.
<catalog-name>.<schema-name>.INFORMATION_SCHEMA.PARTITIONS is not a valid identifier as it has more than 3 name parts.
Where: <catalog-name>
refers to the BigQuery catalog in Databricks.<schema-name>
refers to the dataset in BigQuery.
INFORMATION_SCHEMA is a built-in schema in BigQuery that provides a metadata view.
Cause
Unity Catalog requires queries to be structured with three namespace format (catalog.schema.table
). BigQuery requires its INFORMATION_SCHEMA queries to be structured with .information_schema.partitions
. The BigQuery structure appears as a violation of the three namespace format and triggers the namespace error.
Solution
Create a view in BigQuery that extracts the required metadata from BigQuery INFORMATION_SCHEMA, then query that view from Lakehouse Federation.
- Run the following SQL command in BigQuery to create a view that extracts metadata from INFORMATION_SCHEMA.
CREATE OR REPLACE VIEW <schema-name>.metadata_partitions AS
SELECT * FROM `<schema-name>`.INFORMATION_SCHEMA.PARTITIONS
WHERE table_name = '<table-name>';
- Once the view has been created in BigQuery, you can query it from Lakehouse Federation using the following SQL statement.
SELECT *
FROM <catalog-name>.<schema-name>.metadata_partitions;
The metadata_partitions
view is a standard BigQuery table view. As written, it conforms to the expected three-part namespace format. This allows the view to be queried successfully from Databricks.