Table not available while creating AutoML experiment model

Migrate to Unity Catalog (UC), with the option to use AutoML with the Python API in the meantime.

Written by kaushal.vachhani

Last published at: September 12th, 2024

Problem 

Your Hive metastore tables are not visible when selecting the Input training dataset in AutoML via the user interface (UI). 

(The navigation path: Workspace → Experiments → Create AutoML Experiment → Experiment Configuration → Input training dataset)

Cause

Clusters with Hive metastore tables have a 2 MB limit. If the schema contains thousands of tables, the AutoML UI cannot load all the tables stored in the schema, so some of them are not shown.

Solution

Migrate to Unity Catalog (UC), which does not have the same 2 MB constraint.

Note

Generally, Databricks recommends using Unity Catalog instead of Hive metastore. Unity Catalog has enhanced metadata management and governance capabilities. For more information, please review the Upgrade Hive tables & views to Unity Catalog (AWS | Azure | GCP) documentation.

 

In the meantime, you can use AutoML with the Python API, which allows you to bypass the UI limitations and specify the desired tables directly in the code. 

For the detailed steps, please review the Train ML models with Databricks AutoML Python API (AWS | Azure | GCP) documentation.
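
As an illustration of the Python API workaround, the following is a minimal sketch rather than a definitive implementation. It assumes a classification problem and a cluster running Databricks Runtime ML, where the databricks.automl module is available. The helper names hive_table_name and run_automl_classification, the schema, table, and target column values, and the 30-minute timeout are hypothetical placeholders; replace them with your own.

```python
def hive_table_name(schema: str, table: str) -> str:
    """Build the <schema>.<table> name used for Hive metastore tables.

    Hypothetical helper for illustration only.
    """
    if not schema or not table:
        raise ValueError("Both schema and table must be non-empty.")
    return f"{schema}.{table}"


def run_automl_classification(table_name: str, target_col: str,
                              timeout_minutes: int = 30):
    """Start an AutoML classification experiment on a table referenced by name.

    Assumption: this runs on Databricks Runtime ML. The import is done inside
    the function because databricks.automl is not installable elsewhere.
    """
    from databricks import automl

    # Passing the table name directly means the table does not need to
    # appear in the AutoML UI table picker.
    return automl.classify(
        dataset=table_name,
        target_col=target_col,
        timeout_minutes=timeout_minutes,
    )
```

In a notebook, you might call run_automl_classification(hive_table_name("my_schema", "my_table"), target_col="label") and inspect the returned summary object. For regression or forecasting problems, the documentation linked above describes automl.regress and automl.forecast.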