Error while establishing user sessions to Oracle database through an external Hive metastore

Increase the session limit on your Oracle database or increase your pool size using an Apache Spark configuration.

Written by alberto.umana

Last published at: December 5th, 2024

Problem

When establishing a connection to an Oracle database through an external Hive metastore, you receive the error message "exceeded simultaneous SESSIONS_PER_USER limit".

You may also notice a high number of Databricks compute resources establishing sessions to the external Hive metastore, with connections persisting even when the cluster scales down to a minimal configuration, or remaining active until the cluster is terminated.

 

Cause

You have more simultaneous sessions per user than the Oracle database allows. 

 

Solution

Increase the SESSIONS_PER_USER limit on your Oracle database, which is the action Oracle recommends. To increase the per-user session limit, execute the following query.

 

ALTER PROFILE your_profile_name LIMIT SESSIONS_PER_USER new_limit;
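Before raising the limit, you may want to confirm which profile your metastore user is assigned to and what its current limit is. The following queries are a sketch against Oracle's standard data dictionary views; the username HIVE_METASTORE_USER is an example placeholder, so substitute the account your metastore actually connects with.

```sql
-- Find the profile assigned to the metastore user (username is an example)
SELECT username, profile
FROM dba_users
WHERE username = 'HIVE_METASTORE_USER';

-- Check the current SESSIONS_PER_USER limit for each profile
SELECT profile, limit
FROM dba_profiles
WHERE resource_name = 'SESSIONS_PER_USER';
```

Use the profile name returned by the first query as your_profile_name in the ALTER PROFILE statement above.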

 

If increasing the session limit on your Oracle database directly is not possible, use an Apache Spark cluster configuration to set your pool size. 

 

spark.databricks.hive.metastore.client.pool.size {max_number_sessions}
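For example, to cap the metastore client pool at five connections per cluster, add the setting to your cluster's Spark configuration. The value 5 here is illustrative; choose a size consistent with your Oracle SESSIONS_PER_USER limit and the number of clusters connecting to the metastore.

```
spark.databricks.hive.metastore.client.pool.size 5
```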

 

Important

Limiting the connection pool size may impact performance, especially in multithreaded situations.


Note

When possible, Databricks recommends using Unity Catalog as the primary metadata store for your workloads. This can help reduce the dependency on the external Hive metastore and minimize the number of connections required. For more information, review the What is Unity Catalog? (AWS | Azure | GCP) documentation.

 

 

Additional preventative measures

  • If you are using Databricks Runtime versions below 10.4 LTS and have not set the Spark configuration spark.databricks.hive.metastore.client.pool.type, you are likely using BoneCP. Consider using HikariCP for connection pooling instead. To set up HikariCP, use the Spark setting spark.databricks.hive.metastore.client.pool.type HikariCP. HikariCP is the default option as of Databricks Runtime 10.4 LTS. For more information, review the Databricks Runtime 10.4 LTS (AWS | Azure | GCP) documentation.
  • Regularly monitor the number of connections to the external Hive metastore and terminate idle connections. This can help reduce resource contention and improve the overall performance of the system.
  • Implement a cluster management strategy that terminates clusters when they are no longer in use. This can help reduce the number of active connections to the external Hive metastore and free up resources for other workloads.
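To monitor connections on the Oracle side, a query such as the following can help. This is a sketch using Oracle's standard V$SESSION view; it counts how many sessions each database user currently holds, which you can compare against your SESSIONS_PER_USER limit to spot users approaching it.

```sql
-- Count current sessions per user; compare against the SESSIONS_PER_USER limit
SELECT username, COUNT(*) AS session_count
FROM v$session
WHERE username IS NOT NULL
GROUP BY username
ORDER BY session_count DESC;
```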