Problem
You're running a job orchestrated by dbt-databricks that invokes a Python User-Defined Function (UDF). The job fails consistently, particularly when multiple UDFs are being created or replaced concurrently.
You receive the following error.
[ROUTINE_NOT_FOUND] The routine `<catalog>`.`<schema>`.`<UC-function>` cannot be found. Verify the spelling and correctness of the schema and catalog. If you did not qualify the name with a schema and catalog, verify the current_schema() output, or qualify the name with the correct schema and catalog. To tolerate the error on drop use DROP ... IF EXISTS. SQLSTATE: 42883
Cause
The CREATE OR REPLACE FUNCTION statements execute concurrently. These operations are not atomic: they involve an implicit drop of the existing function followed by the creation of the new one. Delta Lake does not provide ACID guarantees for concurrent DDL statements of this type.
When multiple CREATE OR REPLACE FUNCTION commands for the same function execute simultaneously, one execution drops the function while another is still referencing or re-creating it. This creates a transient state in which the function is neither available nor consistently visible to the query engine, leading to the ROUTINE_NOT_FOUND error.
This issue commonly arises in parallelized pipelines or orchestrators like dbt, where concurrent runs attempt to redefine the same function.
Solution
Avoid recreating UDFs frequently, implement retry logic, or use explicit DROP and CREATE statements for UDFs.
Avoid recreating UDFs frequently
Instead of redefining functions every run, define the UDFs once during environment setup and reuse them. This approach is both safer and more efficient.
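One way to script this setup step is to guard the definition with CREATE FUNCTION IF NOT EXISTS, which is a no-op when the routine already exists, so repeated runs never drop and re-create it. The sketch below is illustrative: the `ensure_udf` helper and the recording cursor are hypothetical stand-ins for a real databricks-sql-connector cursor, and the function name and body are examples.

```python
def ensure_udf(cursor, qualified_name, signature_and_body):
    # Create the function only if it is absent, so repeated pipeline
    # runs reuse the existing definition instead of replacing it.
    cursor.execute(
        f"CREATE FUNCTION IF NOT EXISTS {qualified_name}{signature_and_body}"
    )

class _RecordingCursor:
    # Stand-in for a databricks-sql-connector cursor, used here only
    # to show which statement would be sent to the warehouse.
    def __init__(self):
        self.statements = []

    def execute(self, statement):
        self.statements.append(statement)

cursor = _RecordingCursor()
ensure_udf(
    cursor,
    "main.default.clean_value",  # example fully qualified name
    "(v STRING) RETURNS STRING LANGUAGE PYTHON AS $$\n"
    "return v.strip() if v else None\n"
    "$$",
)
```

In a real pipeline, call `ensure_udf` once from a serialized setup task (for example, a dbt on-run-start hook) rather than from each concurrent model.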
Implement retry logic
If you can’t avoid concurrency, incorporate retry logic with exponential backoff into your UDF calls. This can help mitigate transient errors due to visibility issues in the system catalog. You can use the following example code, which sets the retries to 3 and the backoff factor to 0.5.
import time
from databricks import sql

def execute_query_with_retry(query, max_retries=3, backoff_factor=0.5):
    retry_count = 0
    while retry_count < max_retries:
        try:
            with sql.connect(
                server_hostname="<your-host>",
                http_path="<your-http-path>",
                access_token="<your-token>",
            ) as connection:
                with connection.cursor() as cursor:
                    cursor.execute(query)
                    return cursor.fetchall()
        except sql.exc.Error as e:
            if "ROUTINE_NOT_FOUND" in str(e):
                # Transient visibility error: back off exponentially, then retry
                retry_count += 1
                time.sleep(backoff_factor * (2 ** retry_count))
            else:
                raise e
    raise Exception(f"Failed after {max_retries} retries")
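With these defaults, the wait before each retry doubles: because retry_count is incremented before the sleep, the delays are 0.5 × 2¹ = 1 s, then 2 s, then 4 s. A small sketch of that schedule:

```python
def backoff_schedule(max_retries=3, backoff_factor=0.5):
    # Mirrors the sleep computation in execute_query_with_retry:
    # retry_count is incremented before each sleep, so n runs 1..max_retries.
    return [backoff_factor * (2 ** n) for n in range(1, max_retries + 1)]

print(backoff_schedule())  # delays in seconds: [1.0, 2.0, 4.0]
```

Tune max_retries and backoff_factor to your pipeline's tolerance; three attempts spanning roughly seven seconds is usually enough to outlast a concurrent function replacement.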
Use explicit DROP and CREATE for UDFs
Instead of relying on CREATE OR REPLACE FUNCTION, use an explicit DROP FUNCTION IF EXISTS followed by CREATE FUNCTION to avoid the transient states that can lead to visibility issues during concurrent execution. You can use the following example code. Be sure to provide your own Python logic.
DROP FUNCTION IF EXISTS <catalog>.<schema>.<function-name>;
CREATE FUNCTION <catalog>.<schema>.<function-name>(<input-variable> <input-type>)
RETURNS <output-type>
LANGUAGE PYTHON
DETERMINISTIC
AS $$
# Your Python logic here
def transform(value):
    # Example transformation
    return some_transformation(value) if value else None
return transform(<input-variable>)
$$;
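If you script this pattern, issue the two statements back to back from a single, non-concurrent setup step; running them from parallel jobs reintroduces the window in which the function does not exist. The sketch below is illustrative only: the `recreate_udf` helper and the recording cursor are hypothetical stand-ins for a real databricks-sql-connector cursor, and the function name and body are examples.

```python
def recreate_udf(cursor, qualified_name, signature_and_body):
    # Explicit two-step replacement. Run this from one serialized setup
    # task only: between the DROP and the CREATE the function does not
    # exist, so concurrent callers would still hit ROUTINE_NOT_FOUND.
    cursor.execute(f"DROP FUNCTION IF EXISTS {qualified_name}")
    cursor.execute(f"CREATE FUNCTION {qualified_name}{signature_and_body}")

class _RecordingCursor:
    # Stand-in for a databricks-sql-connector cursor, used here only
    # to show the order of the statements that would be sent.
    def __init__(self):
        self.statements = []

    def execute(self, statement):
        self.statements.append(statement)

cursor = _RecordingCursor()
recreate_udf(
    cursor,
    "main.default.clean_value",  # example fully qualified name
    "(v STRING) RETURNS STRING LANGUAGE PYTHON AS $$\n"
    "return v.strip() if v else None\n"
    "$$",
)
```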