Problem
You are trying to use dbutils inside a user-defined function (UDF) to retrieve credentials from a Databricks secret scope.
You receive a "permission denied" or "execution of function failed" error indicating a permissions issue, even though you have access to the secrets.
```
PermissionError: [Errno 13] Permission denied
```

Or

```
[UDF_USER_CODE_ERROR.GENERIC] Execution of function <catalog>.<schema>.dbutils_test() failed.

== Error ==
ValueError: default auth: cannot configure default credentials, please check https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication to configure credentials for your preferred authentication method.
```
When you review the stack trace, you see the following detail.
```
== Stacktrace ==
  File "<file>", line 3, in main
    from databricks.sdk.runtime import spark
  File "/databricks/python3/lib/python3.12/site-packages/databricks/sdk/runtime/__init__.py", line 172, in <module>
    dbutils = RemoteDbUtils()
              ^^^^^^^^^^^^^^^
  File "/databricks/python3/lib/python3.12/site-packages/databricks/sdk/dbutils.py", line 194, in __init__
    self._config = Config() if not config else config
                   ^^^^^^^^
  File "/databricks/python3/lib/python3.12/site-packages/databricks/sdk/config.py", line 127, in __init__
    raise ValueError(message) from e

SQLSTATE: 39000
```
You notice that the UDF works if you use hard-coded credentials.
Cause
You cannot use dbutils inside a UDF because the UDF runs on an Apache Spark worker node, where dbutils is not available. Attempting to use dbutils inside a UDF causes the code to fail to execute, resulting in a permission denied or credential configuration error.
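For illustration, the following minimal sketch reproduces the failing pattern with a session-scoped PySpark UDF; the scope and key names are placeholders. The dbutils initialization happens on a worker, which cannot configure Databricks client credentials, so the call fails.

```python
# Anti-pattern sketch: initializing dbutils inside UDF code that runs on a worker.
# The scope and key names below are placeholders.
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

@udf(returnType=StringType())
def read_secret_in_udf(dummy):
    # This import initializes dbutils on the worker, where no Databricks
    # client credentials are configured, so it raises the "default auth" error.
    from databricks.sdk.runtime import dbutils
    return dbutils.secrets.get(scope="scope-database", key="database-password")

# Fails with a permission or credential-configuration error when the UDF executes.
spark.range(1).select(read_secret_in_udf("id")).display()
```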
Solution
Fetch secrets on the driver before invoking the UDF, and pass them as function arguments. This pattern ensures the UDF receives credentials securely without directly calling dbutils inside worker-executed code.
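The walkthrough below applies this pattern to a Unity Catalog SQL UDF with arguments. The same approach works for a session-scoped PySpark UDF, as in this minimal sketch; the scope, key, host, and database names are placeholders.

```python
# Sketch: read secrets on the driver, then pass them into the UDF as plain
# string values. Scope, key, host, and database names are placeholders.
from pyspark.sql.functions import udf, lit
from pyspark.sql.types import StringType

db_user = dbutils.secrets.get(scope="scope-database", key="database-username")
db_pass = dbutils.secrets.get(scope="scope-database", key="database-password")

@udf(returnType=StringType())
def build_connection_uri(user, password):
    # Runs on the worker, but only receives the plain string values it was given.
    return f"mongodb+srv://{user}:{password}@<your-host>/<your-database>"

df = spark.range(1).select(build_connection_uri(lit(db_user), lit(db_pass)).alias("uri"))
```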
Retrieve database credentials outside the UDF on the driver
- Create a secret scope. Follow the Tutorial: Create and use a Databricks secret (AWS | Azure | GCP) documentation.
- Run the following code in a notebook to retrieve the database-related secrets saved in your Databricks secret scope.
```
%python
database_username = dbutils.secrets.get(scope="scope-database", key="database-username")
database_password = dbutils.secrets.get(scope="scope-database", key="database-password")
database_name = dbutils.secrets.get(scope="scope-database", key="database-name")
database_host = dbutils.secrets.get(scope="scope-database", key="database-host")
```
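To confirm the lookups succeeded, you can print one of the variables; Databricks redacts secret values in notebook output, so the value appears as [REDACTED] rather than in plain text.

```python
# Secret values are redacted in notebook output, so this prints [REDACTED]
# instead of the actual username.
print(database_username)
```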
Create a sample SQL UDF with arguments
Note
This step will look different depending on your database connector library. This article uses PyMongo as an example. For other common database Python connector libraries, refer to the following documentation.
- Oracle: Connecting to Oracle Database
- Redshift: Examples of using the Amazon Redshift Python connector
- PostgreSQL: Basic module usage
- Ensure your database Python connector package is installed on the cluster. If it isn't, run %pip install <database-connector> (for example, %pip install pymongo).
- Create the SQL UDF using the database credentials you created and fetched in the previous section.
```
%sql
CREATE OR REPLACE FUNCTION <your-catalog>.<your-schema>.<your-test-udf>(
  database_username STRING,
  database_password STRING,
  database_name STRING,
  database_host STRING
)
RETURNS STRING
LANGUAGE PYTHON
AS $$
import pymongo

def main(database_username, database_password, database_name, database_host):
    # Connect to the database
    uri = f"mongodb+srv://{database_username}:{database_password}@{database_host}/{database_name}?retryWrites=true&w=majority"
    client = pymongo.MongoClient(uri)
    # Access the collection and return a value from the first document
    doc = client[database_name]["your_collection"].find_one({}, {"your_field": 1, "_id": 0})
    return doc.get("your_field") if doc else None

# Call main and return its result so the UDF produces a value
return main(database_username, database_password, database_name, database_host)
$$;
```
Use the UDF in a SQL query
Once defined, you can call the UDF from any SQL query, passing arguments dynamically using Python or manually in SQL.
```
%python
sql_query = f"""
SELECT
  <your-catalog>.<your-schema>.<your-test-udf>(
    '{database_username}',
    '{database_password}',
    '{database_name}',
    '{database_host}'
  ) AS value
"""

df = spark.sql(sql_query)
df.display()
```
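Alternatively, if your cluster runs Apache Spark 3.4 or above (Databricks Runtime 13.3 LTS and above), you can pass the credentials as named parameter markers so they are not interpolated into the SQL string. The parameter names in this sketch are illustrative.

```python
# Alternative: pass credentials as named parameters (Spark 3.4+) instead of
# interpolating them into the SQL text. Parameter names are illustrative.
df = spark.sql(
    """
    SELECT <your-catalog>.<your-schema>.<your-test-udf>(
      :username, :password, :db_name, :db_host
    ) AS value
    """,
    args={
        "username": database_username,
        "password": database_password,
        "db_name": database_name,
        "db_host": database_host,
    },
)
df.display()
```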
For more information on SQL UDFs, refer to the CREATE FUNCTION (SQL and Python) (AWS | Azure | GCP) documentation.