Table creation fails with security exception
Problem
You attempt to create a table using a cluster that has Table ACLs enabled, but the following error occurs:
Error in SQL statement: SecurityException: User does not have permission SELECT on any file.
Cause
This error occurs on a Table ACL-enabled cluster if you are not an administrator and you do not have sufficient privileges to create a table.
For example, in your notebook you attempt to create a table using a Parquet data source located on S3:
CREATE TABLE mytable
USING PARQUET
OPTIONS (PATH='s3://my-root-bucket/subfolder/my-table')
Solution
You should ask your administrator to grant you access to the blob storage filesystem, using either of the following options. If an administrator cannot grant you access to the data object, you’ll have to ask an administrator to make the table for you.
If you want to use a
CTAS (CREATE TABLE AS SELECT)
statement to create the table, the administrator should grant youSELECT
privileges on the filesystem:GRANT SELECT ON ANY FILE TO `user1`
Example CTAS statement:
CREATE TABLE mytable AS SELECT * FROM parquet.`s3://my-root-bucket/subfolder/my-table`
If you want to use a
CTOP (CREATE TABLE OPTIONS PATH)
statement to make the table, the administrator must elevate your privileges by grantingMODIFY
in addition toSELECT
.GRANT SELECT, MODIFY ON ANY FILE TO `user1`
Example
CTOP
statement:CREATE TABLE mytable USING PARQUET OPTIONS (PATH='s3://my-root-bucket/subfolder/my-table')
Important
It is important to understand the security implications of granting ANY FILE
permissions on a filesystem. You should only grant ANY FILE
to privileged users. Users with lower privileges on the cluster should never access data by referencing an actual storage location. Instead, they should access data from tables that are created by privileged users, thus ensuring that Table ACLS are enforced.
In addition, if files in the Databricks root and data buckets are accessible by the cluster and users have MODIFY
privileges, the admin should lock down the root S3 bucket.
Granting the data access privileges described above does not supersede any underlying IAM roles or S3 bucket policies. For example, if a grant statement like GRANT SELECT, MODIFY ON ANY FILE TO
user1` is executed but an IAM role attached to the cluster explicitly denies reads to the target S3 bucket, then the
GRANT` statement will not make the bucket or the objects within the bucket suddenly readable.