Problem: Table Creation Fails with Security Exception

Problem

You attempt to create a table using a cluster that has Table ACLs enabled, but the following error occurs:

Error in SQL statement: SecurityException: User does not have permission SELECT on any file.

Cause

This error occurs on a Table ACL-enabled cluster if you are not an administrator and you do not have sufficient privileges to create a table.

For example, in your notebook you attempt to create a table using a Parquet data source located on S3:

CREATE TABLE mytable
  USING PARQUET
  OPTIONS (PATH='s3://my-root-bucket/subfolder/my-table')

Solution

You should ask your administrator to grant you access to the blob storage filesystem, using either of the following options. If an administrator cannot grant you access to the data object, you’ll have to ask an administrator to make the table for you.

  • If you want to use a CTAS (CREATE TABLE AS SELECT) statement to create the table, the administrator should grant you SELECT privileges on the filesystem:

    GRANT SELECT ON ANY FILE TO `user1`
    

    Example CTAS statement:

    CREATE TABLE mytable
          AS SELECT * FROM parquet.`s3://my-root-bucket/subfolder/my-table`
    
  • If you want to use a CTOP (CREATE TABLE OPTIONS PATH) statement to make the table, the administrator must elevate your privileges by granting MODIFY in addition to SELECT.

    GRANT SELECT, MODIFY ON ANY FILE TO `user1`
    

    Example CTOP statement:

    CREATE TABLE mytable
       USING PARQUET
       OPTIONS (PATH='s3://my-root-bucket/subfolder/my-table')
    

Important

It is important to understand the security implications of granting ANY FILE permissions on a filesystem. You should only grant ANY FILE to privileged users. Users with lower privileges on the cluster should never access data by referencing an actual storage location. Instead, they should access data from tables that are created by privileged users, thus ensuring that Table ACLS are enforced.

In addition, if files in the Databricks root and data buckets are accessible by the cluster and users have MODIFY privileges, the admin should lock down the root S3 bucket.

Granting the data access privileges described above does not supersede any underlying IAM roles or S3 bucket policies. For example, if a grant statement like GRANT SELECT, MODIFY ON ANY FILE TO `user1` is executed but an IAM role attached to the cluster explicitly denies reads to the target S3 bucket, then the GRANT statement will not make the bucket or the objects within the bucket suddenly readable.