AWS services fail with No region provided error
Problem: Your code snippets that use AWS services fail with a java.lang.IllegalArgumentException: No region provided error in Databricks Runtime 7.0 and above. The same code worked in Databricks Runtime 6.6 and below. You can verify the issue by running the example code snippet in a notebook. In Databricks Runtime 7.0 and above, it will return the ex...
Troubleshooting Amazon Redshift connection problems
Problem: You created a VPC peering connection and configured an Amazon Redshift cluster in the peer network. When you attempt to access the Redshift cluster, you get the following error: OperationalError: could not connect to server: Connection timed out. Cause: This problem can occur if: VPC peering is misconfigured. The corresponding p...
Vulnerability scan shows vulnerabilities in Databricks EC2 instances
Problem: The Corporate Information Security (CIS) Vulnerability Management team identifies vulnerabilities in AWS instances that are traced to EC2 instances created by Databricks (worker AMI). Cause: The Databricks security team addresses all critical vulnerabilities and updates the core and worker AMIs on a regular basis. However, if there are long-r...
Configure custom DNS settings using dnsmasq
dnsmasq is a tool for installing and configuring DNS routing rules for cluster nodes. You can use it to set up routing between your Databricks environment and your on-premises network. Warning: If you use your own DNS server and it goes down, you will experience an outage and will not be able to create clusters. Use the following cluster-scoped init s...
Unable to load AWS credentials
Problem: When you try to access AWS resources such as S3, SQS, or Redshift, the operation fails with the error: com.amazonaws.SdkClientException: Unable to load AWS credentials from any provider in the chain: [BasicAWSCredentialsProvider: Access key or secret key is null, com.amazonaws.auth.InstanceProfileCredentialsProvider@a590007a: The requested metad...
Access denied when writing logs to an S3 bucket
Problem: When you try to write log files to an S3 bucket, you get the error: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 2F8D8A07CD8817EA), S3 Extended Request ID: Cause: The DBFS mount is in an S3 bucket that assumes roles and uses SSE-KMS encryption. Th...
S3 part number must be between 1 and 10000 inclusive
Problem: When you copy a large file from the local file system to DBFS on S3, the following exception can occur: Amazon.S3.AmazonS3Exception: Part number must be an integer between 1 and 10000, inclusive. Cause: This is an S3 limit on the number of parts in a multipart upload. Parts can only be numbered from 1 to 10000, inclusive. Solution: To prevent this exception from occu...
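The part limit can be made concrete with a little arithmetic. The following is a minimal Python sketch, not the article's own solution: it assumes the upload is split into equal-size parts and computes the smallest part size that keeps a file within 10,000 parts. The function names are hypothetical, and the 5 MiB floor is S3's documented minimum size for every part except the last, which the excerpt does not mention.

```python
MAX_PARTS = 10_000               # S3 limit quoted in the error message
MIN_PART_SIZE = 5 * 1024 ** 2    # 5 MiB: S3 minimum for all parts but the last (assumption, not from the excerpt)


def part_count(file_size_bytes: int, part_size_bytes: int) -> int:
    """Number of parts needed when a file is split into equal-size parts."""
    if file_size_bytes <= 0 or part_size_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division avoids floating-point rounding on large files.
    return (file_size_bytes + part_size_bytes - 1) // part_size_bytes


def min_part_size(file_size_bytes: int) -> int:
    """Smallest part size (in bytes) that keeps the part count within MAX_PARTS."""
    if file_size_bytes <= 0:
        raise ValueError("file size must be positive")
    needed = (file_size_bytes + MAX_PARTS - 1) // MAX_PARTS
    return max(needed, MIN_PART_SIZE)


if __name__ == "__main__":
    size = 100 * 1024 ** 3  # a hypothetical 100 GiB file
    print("parts at 5 MiB each:", part_count(size, MIN_PART_SIZE))
    print("minimum safe part size (bytes):", min_part_size(size))
```

With 5 MiB parts, a 100 GiB file needs more than 10,000 parts, which is the situation the error describes; a larger part size brings the count back within the limit.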
How to analyze user interface performance issues
Problem: The Databricks user interface seems to be running slowly. Cause: User interface performance issues typically occur due to network latency or a database query taking more time than expected. To troubleshoot this type of problem, you need to collect network logs and analyze them to see which network traffic is affected. In most cases, ...
Unable to mount Azure Data Lake Storage Gen1 account
Problem: When you try to mount an Azure Data Lake Storage (ADLS) Gen1 account on Databricks, it fails with the error: com.microsoft.azure.datalake.store.ADLException: Error creating directory / Error fetching access token Operation null failed with exception java.io.IOException : Server returned HTTP response code: 401 for URL: https://login.windows....