Troubleshooting Amazon Redshift Connection Problems

Problem

You created a VPC peering connection and configured an Amazon Redshift cluster in the peer network. When you attempt to access the Redshift cluster, you get the following error:

Error message: OperationalError: could not connect to server: Connection timed out

Cause

This problem can occur if:

  • VPC peering is misconfigured.
  • The Redshift port is blocked at the network level by security groups (SG), network access control lists (NACL), or routing issues.

Troubleshooting

Step 1. Test the connection

Check the AWS console and make sure the Redshift cluster is online in the target VPC. Run the following commands to see if the connection to the cluster can be established:

%sh nc -zv <hostname> <port>
%sh lft <hostname>:<port>
%sh telnet <hostname> <port>

The connection should succeed and show the port as open, as in the following output. If it does not, go to Step 2.

xx.c1prsbaxxxxx.us-west-2.redshift.amazonaws.com [192.168.50.209] 5439 (?) open
../_images/redshift-connectivity-1.png
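
The same check can be scripted from a Python notebook cell. The following is a minimal sketch using only the standard library; the function name `check_port` and the status labels it returns are illustrative choices, not part of any Databricks or AWS tooling. It attempts a TCP connection, roughly like `nc -zv`, and reports how the attempt ended.

```python
import socket


def check_port(host, port, timeout=5.0):
    """Classify a TCP connection attempt, roughly like `nc -zv host port`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.gaierror:
        # The hostname could not be resolved to an address.
        return "dns_error"
    except (socket.timeout, TimeoutError):
        # No reply within the timeout: the symptom described in this article.
        return "timed_out"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on the port.
        return "refused"
    except OSError:
        # Any other network failure, for example no route to host.
        return "unreachable"


# Example call (replace the placeholders with your cluster endpoint):
# print(check_port("<hostname>", 5439))
```

A result of `timed_out` or `dns_error` corresponds to the two failure cases covered in Step 2.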

Step 2. Check for VPC peering or DNS error

If the connection fails with either of the following errors, then the issue is a VPC peering or DNS error.

  1. The following error indicates a VPC peering issue. Go to Step 3.

    xx.c1prsbaxxxxx.us-west-2.redshift.amazonaws.com [192.168.50.108] 5439 (?) : Connection timed out
    
    .. figure:: /_static/images/cloud/redshift-connectivity-12.png
    
  2. The following error indicates a DNS lookup failure:

    redshift.c1prsbaxxxx.us-west-2.redshift.amazonaws.com: forward host lookup failed: Unknown host
    

If you see the DNS lookup error, check that:

  • The Redshift cluster is up.

  • The hostname is typed correctly.

  • The Redshift cluster IP address works in place of the hostname.

  • DNS resolution works, using nslookup:

    %sh nslookup <hostname>
    

If all of these checks appear normal, the problem may lie elsewhere. Go to Step 4.
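
The DNS check can also be done in Python. This is a minimal sketch, assuming only the standard library; `resolve` and `is_private` are hypothetical helper names. It lists the addresses a hostname resolves to and reports whether each one is a private address. In the successful output shown in Step 1 the endpoint resolves to a private address (192.168.50.209), so a public address here may be worth a closer look, though that expectation is an assumption about this setup rather than a rule.

```python
import ipaddress
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated addresses for hostname, or [] if lookup fails."""
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def is_private(ip):
    """True if ip belongs to a private (non-public) address range."""
    return ipaddress.ip_address(ip).is_private


# Example call (replace the placeholder with your cluster endpoint):
# for ip in resolve("<hostname>"):
#     print(ip, "private" if is_private(ip) else "public")
```

An empty list from `resolve` matches the "Unknown host" error shown above.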

Step 3. Check the VPC peering and DNS settings

If Step 2 revealed a VPC peering or DNS issue:

  1. Validate the peering configuration.

    • Make sure the requester and accepter VPC IDs on the peering connection are correct. Note the peering connection ID and the CIDR blocks of the requester and accepter VPCs.

    • Confirm that the peering connection is active from the target VPC.

      ../_images/redshift-connectivity-2.jpg
    • Make sure that DNS resolution is turned on for the Redshift VPC. When you’re done, go to Step 4.

      ../_images/redshift-connectivity-4.jpg
  2. Check the following components in the Databricks deployment VPC.

    • Verify that the CIDR of the target VPC (Redshift) is in the route table of the deployment VPC and that its route target is the peering connection ID.

      ../_images/redshift-connectivity-3.jpg
    • Check the NACL attached to the subnets and allow all traffic to Redshift in both the inbound and outbound rules.

      ../_images/redshift-connectivity-6.jpg
    • Check the security group of the deployment VPC. It should be an unmanaged security group. Make sure that port 5439 (Redshift) is open to the target security group that is attached to Redshift.

      ../_images/redshift-connectivity-5.jpg
  3. Check the following components in the Redshift VPC.

    • Verify that the CIDR of the target VPC (Databricks deployment) is in the route table of the Redshift VPC and that its route target is the peering connection ID.

      ../_images/redshift-connectivity-8.jpg
    • Check the NACL and allow all traffic from Redshift in both the inbound and outbound rules.

      ../_images/redshift-connectivity-7.jpg
    • Check the security group attached to the Redshift cluster. Make sure that port 5439 (Redshift) is open to the target security group (the unmanaged security group inside the Databricks VPC).

      ../_images/redshift-connectivity-11.jpg
      ../_images/redshift-connectivity-9.jpg
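
The route table and security group checks in this step can be sketched in code. The example below is illustrative only: it assumes you have already fetched the route list and the security group permissions as dictionaries shaped like the `Routes` and `IpPermissions` entries that boto3 returns from `describe_route_tables` and `describe_security_groups`. The helper names and the IDs in the usage comments (`pcx-123`, `sg-abc`) are hypothetical.

```python
import ipaddress


def best_route(ip, routes):
    """Return the most specific active route whose destination contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    best, best_len = None, -1
    for route in routes:
        cidr = route.get("DestinationCidrBlock")
        # Skip non-IPv4 routes and routes that are not active (for example blackholed).
        if cidr is None or route.get("State", "active") != "active":
            continue
        net = ipaddress.ip_network(cidr)
        if addr in net and net.prefixlen > best_len:
            best, best_len = route, net.prefixlen
    return best


def routed_via_peering(ip, routes, pcx_id):
    """True if traffic to ip would be sent to the given peering connection."""
    route = best_route(ip, routes)
    return route is not None and route.get("VpcPeeringConnectionId") == pcx_id


def port_allowed(permissions, port, peer_group_id=None, peer_ip=None):
    """True if any rule admits TCP `port` from the peer security group or peer IP."""
    for rule in permissions:
        proto = rule.get("IpProtocol")
        if proto == "-1":  # "-1" means all protocols and all ports
            port_ok = True
        elif proto == "tcp":
            port_ok = rule.get("FromPort", 0) <= port <= rule.get("ToPort", 65535)
        else:
            port_ok = False
        if not port_ok:
            continue
        groups = rule.get("UserIdGroupPairs", [])
        if peer_group_id and any(g.get("GroupId") == peer_group_id for g in groups):
            return True
        ranges = rule.get("IpRanges", [])
        if peer_ip and any(
            ipaddress.ip_address(peer_ip) in ipaddress.ip_network(r["CidrIp"])
            for r in ranges
        ):
            return True
    return False


# Example calls (IDs and addresses are placeholders):
# routed_via_peering("192.168.50.209", routes, "pcx-123")
# port_allowed(permissions, 5439, peer_group_id="sg-abc")
```

The route lookup picks the most specific matching route, so a broader route (such as a default route to an internet gateway) does not mask a missing peering route. Run the checks once against each VPC, since both sides need a route and an open port.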

Step 4. Validate and verify the connectivity between the peered VPCs

Perform the connection test again:

%sh nc -zv <hostname> <port>

The connection test should succeed. If not, contact Databricks Support.

xx.c1prsbaxxxxx.us-west-2.redshift.amazonaws.com [192.168.50.209] 5439 (?) open
../_images/redshift-connectivity-10.png
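
When re-testing after configuration changes, it can help to retry a few times so that a single transient failure is not mistaken for a configuration problem. The following is a minimal standard-library sketch; `wait_for_port` and its default values are illustrative assumptions, not recommended settings.

```python
import socket
import time


def wait_for_port(host, port, attempts=5, delay=3.0, timeout=5.0):
    """Retry a TCP connect; return the 1-based attempt that succeeded, or 0 if none did."""
    for attempt in range(1, attempts + 1):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return attempt
        except OSError:
            # Covers timeouts, refusals, and DNS failures; pause before retrying.
            if attempt < attempts:
                time.sleep(delay)
    return 0


# Example call (replace the placeholder with your cluster endpoint):
# print(wait_for_port("<hostname>", 5439))
```

A return value of 0 means every attempt failed, which matches the case where the article suggests contacting Databricks Support.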