Has anyone else seen TCP connection issues in AWS US East this week?
Over the last week we've seen intermittent TCP connection issues in the us-east-1 region in AWS. Has anyone else been seeing this?
It would help if you defined exactly what type of issues you're experiencing. Packet loss? Early RSTs? Latency? Single AZ or cross-AZ? Is it the same for intra-VPC traffic and NAT'd internet traffic?
You should take some tcpdumps and open a support case.
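To get something concrete onto the ticket, a rotating capture of just the SYN/RST traffic is usually enough. A minimal sketch that wraps tcpdump from Python (the interface name, port, and output path are assumptions, adjust them for your hosts, and it needs root):

```python
# Sketch: rotate small pcaps of just SYN/RST segments so there's evidence
# covering the window when a connection drops. Interface, port, and paths
# are assumptions -- adjust for your hosts. Requires root / CAP_NET_RAW.
import subprocess

IFACE = "eth0"      # assumption: primary ENI
APP_PORT = "443"    # assumption: the port your clients hit

cmd = [
    "tcpdump",
    "-i", IFACE,
    "-n",            # don't resolve names in the capture
    "-G", "300",     # rotate the output file every 5 minutes
    "-W", "48",      # stop after 48 rotated files (~4 hours of capture)
    "-w", "/var/tmp/syn-rst-%Y%m%d-%H%M%S.pcap",
    # capture only SYN and RST segments on the app port to keep files small
    f"port {APP_PORT} and tcp[tcpflags] & (tcp-syn|tcp-rst) != 0",
]
# blocks until tcpdump exits after writing the last rotated file
subprocess.run(cmd, check=True)
```

Attach a pcap that brackets one of the failures and the support case moves a lot faster.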
Yes, I have seen sporadic connection issues with various site-scraping functions my app uses. I figured it was a widespread issue, but I'm glad you made a post that basically confirms it.
One instance was randomly powered down about 22 hours ago.
We saw a synthetic monitor failure at midnight. Investigation of the transaction trace shows that a specific code path that normally takes ~100ms took almost 40,000ms.
It could have been unresponsive EBS, a failure to resolve the Redis server's IP address, or some other infrastructure-level failure. The synthetic browser saw it as a 502.
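If it does turn out to be the Redis lookup hanging, it's worth bounding that call explicitly so a stall surfaces as a fast error instead of a 40-second hang behind the load balancer. A rough sketch with redis-py, where the hostname and timeout values are assumptions:

```python
# Sketch: bound the Redis connect and per-command time so a stall fails
# fast instead of hanging the request until the ALB gives up with a 502.
# Host name and timeout values are assumptions.
import redis

client = redis.Redis(
    host="redis.internal.example.com",  # assumption: your Redis endpoint
    port=6379,
    socket_connect_timeout=2,  # seconds to establish the TCP connection
    socket_timeout=2,          # seconds for any single command round-trip
    retry_on_timeout=True,     # one retry on timeout before raising
)

try:
    value = client.get("some-key")
except (redis.exceptions.TimeoutError, redis.exceptions.ConnectionError):
    # degrade gracefully / log instead of stalling the whole code path
    value = None
```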
Perhaps related: our load tests this week showed an increase in 502s from the ALB. The app server request logs indicate those requests never made it past the ALB to the app servers.
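One thing that helped us separate "the target returned a 502" from "the request died at the ALB" was counting 502 entries in the ALB access logs where the target status is `-`. A quick sketch; the field positions are my assumption from the ALB access log format, so double-check them against the docs:

```python
# Sketch: count ALB 502s where the target never returned a response
# (target_status_code is "-"), i.e. the request never got an answer from
# a backend. Field positions 8/9 are assumptions based on the ALB access
# log format -- verify against the current docs before trusting them.
import gzip
import shlex
import sys

elb_502 = 0
never_answered = 0

for path in sys.argv[1:]:               # pass one or more .log.gz files
    with gzip.open(path, "rt") as fh:
        for line in fh:
            fields = shlex.split(line)  # handles the quoted request field
            if len(fields) < 10:
                continue
            elb_status, target_status = fields[8], fields[9]
            if elb_status == "502":
                elb_502 += 1
                if target_status == "-":
                    never_answered += 1

print(f"502s from ALB: {elb_502}, never answered by a target: {never_answered}")
```

If nearly all of the 502s have `-` for the target status, that lines up with the "requests never reached the app" pattern.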
We had a full 30-minute network drop-out in `us-west-2` this week. CloudTrail shows nothing.
We are seeing sporadic connection issues where TCP SYN packets are dropped before reaching our ELB. We've noticed it off and on for a few weeks now. Still investigating, and we have a support ticket open with AWS.
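In case it helps anyone else gather evidence: we put a dumb connect loop on an instance in the same region so we had a concrete failure rate to attach to the ticket. Sketch below; the hostname, port, and timing values are placeholders:

```python
# Sketch: measure how often a plain TCP connect to the load balancer
# times out, to put a concrete failure rate in the support ticket.
# Hostname, port, and timing values are placeholders.
import socket
import time

HOST = "my-elb.example.com"   # placeholder: your ELB DNS name
PORT = 443
TIMEOUT_S = 3
INTERVAL_S = 1
ATTEMPTS = 600                # ~10 minutes at one attempt per second

failures = 0
for _ in range(ATTEMPTS):
    start = time.monotonic()
    try:
        with socket.create_connection((HOST, PORT), timeout=TIMEOUT_S):
            pass
    except OSError:
        failures += 1
        print(f"connect failed after {time.monotonic() - start:.2f}s")
    time.sleep(INTERVAL_S)

print(f"{failures}/{ATTEMPTS} connects failed ({failures / ATTEMPTS:.2%})")
```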
Not so much TCP issues, but yes, we've seen an increased API call failure rate across multiple services (CloudFormation, EC2, RDS) in us-east-1. Mind you, the rate is still pretty low, but it's enough to notice a pattern.
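For what it's worth, raising the boto3 retry settings papered over most of the control-plane flakiness for us. A small sketch; the retry mode and attempt count are assumptions to tune for your workload:

```python
# Sketch: make boto3 clients retry transient API errors more aggressively.
# The max_attempts and mode values are assumptions -- tune for your workload.
import boto3
from botocore.config import Config

retry_config = Config(
    retries={
        "max_attempts": 10,   # default is lower; raise it for flaky periods
        "mode": "adaptive",   # client-side rate limiting + standard retries
    }
)

ec2 = boto3.client("ec2", region_name="us-east-1", config=retry_config)
cfn = boto3.client("cloudformation", region_name="us-east-1", config=retry_config)

# throttling / transient 5xx responses now get retried automatically
print(ec2.describe_regions()["Regions"][0]["RegionName"])
```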
We had a few minutes earlier this week where a machine saw packets in/out go to zero for no discernible reason.
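If you want to confirm that from outside the box, the per-instance CloudWatch NetworkIn/NetworkOut metrics should show the same flat line. A rough sketch using boto3; the instance ID and time window are placeholders:

```python
# Sketch: pull NetworkIn/NetworkOut for one instance to confirm the
# "packets went to zero" window from CloudWatch's point of view.
# Instance ID and time window are placeholders.
from datetime import datetime, timedelta, timezone

import boto3

cw = boto3.client("cloudwatch", region_name="us-east-1")
end = datetime.now(timezone.utc)
start = end - timedelta(hours=6)

for metric in ("NetworkIn", "NetworkOut"):
    resp = cw.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName=metric,
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
        StartTime=start,
        EndTime=end,
        Period=300,               # basic monitoring is 5-minute granularity
        Statistics=["Sum"],
    )
    for p in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
        flag = "  <-- flat" if p["Sum"] == 0 else ""
        print(f"{p['Timestamp']:%H:%M} {metric:10} {p['Sum']:>12.0f} bytes{flag}")
```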
I've not seen anything across 4 AZs.
Someone asked for a region where everything breaks all the time.
Yep, had issues yesterday. Botched a big deploy for me too.
In any specific AZ or across the board?