fizz.today

EKS private endpoints need a security group rule for Headscale

Switched our dev EKS cluster to private-only API endpoints. kubectl immediately stopped working over Headscale, even though DNS resolved correctly to the private endpoint IPs.
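For reference, a cluster can be flipped to private-only with the AWS CLI; something like the following (cluster name is illustrative, not the one from this post):

```shell
# Disable the public endpoint, keep the private one (cluster name illustrative)
aws eks update-cluster-config \
  --name dev-cluster \
  --resources-vpc-config endpointPublicAccess=false,endpointPrivateAccess=true
```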

The symptom

$ kubectl get nodes
Unable to connect to the server: dial tcp 10.50.118.197:443: i/o timeout

DNS was fine. The private endpoint IPs resolved correctly. But TCP connections to port 443 just hung.
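The DNS-is-fine/TCP-hangs split is easy to reproduce. A rough check, assuming the default kubeconfig context points at the cluster:

```shell
# Pull the API server hostname out of the current kubeconfig
ENDPOINT=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')
HOST=${ENDPOINT#https://}

# DNS resolves to the private endpoint IPs (10.50.x.x)
dig +short "$HOST"

# ...but a TCP connect to 443 times out instead of being refused
nc -vz -w 5 "$HOST" 443
```

A timeout (rather than "connection refused") is the tell: packets are being silently dropped, which points at a security group rather than at DNS or routing.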

The cause

By default, the cluster security group that EKS creates only allows ingress from itself: ENIs in that group (the nodes and the API server endpoints) can talk to each other, and nothing else can.


Our Headscale subnet router sits in a peered VPC (172.31.0.0/16). Traffic path looks like:

laptop → Headscale → subnet router (172.31.x.x) → VPC peering → EKS endpoint (10.50.x.x)

That last hop, from the peered VPC into the EKS endpoint ENI, gets silently dropped: the cluster security group has no rule allowing 172.31.0.0/16, and security groups drop rather than reject, which is why the connection hangs with an i/o timeout instead of failing fast.
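To confirm this, you can look up the cluster security group and dump its ingress rules; with a default setup the only inbound rule is the self-reference (cluster name and SG ID below are illustrative):

```shell
# Find the cluster security group EKS attached to the endpoint ENIs
aws eks describe-cluster --name dev-cluster \
  --query 'cluster.resourcesVpcConfig.clusterSecurityGroupId' --output text

# Inspect its ingress rules: by default, only a self-referencing allow,
# so nothing from the peered 172.31.0.0/16 range gets through
aws ec2 describe-security-groups --group-ids sg-0123456789abcdef0 \
  --query 'SecurityGroups[0].IpPermissions'
```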

The fix

Add one ingress rule to the EKS cluster security group:

Allow TCP 443 from 172.31.0.0/16

That’s it. kubectl works immediately.
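The rule above maps to one AWS CLI call (SG ID illustrative; use the clusterSecurityGroupId from describe-cluster):

```shell
# Allow kubectl traffic arriving via VPC peering from the Headscale subnet
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 443 \
  --cidr 172.31.0.0/16
```

Scoping the rule to the peered CIDR (rather than 0.0.0.0/0) keeps the endpoint reachable only through the VPN path.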

Why this is easy to miss

When EKS is public, kubectl goes over the internet and the security group isn’t involved. Switch to private-only and suddenly the SG matters for all API access, including traffic arriving via VPC peering from your VPN subnet router.

The mental model of “Headscale gives me access to the VPC” is correct but incomplete. You have access to the network, but security groups are still enforced at the ENI level. Peered VPC traffic is not “inside” the security group — it’s external traffic that needs an explicit allow.

Same fix applies to prod. Don’t forget both clusters.

#aws #eks #headscale #networking