Three layers of indirection: how API Gateway hides your Lambda version pin
I knew the Lambda was invoking Version 1 instead of $LATEST. I knew Version 1 was frozen without VPC config. What I didn’t know was where the version number lived. There was no architecture doc. There was no one to ask. There was a running cluster and a database I could reach through an ephemeral pod.
The Lambda function name wasn’t hardcoded anywhere I could find. The inference pipeline stores an API Gateway URL in a Postgres deployment table — just a URL, no ARN, no version qualifier. I spun up a throwaway psql container in the same VPC and queried the table expecting to find :1 somewhere in the record. Nothing. Clean URL.
So I pulled the API Gateway integration:
aws apigateway get-integration \
--rest-api-id abc123 --resource-id xyz789 \
--http-method ANY --query 'uri'
arn:aws:lambda:...:function:my-func:1/invocations
There it is. :1, baked into the integration URI when the deployment pipeline created the API Gateway. The API Gateway name had it too — dev-lambda-my-func:1-API. The tags had it — deployment: dev-lambda-my-func:1. Written in three places, visible in none of the usual diagnostic paths.
Postgres → https://abc123.execute-api.us-east-1.amazonaws.com/dev
↓
API GW → arn:aws:lambda:...:function:my-func:1/invocations
↓
Lambda → Version 1 (frozen config, no VPC, ConnectTimeoutError)
Then I found the second invocation path. I didn’t have the source repo, but the pods were running, so I kubectl exec’d into the inference container and read the code off disk. The app has a url2arn() function that reads the API Gateway integration at runtime, splits the URI, and extracts the Lambda ARN for direct boto3 invocations:
arn = integration["uri"].split("/")[-2]
# "arn:aws:lambda:us-east-1:123456789:function:my-func:1"
The :1 qualifier comes along for the ride. So the stale version poisons both the HTTP path (client → API Gateway → Lambda:1) and the direct-invoke path (client → url2arn() → boto3 → Lambda:1). Neither path stores the version itself — both inherit it from the integration URI.
update-function-configuration doesn’t touch API Gateway. publish-version doesn’t touch API Gateway. The version pin survives every Lambda operation and is invisible until you specifically call get-integration.
Updating the integration from :1 to :2 took three commands: update-integration, add-permission on the new version (published versions don’t inherit resource policies), create-deployment to push the stage. 30 seconds to apply, 6 hours to find.
While I was reading code off that container, I found a scale_lambda() function that creates aliases for provisioned concurrency — 50 lines away from the code that created the integration. If the deployment path used my-func:live instead of my-func:1, the API Gateway integration would survive every configuration change without anyone touching it. update-alias is one command. The pattern was right there. Nobody had wired it in.