Someone overwrote my ECR tag with a different architecture
A content management deploy that had been green since March 3 failed with “no matching manifest for linux/amd64 in the manifest list entries.” The image tag hadn’t changed in the Ansible playbook. The ECR repo hadn’t been deleted. The tag still existed. But the manifest behind it was now arm64-only, and the target host was amd64.
Between March 5 and March 26, someone rebuilt the image for arm64 (Graviton nodes on EKS) and pushed it to the same tag. ECR accepted it. The tag moved, and the old amd64 manifest was no longer reachable through it. The Ansible playbook still referenced the tag by name, and docker compose pull got an image it couldn’t run.
The deploy had been working because the host had the old image cached. Either the cache was cleared or docker compose pulled fresh, and the mismatch surfaced.
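One way to confirm that a tag was re-pushed is to ask ECR for the digest and push time currently behind it. This is a minimal sketch, assuming the AWS CLI is configured for the account and region that hold the repo. The function name tag_info is hypothetical, not something from the original setup.

```shell
# Hypothetical helper: print the digest and push time behind a tag.
# Usage: tag_info <repository> <tag>
tag_info() {
  aws ecr describe-images \
    --repository-name "$1" \
    --image-ids "imageTag=$2" \
    --query 'imageDetails[0].[imageDigest,imagePushedAt]' \
    --output text
}
```

A push time later than the last known-good deploy suggests the tag was moved to a different image after that deploy.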
The fix had two parts. First, switch to an explicit -amd64 tag that a multi-arch build had already produced:
docling_image_tag: "086ca47c6753dd669724a749a26174d005c15003-amd64"
Second, enable tag immutability on the repo:
aws ecr put-image-tag-mutability \
  --repository-name docling-serve \
  --image-tag-mutability IMMUTABLE
Now a push to an existing tag fails instead of silently replacing the manifest. The next person who rebuilds for a different architecture has to use a new tag, which is the whole point.
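To check that the setting took effect, the repo's current mutability can be read back. This is a small sketch under the same assumption of a configured AWS CLI, and tag_mutability is a hypothetical name.

```shell
# Hypothetical helper: print a repo's tag mutability setting.
# Usage: tag_mutability <repository>
tag_mutability() {
  aws ecr describe-repositories \
    --repository-names "$1" \
    --query 'repositories[0].imageTagMutability' \
    --output text
}
```

After the change above, tag_mutability docling-serve should print IMMUTABLE.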
ECR defaults to mutable tags. Every repo you create starts that way. One put-image-tag-mutability call per repo, and the class of bug where “the tag is the same but the image is different” stops existing.
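Applying that one call across every repo can be scripted. This is a rough sketch, not a vetted tool: it assumes the AWS CLI is configured for the target account and region, and make_all_repos_immutable is a hypothetical name.

```shell
# Hypothetical helper: switch every ECR repo in the current account and
# region to immutable tags. Relies on shell word splitting, which is safe
# here because ECR repository names cannot contain whitespace.
make_all_repos_immutable() {
  for repo in $(aws ecr describe-repositories \
      --query 'repositories[].repositoryName' --output text); do
    aws ecr put-image-tag-mutability \
      --repository-name "$repo" \
      --image-tag-mutability IMMUTABLE >/dev/null \
      && echo "immutable: $repo"
  done
}
```

Any pipeline that re-pushes a moving tag such as latest will start failing its pushes once its repo is immutable, so it may be worth reviewing the repo list before running this.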