fizz.today

aws-error-utils gives you specific AWS exceptions instead of catch-all error swallowing

Many GitHub Actions workflows handle AWS errors with || true or continue-on-error: true. The workflow stays green. At best, the logs say “Error: Process completed with exit code 1.” Nobody knows what actually failed.

The problem

I had an ECR image digest resolver in a composite action. It calls aws ecr describe-images to get the sha256 digest for a tagged image. Three things can go wrong:

  1. The image tag doesn’t exist (deleted, never built, typo)
  2. The ECR repository doesn’t exist
  3. IAM permissions are wrong

All three produce different errors from the AWS API. All three need different responses. Swallowing them with || true makes them all look the same: silent failure.
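To make the "they all look the same" point concrete, here is a minimal sketch that needs no AWS access. The three subshells are hypothetical stand-ins for the three failures (the exit codes 1, 2, 3 are arbitrary choices for illustration), and it assumes a POSIX shell is available.

```python
import subprocess

# Hypothetical stand-ins: each failure mode is a subshell with its own exit code.
FAILURES = {"missing tag": 1, "missing repo": 2, "access denied": 3}


def run(shell_line):
    """Run one line through /bin/sh and return its exit status."""
    return subprocess.run(shell_line, shell=True).returncode


def observed_codes(swallow):
    """Exit status the caller sees for each failure, with or without '|| true'."""
    suffix = " || true" if swallow else ""
    return {name: run(f"(exit {code}){suffix}") for name, code in FAILURES.items()}


if __name__ == "__main__":
    print("raw:      ", observed_codes(swallow=False))
    print("|| true:  ", observed_codes(swallow=True))
```

Without the suffix the caller can tell the three cases apart. With it, every case collapses to 0 and the distinction is gone before anything downstream can react.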

aws-error-utils

pip install aws-error-utils — from Ben Kehoe, who’s been quietly fixing the AWS developer experience one micro-library at a time. If you’ve used aws-sso-util to make SSO bearable, aws-whoami to answer “what account am I in,” aws-export-credentials to stop fighting credential formats, or aws-assume-role-lib to chain role assumptions without losing your mind — this is the same person.

His writing is just as good as his code. His posts on AWS SSO configuration are how I finally understood the difference between ~/.aws/config and ~/.aws/credentials — and more importantly, how to stop putting secrets in my filesystem entirely. Clean profiles, no long-lived keys, SSO-only. If you’re still running aws configure and pasting access keys, go read his stuff before you do anything else.

aws-error-utils does one thing: make boto3 error handling not suck.

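import sys

import boto3
from aws_error_utils import catch_aws_error

ecr = boto3.client("ecr")  # region and credentials come from the environment


# resolve_digest is an illustrative name for the function this body lives in.
def resolve_digest(repo_name, tag):
    try:
        resp = ecr.describe_images(
            repositoryName=repo_name,
            imageIds=[{"imageTag": tag}],
        )
        return resp["imageDetails"][0]["imageDigest"]

    except catch_aws_error("ImageNotFoundException"):
        # Exit 1: the repository exists, the tag does not.
        print(f"::error::Image {repo_name}:{tag} not found", file=sys.stderr)
        sys.exit(1)

    except catch_aws_error("RepositoryNotFoundException"):
        # Exit 2: the repository itself is missing.
        print(f"::error::ECR repo '{repo_name}' does not exist", file=sys.stderr)
        sys.exit(2)

    except catch_aws_error("AccessDeniedException", "UnauthorizedAccess*"):
        # Exit 3: the caller's IAM permissions do not allow the call.
        print(f"::error::Access denied to '{repo_name}'", file=sys.stderr)
        sys.exit(3)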

Compare that to the boto3 native way:

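except ClientError as e:  # needs: from botocore.exceptions import ClientError
    if e.response["Error"]["Code"] == "ImageNotFoundException":
        ...  # handle the missing tag
    else:
        raise  # anything else has to be re-raised by hand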

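An attribute lookup and two dict lookups just to get an error code, plus a bare raise you have to remember for everything you did not handle. catch_aws_error makes it a one-liner that reads like English and accepts several codes in one call. One caveat: the matching I have seen documented is on exact codes (optionally qualified as "Operation.Code"), so verify that the trailing * in "UnauthorizedAccess*" really behaves as a wildcard before relying on it.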
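Why can a function call sit in an except clause at all? Python evaluates the expression after except while the exception is already in flight, so a helper can inspect it and return either a matching class or one that never matches. The following is a minimal sketch of that idea, not the library's actual source. FakeClientError, catch_code, _NoMatch, and handle are hypothetical names, and FakeClientError only imitates the response shape that botocore's ClientError carries.

```python
import sys


class FakeClientError(Exception):
    """Stand-in for botocore's ClientError with the same response layout."""

    def __init__(self, code, message=""):
        super().__init__(f"{code}: {message}")
        self.response = {"Error": {"Code": code, "Message": message}}


class _NoMatch(Exception):
    """Never raised, so an except clause naming it never fires."""


def catch_code(*codes):
    """Return the in-flight exception's class if its code is listed, else _NoMatch."""
    exc = sys.exc_info()[1]  # the exception currently being handled
    if isinstance(exc, FakeClientError) and exc.response["Error"]["Code"] in codes:
        return type(exc)
    return _NoMatch


def handle(code):
    """Raise a fake error with the given code and report which clause caught it."""
    try:
        raise FakeClientError(code)
    except catch_code("ImageNotFoundException"):
        return "image missing"
    except catch_code("RepositoryNotFoundException"):
        return "repo missing"
```

Any code that no clause names simply propagates, which is the behaviour the hand-written if/else version only gets if you remember the bare raise.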

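Each failure mode gets its own exit code. The ::error:: prefix makes GitHub Actions render the message as a red error annotation in the workflow UI, and writing it to stderr keeps it out of the stdout that the caller captures as the digest.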
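The print-then-exit pair repeats in every handler, so it can be pulled into a small helper. This is a sketch under assumptions: error_annotation and fail are hypothetical names, and the escaping follows my understanding of the workflow-command format, where %, carriage returns, and newlines in the message are percent-encoded so a multi-line message stays in one annotation.

```python
import sys


def escape_data(message):
    """Percent-encode the characters that would break a workflow command line."""
    return (
        message.replace("%", "%25")  # must go first so later escapes survive
        .replace("\r", "%0D")
        .replace("\n", "%0A")
    )


def error_annotation(message):
    """Format a message as a GitHub Actions error annotation."""
    return f"::error::{escape_data(message)}"


def fail(message, exit_code):
    """Emit the annotation on stderr, then exit with a failure-specific code."""
    print(error_annotation(message), file=sys.stderr)
    sys.exit(exit_code)
```

A handler then shrinks to a single line such as fail(f"ECR repo '{repo_name}' does not exist", 2).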

In a composite action

- name: Install aws-error-utils
  shell: bash
  run: pip install --quiet aws-error-utils

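- name: Resolve digest
  shell: bash
  env:
    # Pass the input through the environment instead of interpolating it into the script.
    ECR_REPO: ${{ inputs.ecr_repo }}
  run: |
    # TAG is assumed to be set by an earlier step or the job environment.
    DIGEST=$(python3 ecr_resolve.py "$ECR_REPO" "$TAG")
    echo "digest=$DIGEST" >> "$GITHUB_OUTPUT"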

The Python script is 30 lines. It replaces a bash function that used aws ecr describe-images ... || true and hoped for the best.

The pattern

Catch the specific error. Print what it means. Exit with a code that distinguishes the failure mode. Let the caller decide what to do about it — don’t decide for them by swallowing the error and proceeding with garbage state.
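Here is what "let the caller decide" can look like on the other side of the process boundary. It is a minimal sketch: the exit codes mirror the ones used in this post, while the ACTIONS table and the function names are illustrative assumptions, not part of the original action.

```python
import subprocess
import sys

# Exit codes mirror the script in this post; the responses are illustrative.
ACTIONS = {
    0: "use digest",
    1: "build the image first",
    2: "create the repository",
    3: "fix IAM permissions",
}


def decide(returncode):
    """Map an exit code to a response, refusing to guess on unknown codes."""
    return ACTIONS.get(returncode, "unknown failure, stop")


def run_and_decide(argv):
    """Run the resolver command and return (decision, captured stdout)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return decide(proc.returncode), proc.stdout.strip()


if __name__ == "__main__":
    # Simulate the resolver exiting with code 2 (repository missing).
    child = [sys.executable, "-c", "import sys; sys.exit(2)"]
    print(run_and_decide(child))
```

The resolver never has to know what the caller will do. It only has to report which failure happened and keep stdout clean for the success case.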

Ben’s libraries all follow this same philosophy: the AWS SDK gives you the building blocks, but the developer experience has sharp edges everywhere. His stuff files the edges down. No magic, no frameworks, just the thing you needed that should have been there from the start.

#aws #python #github-actions #ci-cd