kubectl auth can-i is the fastest RBAC smoke test I was missing
I used to test Kubernetes permissions the slow way.
Spin up a pod, run a controller, wait for it to fail, inspect logs, guess which RBAC rule is missing, patch, retry.
That approach works, but it burns time and produces noisy failures in the middle of incidents.
The faster path is kubectl auth can-i.
Why this is better during incident response
can-i asks the API server directly whether a given identity can perform a specific action on a specific resource.
No rollout required. No crash loop required. No speculative pod required.
You can impersonate service accounts with --as=system:serviceaccount:<namespace>:<name> and check the exact verb/resource scope your controller needs.
Example:

```shell
kubectl auth can-i \
  list workflows.argoproj.io \
  --as=system:serviceaccount:kubeflow:argo \
  --all-namespaces \
  --context mlinfra-prod
```
If that returns no, you have a deterministic RBAC blocker before you touch deployments.
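When the answer is no and you are not sure which rule is missing, kubectl auth can-i --list dumps everything an identity is allowed to do in a namespace. A minimal wrapper, reusing the service-account names from the example above (the wrapper itself is a sketch, not part of the article's script):

```shell
# Enumerate everything a service account may do in its own namespace,
# instead of probing one verb/resource pair at a time.
list_perms() {
  local ns=$1 sa=$2
  kubectl auth can-i --list \
    --as="system:serviceaccount:${ns}:${sa}" \
    --namespace "${ns}"
}

# e.g. list_perms kubeflow argo
```

The --list output is a table of resources and verbs, which is often enough to spot the one missing rule without guessing.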
Where this helped immediately
In a recent Kubeflow outage, two controllers were crash-looping with RBAC "forbidden" errors.
Instead of continuing the log archaeology, we validated permissions directly for both identities:

- kubeflow/argo → workflow access
- kserve-controller-manager → inference service access
That made the drift obvious: RoleBindings pointed at service accounts in different namespaces across clusters.
We aligned bindings, re-ran can-i, then restarted controllers.
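One quick way to surface that kind of drift: with -o wide, kubectl prints the service accounts each binding targets, so a grep per service account shows which namespace each cluster binds it in. A sketch (the "argo" name is from the example above; this helper is not from the published script):

```shell
# List every (Cluster)RoleBinding that mentions a given service account.
# Differing namespace prefixes in the SERVICEACCOUNTS column across
# clusters are the drift.
bindings_for_sa() {
  local sa=$1
  kubectl get rolebinding,clusterrolebinding --all-namespaces -o wide \
    | grep -E "NAME|${sa}"
}
```

Run it once per cluster context and diff the output.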
Practical smoke test pattern
I now keep a small set of can-i checks for critical controllers and run them after:
- cluster upgrades
- RBAC changes
- controller namespace moves
- restore/reconcile operations
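The loop behind those checks can be sketched as follows; the service-account names, namespaces, and resources here are illustrative, loosely based on the incident above, not a copy of the published script:

```shell
# Run a fixed table of can-i checks and fail if any identity is blocked.
run_checks() {
  local failed=0 c sa verb res
  # format: namespace:serviceaccount verb resource (illustrative entries)
  local checks=(
    "kubeflow:argo list workflows.argoproj.io"
    "kubeflow:kserve-controller-manager get inferenceservices.serving.kserve.io"
  )
  for c in "${checks[@]}"; do
    read -r sa verb res <<<"$c"
    if [ "$(kubectl auth can-i "$verb" "$res" \
          --as="system:serviceaccount:${sa}" --all-namespaces)" != "yes" ]; then
      echo "FAIL: ${sa} cannot ${verb} ${res}" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

Wiring this into CI or a post-upgrade runbook turns the incident-time diagnostic into a standing guardrail.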
I published the script we now use:
kubeflow-rbac-smoke.sh: https://gist.github.com/fizz/2e64204a5fd8767ced6a4ac247aa4b5f
You can keep this lightweight and still get strong signal. Seven focused checks caught the exact class of drift that had already cost us production time.
If the question is “will this controller be allowed to start cleanly,” can-i should be your first command, not your last resort.