Query your AWS cost recommendations with DuckDB instead of clicking through the console
AWS Data Exports will write your Cost Optimization Hub recommendations to S3 as Parquet, refreshed daily. The same feature also delivers billing data in FOCUS format — the FinOps Foundation’s vendor-neutral schema for cloud billing data, with standard column names across AWS, Azure, and GCP, so it works with any finops tool without transformation. AWS adopted FOCUS as an export option, and it’s the cleaner alternative to the legacy CUR format. No CSV downloads, no ETL, no schlepping.
Most people don’t turn it on. They click through the console instead — filtering by action type, sorting by savings, squinting at a table that doesn’t let you group or aggregate.
I turned on the export, pointed it at an S3 bucket, and never opened the console again.
create temporary table recs as select * from read_parquet(
  's3://fdsapro-finops/recommendations/CostOptimizationHubRecs/data/date=2025-02-24/*.parquet'
);

select action_type,
       round(sum(estimated_monthly_savings_before_discount), 2) as estimated_savings,
       count(*) as count
from recs
group by action_type
order by estimated_savings;
Five action types, ranked by monthly savings:
action_type | estimated_savings | count
----------------------+-------------------+-------
Rightsize | 0.36 | 2
Upgrade | 38.52 | 18
Delete | 94.70 | 40
PurchaseReservedInst | 99.89 | 3
MigrateToGraviton | 256.81 | 11
$490 in recommended monthly savings, broken down by effort category, in one query. The console shows you this too — but you have to click to get there, and you can’t reshape it.
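And reshaping is one more query. A sketch of drilling into the Delete bucket, ranked by what each recommendation is worth — I select * here rather than guess at the rest of the export’s column names; describe recs will list them:

select *
from recs
where action_type = 'Delete'
order by estimated_monthly_savings_before_discount desc
limit 10;

Forty Delete recommendations is a lot of noise; the top ten by savings is usually where the money is.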
The DuckDB wrapper that makes this work
DuckDB’s httpfs extension reads Parquet directly from S3, but it needs AWS credentials. I don’t pass them manually — a shell wrapper injects whatever credentials are ambient in my current working directory:
duckdb() {
  if [[ -z "$AWS_PROFILE" ]]; then
    echo "AWS_PROFILE is not set." >&2
    return 1
  fi
  # --exec runs the real duckdb binary via PATH lookup, so the function doesn't recurse
  aws-export-credentials --profile "$AWS_PROFILE" --exec duckdb "$@"
}
aws-export-credentials resolves the named profile — including SSO sessions, assumed roles, and MFA — into exported AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN environment variables. DuckDB’s httpfs picks them up automatically. No config files, no hardcoded keys, no credential juggling across accounts.
Each project directory on my laptop is a shrine to one AWS context. direnv, two files, one command:
echo dotenv > .envrc
echo AWS_PROFILE=fdsapro >> .env
direnv allow
Now when I cd into that directory, direnv activates the profile. I type duckdb and I’m querying that account’s S3 data. The credentials follow the directory, not the tool.
Before the export
I checked Cost Optimization Hub by logging into the console, navigating to the hub, applying filters, and mentally summing rows. I did this once a quarter, if that, because it was tedious.
Once the recurring export is configured, I don’t move files around. The Parquet is ambiently in my S3 bucket — the CUR and FOCUS exports land there daily on different paths. I’m in that AWS account’s local shrine on my laptop, where direnv has already set my AWS_PROFILE. Everything is mise en place — I type duckdb and query. If I need the bare binary without credentials — local CSV, in-memory scratch — command duckdb bypasses the function. Usually the trigger is miserbot flagging a spend change in its daily Slack report. The SQL is instant, the data is current as of last night’s export, and I can slice it any way I want.
During a finops sprint in January, I ran this query every morning and watched $716 in available savings drop to $490 as I worked through the recommendations. $226/mo captured in two weeks. Not because the recommendations changed — because I could see them clearly enough to prioritize and execute.
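That morning check is the same aggregation pointed at every partition instead of one day. A sketch, using the same bucket and prefix as above: swap the fixed date for a wildcard and turn on hive_partitioning, which exposes the date= path segment as a queryable column, and the trend falls out of a single query:

select "date",
       round(sum(estimated_monthly_savings_before_discount), 2) as available_savings
from read_parquet(
  's3://fdsapro-finops/recommendations/CostOptimizationHubRecs/data/date=*/*.parquet',
  hive_partitioning = true
)
group by "date"
order by "date";

Each daily export is a full snapshot, so this is exactly the $716-to-$490 curve, one row per day.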
Setup: in the AWS Billing console, go to Data Exports → Create export. Choose the Cost Optimization Hub recommendations export type, Parquet output, and point it at an S3 bucket; FOCUS 1.0 and CUR 2.0 are separate export types on the same screen. The export refreshes daily, and the first one appears within 24 hours.
#duckdb #aws #finops #parquet #platformengineering #dataengineering