Homegrown FinOps Tools: How AI “Build” Beat “Buy” for Us in <1 Year
Self-built AI tools and agents turned a one-person FinOps function into a cost optimization engine. Here’s how.
In April 2025, Amplitude officially started its FinOps org. I joined as the first and only FinOps Engineer, and my first big task was to pick our FinOps tool.
Traditionally, this is a no-brainer: for a company of Amplitude’s size and scale, you buy your FinOps platform.
However, April 2025 was also when Amplitude began a strategic pivot to becoming AI-first. That mandate showed up everywhere: in how we build and ship code, how we improve features through the sense/decide/act loop, and even in how we handle day-to-day work questions.
I wanted to rethink how an AI-native company would approach FinOps. Using the FinOps Foundation Framework as a guiding principle, I built my own tools and solved problems the AI way. After only a year, that approach has already had a significant impact on our efficiency.
Here are some of my experiences and what I built during a year of AI FinOps at Amplitude.
How AI changes the FinOps “build vs. buy” question
In FinOps, you need the right tools to ingest billing data from many sources, normalize it, and expose reports and dashboards on top. That’s how you’re able to justify the business value of your company’s tech stack and find ways to optimize it.
A traditional FinOps playbook might encourage us to scale FinOps by procuring a foundational FinOps tool. But with the advent of AI coding assistants, I decided to see if I could build our own economically.
In April 2025, Amplitude hosted an AI Hackathon Week where we learned to use the new AI coding assistants and saw what we could accomplish in a week. I was blown away by what they were capable of. The tab completions in GitHub Copilot had felt magical, but this was on another level.
Just as important as the coding assistants are the internal agents I built. Together, they changed FinOps from me manually answering every question to a system that encodes and reuses knowledge. By my estimate, they saved 50% of my time, letting me focus on cost optimization rather than answering questions or researching issues.
First steps: Data foundations and 3 AI Agents, v1
(Note: Agent names are borrowed from my favorite K-pop group. I’ll let you figure out which one.)
Redshift for data infrastructure
The first decision was where to store and normalize our billing data. I needed something that could query AWS Cost and Usage Reports directly in S3 (without building and maintaining a full ETL pipeline) and also let me define a single normalization layer that stays fresh without manual rebuilds.
Redshift checked both boxes: External Schemas (via Spectrum) let me query CUR data in place from S3 via the Glue Catalog, and Materialized Views gave me a single, easy-to-maintain table where all normalization logic lives as a single source of truth. When we needed non-AWS vendor data, I added lightweight Lambda functions to fetch and insert it into Redshift—no new pipeline architecture required. We considered other options, such as Snowflake, RDS, and BigQuery, but ultimately Redshift was the cheapest option that met both requirements.
To set this up, I worked with the AI coding assistants to write all the infrastructure-as-code required to maintain the Redshift cluster, the Materialized Views, and the refresh schedule.
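As a rough sketch of the shape of that setup, here are the core statements issued through the Redshift Data API. Everything here is illustrative: the cluster name, Glue database, IAM role, and column list are placeholders, and the real mv_normalized_costs carries far more normalization logic.

```python
import boto3

rsd = boto3.client("redshift-data")  # Redshift Data API

DDL = [
    # External schema: query CUR files in S3 in place, via the Glue Catalog.
    """
    CREATE EXTERNAL SCHEMA IF NOT EXISTS cur
    FROM DATA CATALOG DATABASE 'aws_cur'
    IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-spectrum-role';
    """,
    # Single normalization layer: all mapping/cleanup logic lives in one
    # materialized view that every downstream view and agent queries.
    """
    CREATE MATERIALIZED VIEW mv_normalized_costs AS
    SELECT
        line_item_usage_account_id AS account_id,
        product_product_name       AS service,
        line_item_usage_type       AS usage_type,
        DATE_TRUNC('day', line_item_usage_start_date) AS usage_date,
        SUM(line_item_unblended_cost) AS cost
    FROM cur.cost_and_usage_report
    GROUP BY 1, 2, 3, 4;
    """,
    # MVs over Spectrum external tables can't auto-refresh, so a scheduled
    # job (e.g., EventBridge -> Lambda) re-runs this periodically.
    "REFRESH MATERIALIZED VIEW mv_normalized_costs;",
]

for sql in DDL:
    rsd.execute_statement(
        ClusterIdentifier="finops-cluster",  # placeholder name
        Database="finops",
        DbUser="finops_admin",
        Sql=sql,
    )
```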
Agent YA for Slack-based answers
During AI Hackathon Week, I decided to build our first “AI Slack Bot” to help answer AWS cost-related questions. Reflecting on how I’d do the work by hand, the first iteration of YA simply took the user’s question, generated a SQL query, ran it, and returned the results.
YA was designed with:
- The entire schema of mv_normalized_costs (about 40 columns) in its system prompt
- Conversation “memory” that recorded previous questions and answers and injected them into the prompt
- An explainer agent that would interpret the SQL results and attempt to answer the question
Ask a question in Slack, get an instant breakdown of your top AWS RDS cost drivers.
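Stripped of the Slack plumbing, the v1 loop looked roughly like this. Treat it as a sketch: the model, client, and helper names are illustrative, not the production code.

```python
from openai import OpenAI  # assumes the OpenAI SDK; any LLM provider works

llm = OpenAI()

# v1 design: the full ~40-column schema of mv_normalized_costs lived
# directly in the system prompt (heavily abbreviated here).
SYSTEM_PROMPT = (
    "You are a FinOps SQL assistant. Write one Redshift SQL query against "
    "mv_normalized_costs (account_id, service, usage_type, usage_date, "
    "cost, ...). Return only the SQL."
)

def run_redshift_query(sql: str) -> list:
    """Placeholder: execute via the Redshift Data API and return rows."""
    ...

def answer(question: str, memory: list[dict]) -> str:
    # Conversation "memory": previous Q&A turns are injected into the prompt.
    messages = [{"role": "system", "content": SYSTEM_PROMPT},
                *memory,
                {"role": "user", "content": question}]
    sql = llm.chat.completions.create(
        model="gpt-4o", messages=messages,
    ).choices[0].message.content

    rows = run_redshift_query(sql)

    # Explainer pass: interpret the raw rows and answer in plain language.
    explanation = llm.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user",
                   "content": f"Question: {question}\nSQL results: {rows}\n"
                              "Answer the question in plain language."}],
    ).choices[0].message.content

    memory += [{"role": "user", "content": question},
               {"role": "assistant", "content": explanation}]
    return explanation
```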
Initially, YA did some things very well:
- It generated pretty complex SQL queries, such as “month-over-month change” comparisons
- It generated SQL much faster than I could write it
But it also had some drawbacks:
- If users weren’t familiar with the schema, it would generally fail to generate the correct query, especially for non-standard tables.
- It would require the user to know exactly what values to ask for. For example, if the service was tagged foo_bar but the user asked for foobar, YA would simply return zero rows.
Later, v1.1 added a “Clarification Step” that disambiguates the user’s question before the SQL Generator retrieves the data.
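The value-matching half of that step is easy to picture. A minimal illustration (the real Clarification Step reasons over the whole question, but the core idea is snapping user terms onto values that actually exist in the data):

```python
import difflib

def clarify_value(user_value: str, known_values: list[str]) -> str | None:
    """Snap a user-supplied value onto the closest known value, if any."""
    match = difflib.get_close_matches(user_value, known_values, n=1, cutoff=0.8)
    return match[0] if match else None

# known_values would come from e.g. SELECT DISTINCT service FROM mv_normalized_costs
print(clarify_value("foobar", ["foo_bar", "baz_service"]))  # -> "foo_bar"
```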
Agent TY for cost anomalies
Cost anomalies were one of the most time-consuming parts of my day. I’d manually check dashboards and try to eyeball what looked off. If something looked fishy, I would have to spend hours digging through multiple data sources to figure out what changed and why. Agent TY was created to automate that entire loop.
TY was designed with:
- A vw_cost_anomalies view that sits on top of mv_normalized_costs to identify anomalous changes in spending
- Multiple views similar to the dashboards I would use to research the issue, such as vw_top_usage_spend_by_service or vw_top_resource_spend_by_service
The agent would identify the anomalous service and try to determine which usage, resource, or other factor caused the anomaly. A report would be created in Slack, structured around five sections: Who, What, Where, When, and How.
AI catches a DynamoDB cost spike, explains what changed, and recommends what to look at next.
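Under the hood, vw_cost_anomalies is conceptually simple: flag services whose daily spend lands well outside their recent baseline. A hedged sketch, reusing the illustrative columns from earlier (the real view carries more guardrails):

```python
# Issued once through the Redshift Data API, like the DDL above.
VW_COST_ANOMALIES = """
CREATE OR REPLACE VIEW vw_cost_anomalies AS
WITH daily AS (
    SELECT service, usage_date, SUM(cost) AS daily_cost
    FROM mv_normalized_costs
    GROUP BY 1, 2
), stats AS (
    SELECT service,
           AVG(daily_cost)    AS mean_cost,
           STDDEV(daily_cost) AS stddev_cost
    FROM daily
    WHERE usage_date >= DATEADD(day, -30, CURRENT_DATE)
    GROUP BY 1
)
SELECT d.service, d.usage_date, d.daily_cost,
       (d.daily_cost - s.mean_cost) / NULLIF(s.stddev_cost, 0) AS z_score
FROM daily d
JOIN stats s USING (service)
WHERE d.usage_date = CURRENT_DATE - 1
  AND ABS((d.daily_cost - s.mean_cost) / NULLIF(s.stddev_cost, 0)) > 3;
"""
```

When a row shows up here, TY picks it up and starts drilling into the usage- and resource-level views to build its report.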
Agent YR for reservations analysis
Reservations and Savings Plans are one of the biggest levers for cost optimization, but tracking utilization, coverage, and expirations across multiple AWS services is tedious spreadsheet work. Agent YR was built to automate that analysis and surface actionable recommendations weekly.
Using the AWS SDK, YR pulls utilization and coverage data for Compute Savings Plans, ElastiCache, and RDS alongside existing reservation inventory. Then, it normalizes all instance data to a common unit (xlarge) so we can compare coverage across different instance sizes within the same family—without that normalization, a mix of large, 2xlarge, and 4xlarge instances makes apples-to-apples analysis impossible.
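The normalization itself is straightforward arithmetic, since AWS sizes scale linearly (a 2xlarge is two xlarges, a large is half of one). A minimal sketch:

```python
# Normalization factors in xlarge units (sizes scale linearly).
XLARGE_UNITS = {
    "medium": 0.25, "large": 0.5, "xlarge": 1.0,
    "2xlarge": 2.0, "4xlarge": 4.0, "8xlarge": 8.0, "16xlarge": 16.0,
}

def to_xlarge_units(instance_type: str, count: int) -> float:
    """Convert e.g. 3 x cache.r6g.2xlarge into 6.0 xlarge-equivalents."""
    size = instance_type.rsplit(".", 1)[-1]
    return XLARGE_UNITS[size] * count

# With everything in one unit, coverage becomes a simple ratio:
reserved = to_xlarge_units("cache.r6g.4xlarge", 2)      # 8.0 units
running = (to_xlarge_units("cache.r6g.2xlarge", 3)
           + to_xlarge_units("cache.r6g.large", 4))     # 8.0 units
coverage = reserved / running
print(f"coverage: {coverage:.0%}")                      # 100%, fully covered
```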
Each week, YR sends a Slack report with current coverage and utilization numbers, flags upcoming expirations, and recommends net new reservations.
The weekly AWS reservation health report flags wasted spend, expiring reservations, and savings opportunities.
Iterating and improving
Agents v2
The v1 agents worked, but they didn’t scale. Each agent had its own bloated system prompt with baked-in schema definitions, and none could share data or tools. If I added a new table, I had to update every agent individually.
For v2, I refactored around a single idea: turn data access into shared tools rather than embedded knowledge.
- I converted YA’s SQL Generator for mv_normalized_costs into a tool that all agents can use.
- YA became a single unified agent with a much smaller system prompt and many registered tools. I leave it to the unified agent to decide which tool to use and when.
- Other tables were wrapped in tools of their own, each with a definition of its schema and how to interpret the data, as sketched below.
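Concretely, each table becomes a registered tool with a description the model can read and a handler that does the work. A minimal sketch with hypothetical names:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str  # what the unified agent reads when picking a tool
    handler: Callable[[str], list]

REGISTRY: dict[str, Tool] = {}

def register(tool: Tool) -> None:
    REGISTRY[tool.name] = tool

def generate_and_run_sql(table: str, question: str) -> list:
    """Placeholder: LLM writes SQL for `table`; we execute and return rows."""
    ...

# YA's v1 SQL Generator, now a shared tool any agent can call.
register(Tool(
    name="query_normalized_costs",
    description=("Generate and run SQL against mv_normalized_costs "
                 "(account_id, service, usage_type, usage_date, cost, ...)."),
    handler=lambda q: generate_and_run_sql("mv_normalized_costs", q),
))

# Adding a table is one more registration; no agent prompts to update.
register(Tool(
    name="query_reservation_inventory",
    description="Query current reservation inventory and expirations.",
    handler=lambda q: generate_and_run_sql("vw_reservation_inventory", q),
))
```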
This refactor had several advantages:
- Adding new tables and tools became much simpler. All I had to do was add a new tool with a description of the table’s schema.
- It also allowed TY to use the data from mv_normalized_costs to do its work rather than relying on static views to research anomalies. I leave it to TY to call the SQL Generator tool as needed to triangulate the cause of the anomaly.
- YR could now analyze current usage and identify the specific resources that caused a drop in utilization or coverage. Later, with the addition of the Datadog MCP, YR could even recommend migrating certain resources from one instance type to another.
In this example, Agent YA reminded me why I had a task to migrate our ElastiCache cluster off r6g instances, and it recommended what I should do over the following months.
AI-powered ElastiCache analysis explains reservation utilization, renewal risks, and the reasons for creating migration tasks.
Data foundations v2
At this point, it was becoming increasingly difficult to maintain all the separate Lambda functions pulling from various vendors, so I consolidated them into a single service called data-orchestrator:
- It handles all the complex logic for pulling data from multiple sources and recreating any necessary views.
- AWS Step Functions orchestrates the flow, allowing for parallel steps (e.g., fanning out to collect data from multiple sources) as well as dependencies (e.g., creating a view only after all data has been collected), as sketched below.
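A simplified version of that state machine, in Amazon States Language: a Parallel fan-out over the vendor pulls, then a view-rebuild step that runs only once every branch has finished. The Lambda ARNs and state names are placeholders.

```python
import json

# Amazon States Language definition for the data-orchestrator flow.
STATE_MACHINE = {
    "StartAt": "CollectAllVendors",
    "States": {
        # Fan out: each branch pulls one vendor's billing data.
        "CollectAllVendors": {
            "Type": "Parallel",
            "Branches": [
                {"StartAt": "PullAwsCur",
                 "States": {"PullAwsCur": {
                     "Type": "Task",
                     "Resource": "arn:aws:lambda:<region>:<acct>:function:pull-aws-cur",
                     "End": True}}},
                {"StartAt": "PullDatadog",
                 "States": {"PullDatadog": {
                     "Type": "Task",
                     "Resource": "arn:aws:lambda:<region>:<acct>:function:pull-datadog",
                     "End": True}}},
            ],
            "Next": "RebuildViews",
        },
        # Dependency: views are recreated only after every pull succeeds.
        "RebuildViews": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:<region>:<acct>:function:rebuild-views",
            "End": True,
        },
    },
}

print(json.dumps(STATE_MACHINE, indent=2))
```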
What we’ve gained and what we’re building next for FinOps
My goal was to democratize data access for everyone, at any time. With these agents, I was no longer the bottleneck for analysis or insights, which has freed up at least 50% of my time for high-leverage cost optimization.
Learnings
AI coding assistants let me build internal tools at a pace that wouldn’t have been possible two years ago. And because everything is in-house, iteration is fast; what used to take days now takes less than an hour. During the first couple of months, I was pushing out code changes daily.
The future
Right now, our agents can sense issues and make recommendations, but they still require manual changes from engineers. The next step is to move them toward action, enabling agents to pinpoint required changes, submit them as pull requests, and eventually detect and resolve their own errors.
Want to build your own agents? Amplitude's AI Agents can help you get started—and once they’re running, agent analytics lets you measure how well they’re actually working.

Hac Phan
Head of Customer FDEs
Hac Phan leads FinOps engineering at Amplitude, where he builds internal platforms and AI agents that help teams understand and optimize cloud spend. He now heads Amplitude's Customer Forward Deployed Engineering team.