Amp vs Claude Code for Infra

And a deep dive into the results with Quinn Slack (CEO of Sourcegraph)

Aug 07, 2025

I evaluated Amp Code on an OSS task, and it was great, so I started diving deeper. I scheduled an interview with the CEO of Sourcegraph, Quinn Slack, to get inside information.

Now I’m doing comparisons to see if it just “felt” better or if it really “is” better and where.

Sign up for the talk with Quinn to get internal information on things like model evaluation processes and model routing, how they affect these results, and all the follow-up analysis that I will be doing. Below is a preview of that analysis on one task.


Claude Code vs Amp for Infra

Why Infra First?

Well, it’s not first. I did it on an OSS task first. But then Bryan Bischof said on social media that he believes Amp is horrible at Infra. This seemed like a perfect next test for me for two reasons:

Bryan is top tier. You should never dismiss anything he says out of hand. But his experience differed significantly from mine. I needed to try to understand why.
I hate doing Infra, so having an agent help me with that is really important.

I decided to start with a simple test. It’s not necessarily indicative of a large-scale commercial deployment with auth, infra, sensitive user data, etc., but I wanted a simple, understandable case to see if any differences become evident.

The Task And Prompt

The task had a few steps:

Create a FastAPI todo app with HTMX that is simple but functional
Dockerize the FastAPI app so it can run and deploy anywhere
Deploy to Fly.io, after helping me decide between Railway and Fly.io
Set up CI with GitHub Actions to automate the deployment

See the bottom of the post for the exact prompts used.

The Comparison Table

Let’s jump straight to my raw comparison notes that I created as I was studying the results.

Legend:

✅ Really liked
❌ Really didn’t like
⚠️ Could be improved

My Thoughts About The Results

The Good

Both Claude Code and Amp completed the task! I ended with todo apps hosted and publicly available on fly.io.

The Bad

Differentiator 1 - the health check: There’s not much to say here. Both Claude Code and Amp used a health check for deployment, but Amp’s worked and Claude’s had a critical bug.
Differentiator 2 - the database:
- The ask: I did not specify in my prompt how to store the todo data, only that I wanted the app “simple”.
  - Claude Code’s fully in-memory storage isn’t “wrong”, but it’s certainly not desirable beyond a toy example. But this is a toy example, so 🤷
  - Amp Code, however, did even better. It asked me if I wanted in-memory storage or if I would like a SQLite db for persistent storage. I did want persistent storage, so I appreciated it asking me.
- Downstream Effects: Amp also set up a persistent Fly volume, whereas Claude Code did not. I think this is more a downstream effect of Amp asking me if I wanted a SQLite database than an actual model difference. After all, a persistent volume doesn’t make sense if all the data is in memory!
Differentiator 3 - CI for linting and testing: This comes down to preference. I like that Amp went above and beyond for me. I want agents to follow my instructions very closely when making apps, but be more aggressive about testing and automation. However, this is my personal preference, and I know incredible devs who I bet would hate that it took an extra step beyond the scope of the task.
Differentiator 4 - Taste Decisions: I don’t feel super strongly about these, but my preference tipped toward Amp.
- Amp putting the CSS in a separate CSS file instead of embedding it in the HTML is nicer
- Claude Code checking in CI whether the app already exists is kinda nice
- Amp doing a health check verification in CI after deploy is really nice
- Amp chose a more appropriate memory setting for this task, but the $$ involved in that decision is so negligible I don’t care that much.
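To make the storage decision from Differentiator 2 concrete, here is a minimal sketch of the two options using only the Python standard library. This is not the code either agent generated. The class names, the `todos` table, and its columns are all hypothetical, chosen only to illustrate why the in-memory choice makes a persistent volume pointless while the SQLite choice makes it necessary.

```python
from __future__ import annotations

import sqlite3
from contextlib import closing


class InMemoryStore:
    """Hypothetical store: todos live in a dict and vanish on restart."""

    def __init__(self) -> None:
        self._todos: dict[int, str] = {}
        self._next_id = 1

    def add(self, title: str) -> int:
        todo_id = self._next_id
        self._todos[todo_id] = title
        self._next_id += 1
        return todo_id

    def all(self) -> list[tuple[int, str]]:
        return sorted(self._todos.items())


class SQLiteStore:
    """Hypothetical store: todos live in a SQLite file.

    The data survives a restart only if the file itself survives,
    which is why a persistent volume matters for this option.
    """

    def __init__(self, path: str) -> None:
        self._path = path
        # closing() closes the connection; the inner `conn` commits.
        with closing(sqlite3.connect(self._path)) as conn, conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS todos "
                "(id INTEGER PRIMARY KEY AUTOINCREMENT, title TEXT NOT NULL)"
            )

    def add(self, title: str) -> int:
        with closing(sqlite3.connect(self._path)) as conn, conn:
            cur = conn.execute("INSERT INTO todos (title) VALUES (?)", (title,))
            return int(cur.lastrowid)

    def all(self) -> list[tuple[int, str]]:
        with closing(sqlite3.connect(self._path)) as conn:
            rows = conn.execute("SELECT id, title FROM todos ORDER BY id")
            return [(int(r[0]), str(r[1])) for r in rows]
```

Creating a second `SQLiteStore` on the same path stands in for an app restart: the todos are still there. Creating a second `InMemoryStore` starts empty, which is the behavior you get on every redeploy with in-memory storage.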

The Ugly

Both Claude Code and Amp failed to test Docker locally before deploying. Claude Code did not even try. Amp Code tried, but when the test failed it just kept going without prompting me. This is a major issue to me.
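For reference, the local check I wanted can be sketched as a small smoke test: build the image, run the container, poll a health endpoint, and stop hard if it never comes up. This is my own illustration, not something either agent produced. The image name `todo-app`, port `8000`, and the `/health` path are assumptions, so swap in whatever your app actually uses.

```python
import http.client
import subprocess
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and 4xx/5xx responses land here.
        return False


def wait_for_healthy(
    probe: Callable[[], bool],
    attempts: int = 10,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` up to `attempts` times, sleeping `delay` between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


def smoke_test(image: str = "todo-app", port: int = 8000) -> None:
    """Build and run the container locally; exit non-zero if unhealthy.

    The image name, port, and /health path are assumed, not from the post.
    """
    subprocess.run(["docker", "build", "-t", image, "."], check=True)
    container_id = subprocess.run(
        ["docker", "run", "-d", "--rm", "-p", f"{port}:{port}", image],
        check=True,
        capture_output=True,
        text=True,
    ).stdout.strip()
    try:
        url = f"http://localhost:{port}/health"
        healthy = wait_for_healthy(lambda: http_ok(url))
    finally:
        subprocess.run(["docker", "stop", container_id], check=False)
    if not healthy:
        raise SystemExit("Container never became healthy; stopping before deploy.")
```

Calling `smoke_test()` before deploying gives the behavior I expected from the agents: a failed local check halts the process and surfaces the problem instead of quietly moving on to the deploy step.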

If this analysis is helpful, sign up for our course or check out the free talks we organize, where we give you the most relevant and useful information, curated to help you use the right tools in the right way without spending the thousands of hours studying these tools that we have.

Prompts

Initial Prompt

Create a simple todo app that I can run locally using fastapi in htmx. It should be very simple as it’s an example app. Once done I will want to dockerize it, deploy to either railway or fly.io, an then create CI with github actions to automate the deployment. Make a plan for me to review that has all the neccesary steps and decisions, and present that plan to me to review before doing any work.

From there, I followed through to completion as best I could, and it varied. For example, I gave Amp information about persistent storage because it asked for that information. I did not give Claude Code information about persistent storage because it did not ask!

Elite AI Assisted Coding
