Remember when AI (LLMs) couldn’t decide how many fingers go on the average human hand? That still seems pretty recent. But my, how things have changed. AI is now immensely more capable, and not just at counting to five.

To get a sense of where AI is now, I set out on an experiment. I want to better track my family’s spending, so I gave myself two days to create a web app, with AI’s help. I wasn’t sure how far I’d get, or the quality of the output, but I was curious to find out.

Process

To start, I wrote down the initial features and rough workflow I wanted. I then put my requirements into Claude Code and used the /plan mode to walk through core features and tech considerations.

From there, I iterated one feature set at a time. After creating the transaction table, we added CSV importing, and so on. Claude would create the design and code, I’d review the work, and together we’d iterate on refinements.

Everything proceeded very smoothly. I rarely had to roll back changes. It became almost addictive to think up a useful feature, and then have it ready to use in minutes.

So what did Claude Code (with my guidance) create? Here’s a quick tour of the results:

Of course, this is far from a “finished” app and not something to release publicly. It has no user accounts, the math is sometimes questionable, and other gremlins are certainly lurking in the shadows.

But for an app to install on my local network and use for myself? It is almost perfect. Instead of paying for another SaaS subscription, I have a nearly fully functional app that I own and control. Plus, my data isn’t sitting in some remote database being used for who knows what.

What worked

Using Claude Code planning mode

Before implementing a bunch of new code, I’d create a short planning document that covered the features I wanted, how they should work, and any forward-thinking code or architecture considerations. I always ended the document with the instructions: “Ask me questions to clarify product or technical requirements, engineering principles, and hard constraints.” This helped both Claude and me nail down what I wanted before getting into code, saving time and tokens.

Being specific on the tech stack

I’m most familiar with Laravel, PHP, and MySQL, so I included these in my initial requirements. I also wanted to keep my codebase simple, so I told Claude to stick with vanilla JS and CSS (no React, Tailwind, etc.). This kept things light and meant I could easily jump into the code if needed. In the end, I barely touched the code, though I did feel guilty spending tokens asking Claude to change single lines of CSS now and then.

Working iteratively

In prior tests, I’ve tried “one-shot” prompts to create complete apps, but found they rarely work out well. When an LLM (or human) fills in ambiguity without thinking, weird things tend to happen. Working iteratively meant I could think about the next steps and identify potential opportunities and problems. This reduced rework and saved time.

What I’d do differently

Using a design system

One significant negative of creating from scratch is, well, everything needs to be created. For simple UI components, like buttons, this isn’t a big deal. However, with more complex components, things get messy quickly. This is exacerbated by Claude not being great at small design or interaction details.

For example, my app uses a complex multi-select dropdown in about a dozen places. I often ran into bugs where Claude hadn’t considered an edge case (what happens if the dropdown is in a modal, or if it opens downward past the edge of the screen?). I ran into similar challenges with tables, modals, and other common components.
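
To make the screen-edge case concrete, here is a minimal sketch of the kind of placement check a mature component library would already include. It is written in Python purely to show the logic (my app itself uses vanilla JS), and the function name, parameters, and 8-pixel margin are all hypothetical, not taken from my code.

```python
def choose_dropdown_direction(trigger_top, trigger_bottom, menu_height,
                              viewport_height, margin=8):
    """Decide whether a dropdown menu should open 'down' or 'up'.

    All values are pixels measured from the top of the viewport.
    The names and the default margin are illustrative assumptions.
    """
    space_below = viewport_height - trigger_bottom - margin
    space_above = trigger_top - margin

    # Prefer opening downward when the whole menu fits below the trigger.
    if menu_height <= space_below:
        return "down"
    # Otherwise flip upward if the menu fits above the trigger.
    if menu_height <= space_above:
        return "up"
    # Neither side fits: use the roomier side. The caller would then
    # need to cap the menu height and let the list scroll.
    return "down" if space_below >= space_above else "up"


# A trigger near the bottom of an 800px-tall viewport flips upward.
print(choose_dropdown_direction(700, 730, 200, 800))
```

The modal case is a separate problem (clipping by a scrolling container rather than by the viewport), which this sketch does not attempt to handle.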

I spent many tokens fine-tuning these component-level issues. If I had used a good design system, this thinking would have been already baked-in. Plus, with better design defaults, the overall look-and-feel would probably be better.

Prompt for pattern-thinking

Claude Code seems biased toward writing more code rather than broader systems thinking. I found this can cause messy architecture and duplicated code.

For example, in my spending app, you can add a payee in three different locations: in the payee section, when importing transactions, or when editing a transaction. Claude Code created a separate implementation, with its own code, for each location. They looked quite similar, so I was confused when changes to one weren’t reflected in the others.

While it is easy to prompt Claude to refactor into shared components, creating shared components earlier can save time and prevent rework.
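
As a rough illustration of what “shared” means here, the sketch below routes all three entry points through one payee-creation function, so a fix or rule change lands in a single place. It is Python rather than the PHP my app actually uses, and every name in it (PayeeStore, get_or_create, the three entry-point functions, the sample payee) is made up for the example.

```python
class PayeeStore:
    """Single owner of payee creation rules (hypothetical sketch)."""

    def __init__(self):
        self._payees = {}  # normalized key -> display name

    def get_or_create(self, name):
        # Collapse runs of whitespace and trim the ends.
        cleaned = " ".join(name.split())
        if not cleaned:
            raise ValueError("Payee name cannot be empty")
        # Case-insensitive key so "corner grocery" matches "Corner Grocery".
        key = cleaned.casefold()
        # Keep the first spelling seen; later variants map onto it.
        return self._payees.setdefault(key, cleaned)

    def count(self):
        return len(self._payees)


# Three entry points, one implementation behind them.
def add_from_payee_section(store, name):
    return store.get_or_create(name)


def add_from_import_row(store, row):
    return store.get_or_create(row["payee"])


def add_from_transaction_edit(store, typed_name):
    return store.get_or_create(typed_name)


store = PayeeStore()
add_from_payee_section(store, "Corner Grocery")
add_from_import_row(store, {"payee": "  corner   grocery "})
add_from_transaction_edit(store, "CORNER GROCERY")
print(store.count())
```

Asking for this kind of shared function in the planning document, before the second and third entry points exist, is the earlier prompting I have in mind.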

Reflections

There’s a lot of hype around these tools, and for good reason. Many are predicting the end of developers, a SaaS apocalypse, or worse. I think we’re still some way from those predictions coming true.

I have more than a decade’s worth of experience creating applications, so it was easy for me to prompt for the features, flows, and architecture I wanted. Put these tools in the hands of someone with less experience and they’ll struggle. Tools are being created to help non-technical people build apps like this, but product creation experience will be difficult to design around.

If I could tell you with confidence where we’ll end up, I’d be investing and not writing this post. It is hard to imagine these tools not causing an enormous disruption. The latest layoffs blamed on AI certainly overstate its current effectiveness, but the fundamental changes will come. Much depends on how capable the models become, how quickly, and what their true, non-subsidized costs will be. We’ll see. It will be fascinating to watch, at the very least.
