Published: 3/17/2026

Research Session: Lessons from AI-Maxxing

I gave myself a weekend to build something real with AI. Here's what I learned.

This is the first post in a new monthly series of Research Sessions at Citizen Codex. Once a month, a member of our team will take on a project, question, idea or curiosity, push into unfamiliar territory with whatever methods seem relevant, and bring back what they found to share with the team. The goal isn't to build refined output, production applications or publishable journal articles but to cultivate an inclusive research culture. This is my attempt at that for February.

Just like almost every other organization in 2026, we've been having a lot of conversations at Citizen Codex about AI. How do we realize true efficiency improvements without giving up creative and intellectual control? What guardrails do we need in each of our verticals to ensure that the current state of AI maps onto them effectively? And how do we continue to deliver value for clients using AI and help them navigate these exact questions?

These are meaty questions with rapidly changing answers, much more than this session could tackle alone. But I wanted to at least get started exploring in this direction. My brief to myself was simple: spend a weekend building something real, use AI as much as possible to speed up the process, but be thoughtful about it and return with honest reflections.

The Idea

I've had this idea sitting around for a while. Most of us have email archives going back ten, fifteen, maybe twenty years. Threads that document moves, friendships, career transitions, travel and conversations we've completely forgotten. Email is this incredible repository of lived experience that just sits there, archived.

What if you could surface those stories the way Google Photos surfaces an old photo? You connect your email, and instead of you having to deliberately recall or search for something, the app finds the interesting threads and recounts a notable moment from your life.

The app works in two stages. First, it scans your email archive using heuristics like thread depth, number of participants and recency to identify candidate stories. No AI involved here, just concrete metrics.

Step 1: Candidate Generation
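The heuristic stage can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the `Thread` shape, the weights, and the cutoffs are all assumptions, since the post only names thread depth, participant count, and recency as signals.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical thread summary; only the three signals below come from the post.
@dataclass
class Thread:
    subject: str
    message_count: int      # thread depth
    participants: int       # number of distinct people on the thread
    last_message: datetime  # used for the recency signal

def candidate_score(thread: Thread, now: datetime) -> float:
    """Score a thread with plain heuristics -- no AI involved.

    Deeper threads and more participants suggest a richer story, and older
    threads score higher because forgotten moments are the point. The
    weights and normalization caps are illustrative, not from the app.
    """
    depth = min(thread.message_count / 20, 1.0)
    people = min(thread.participants / 5, 1.0)
    years_old = (now - thread.last_message).days / 365
    age = min(years_old / 10, 1.0)
    return 0.4 * depth + 0.3 * people + 0.3 * age

def top_candidates(threads: list[Thread], now: datetime, k: int = 5) -> list[Thread]:
    """Return the k highest-scoring candidate stories."""
    return sorted(threads, key=lambda t: candidate_score(t, now), reverse=True)[:k]
```

The appeal of this stage being purely heuristic is that it's cheap and deterministic: the entire archive can be scanned without a single model call.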

Then, when you click into one of those candidates, an agentic loop kicks off: the agent has access to tools that let it read the email body, explore other threads with the same people, and find conceptually similar conversations. It runs up to 20 research steps, streaming live updates as it goes, and produces a narrative account of that moment in your life.

Step 2: Research and Narrative Creation
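A bounded tool-using loop like the one described above might look like this in outline. The tool names echo the capabilities the post mentions (read a thread, find threads with the same people, find similar conversations), but the in-memory mailbox, the `plan` function standing in for the model call, and the update callback standing in for the streaming UI are all stand-ins.

```python
from typing import Callable

MAX_STEPS = 20  # the post caps the loop at 20 research steps

def make_tools(mailbox: dict[str, str]) -> dict[str, Callable[[str], str]]:
    """Toy versions of the agent's tools, backed by an in-memory mailbox."""
    return {
        "read_thread": lambda tid: mailbox.get(tid, ""),
        "threads_with_people": lambda person: ",".join(
            tid for tid, body in mailbox.items() if person in body),
        "similar_threads": lambda query: ",".join(
            tid for tid, body in mailbox.items()
            if any(word in body for word in query.split())),
    }

def research_loop(plan, tools, on_update) -> str:
    """Run the agentic loop for at most MAX_STEPS research steps.

    `plan` stands in for the model: given the notes gathered so far, it
    returns either ("call", tool_name, argument) or ("finish", narrative).
    `on_update` receives one status line per step, mimicking live streaming.
    """
    notes = []
    for step in range(MAX_STEPS):
        action = plan(notes)
        if action[0] == "finish":
            return action[1]
        _, tool, arg = action
        result = tools[tool](arg)
        notes.append((tool, arg, result))
        on_update(f"step {step + 1}: {tool}({arg!r})")
    # Out of budget: fall back to whatever has been gathered so far.
    return f"Partial narrative from {len(notes)} research steps."
```

The hard cap matters: without it, an agent chasing tangentially related threads can burn arbitrary time and tokens before producing anything.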

The Learnings

I'll get the obvious part out of the way: yes, I built a working application in a weekend, and no, that wouldn't have been possible two years ago. But that framing flattens what was actually a pretty mixed experience with nuanced takeaways.

  • Boilerplate feels like a genuine win. Auth flows, API integrations, and standard patterns are just so much faster to implement. I didn't have to learn Google OAuth; I just got it working.
  • Unguided core logic is a major risk area. Without intentional structuring, new features inevitably introduced confusing patterns and duplication. That debt accumulated fast, fast enough that I'd need to rewrite almost all of the core components before I'd feel comfortable putting my name on it.
  • Design needs a strong creative vision and system first. The absence of both shows here, with the design feeling uninspired and formulaic at best. Without design tokens and components defined upfront, styles get applied ad hoc and inconsistently. Custom animation in particular is challenging to implement and needs to be hand-coded.
  • Model selection is empirical, not obvious. While coding agents have a limited set of mostly interchangeable options, the landscape is completely different when considering models suited to a custom product. The same prompt sent to different models produces wildly different agentic behavior.
  • Review and understanding are the hidden cost. Building a mental model of the application, the organization of the code, the logic of the design system, and the user journey is still fully bottlenecked by our human ability to process, manage, and make decisions on information.
  • The first version teaches you things. Prototyping is extremely valuable, and if the purpose of the first version is to validate an idea, you might as well build it as fast as possible so you can carry the learnings into the second version.

The reactions from the team reflected something I think a lot of people feel but don't always say out loud: ambivalence. Some folks are energized by the productivity gains; others are concerned about creative ownership, client perception, and whether moving fast with AI means quietly trading away the craft that defines our work. It's a very real and difficult-to-resolve tension.

What the conversation made clear is that we don't yet have a shared sense of when AI belongs in our process and when it doesn't, and that building that understanding together, grounded in real experience rather than hype or anxiety, is what matters most. We're working towards a point of view on AI use that is actually ours, earned through practice rather than borrowed from the discourse.

Author
Arjun Kakkar
Director of Engineering
