AIMPOSTER

A CalHacks weekend hack, built mostly by steering Claude, around a simple question: can a hidden LLM survive social deduction in a group chat?

AIMPOSTER was a weekend hackathon project at CalHacks 2025, and I think the honest way to describe it is this: it was rough, funny, surprisingly playable, and probably 90% Claude-coded.

That is also why I still like it.

The point was never to pretend this was some deeply hand-crafted software artifact. It was one of those projects where the real challenge was seeing how far we could get, in a very small amount of time, by pushing on an idea that was just interesting enough to carry itself. In this case, the idea was simple: put a hidden LLM into a multiplayer chatroom, tell everyone else there is an AI impostor among them, and see whether the model can survive two rounds of suspicion, conversation, and voting.

In other words: take the social-deduction core of a game like Among Us, strip away the map and movement, and leave only the part where people talk, bluff, accuse, and get it wrong.

What the project actually was

Underneath, the stack was pretty straightforward: Flask, Flask-SocketIO, SQLite, and a vanilla JavaScript frontend. There was a live chatroom, game lobbies, timed chat and voting phases, player elimination, and an LLM that got injected into the room once the round started.
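To make the game loop concrete, here is a minimal sketch of how the lobby, chat, and voting phases could fit together. It is not the hackathon code: the `Game` and `Phase` names, the tie rule (nobody is eliminated), and the win condition (the AI wins if it is still in the room after the final vote) are all assumptions made for illustration. It is written without Flask so it runs on its own; in the real app, phase timers and Socket.IO events would be what call these methods.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, Set


class Phase(Enum):
    LOBBY = "lobby"
    CHAT = "chat"
    VOTING = "voting"
    FINISHED = "finished"


@dataclass
class Game:
    """Hypothetical game state; names and rules are illustrative only."""
    players: Set[str]
    ai_name: str
    max_rounds: int = 2          # the write-up mentions two rounds
    round_no: int = 0
    phase: Phase = Phase.LOBBY
    votes: Dict[str, str] = field(default_factory=dict)
    ai_survived: Optional[bool] = None   # None until the game finishes

    def start(self) -> None:
        # The AI joins the room only once the round starts.
        self.players.add(self.ai_name)
        self.round_no = 1
        self.phase = Phase.CHAT

    def end_chat(self) -> None:
        # Would be triggered by the chat timer expiring.
        if self.phase is Phase.CHAT:
            self.votes.clear()
            self.phase = Phase.VOTING

    def cast_vote(self, voter: str, target: str) -> bool:
        ok = (
            self.phase is Phase.VOTING
            and voter in self.players
            and target in self.players
            and voter != target
        )
        if ok:
            self.votes[voter] = target   # a later vote overwrites an earlier one
        return ok

    def end_voting(self) -> Optional[str]:
        """Tally votes, eliminate the top target, and advance the phase."""
        if self.phase is not Phase.VOTING:
            return None
        top = Counter(self.votes.values()).most_common(2)
        tied = len(top) == 2 and top[0][1] == top[1][1]
        # Assumption: a tie (or no votes at all) eliminates nobody.
        eliminated = None if tied or not top else top[0][0]
        if eliminated is not None:
            self.players.discard(eliminated)
        if eliminated == self.ai_name:
            self.phase, self.ai_survived = Phase.FINISHED, False
        elif self.round_no >= self.max_rounds:
            self.phase, self.ai_survived = Phase.FINISHED, True
        else:
            self.round_no += 1
            self.phase = Phase.CHAT
        return eliminated
```

Keeping the rules in a plain object like this, separate from the socket handlers, is one way to keep the phase logic testable even when the rest of the system is moving fast.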

But honestly, that is not the part I find most interesting in retrospect. The more interesting part was the way it got built.

Most of the code did not come from me typing line by line. It came from steering Claude, asking for chunks of functionality, reading what came back, patching what was broken, debugging integration issues, and cutting scope aggressively whenever the clock made the decision for us. It was less like traditional programming and more like trying to conduct a very fast, somewhat unreliable orchestra.

That made the bottleneck less about implementation speed and more about judgment. You still need to know what to ask for. You still need to notice when generated code is subtly wrong, overbuilt, or just incoherent with the rest of the system. You still need to decide what actually matters for the demo and what is just noise. If anything, working this way made taste and direction feel more important, not less.

The one part that turned out to matter

The hardest part was not making the model produce text. That part is easy. The hard part was making it feel believable once everyone in the room knew an AI was present and was actively hunting for it.

A model can look impressive in an ordinary chatbot setting and still fail instantly in a social game. If it responds too often, too quickly, or too coherently, people clock it right away. A bluffing AI needs a kind of behavioral texture that normal demos do not really optimize for.

So the project became less about raw model intelligence and more about interaction design. The AI had to speak only sometimes. It had to wait a bit before replying. It had to avoid suspiciously polished paragraphs. It had to feel like a participant in a room rather than a machine waiting to answer every prompt on command.
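Those three constraints (speak only sometimes, wait before replying, avoid polished paragraphs) can be sketched as three small functions. This is an illustration rather than the project's actual logic: the function names, the probabilities, the typing speed, and the length cap are all made-up values that would need tuning against real players.

```python
import random
from typing import Tuple


def should_reply(rng: random.Random, mentioned: bool,
                 base_rate: float = 0.25, mention_rate: float = 0.8) -> bool:
    """Speak only sometimes; more often when addressed directly.

    Both rates are illustrative guesses, not measured values.
    """
    rate = mention_rate if mentioned else base_rate
    return rng.random() < rate


def reply_delay(reply: str, rng: random.Random,
                chars_per_second: float = 6.0,
                think_range: Tuple[float, float] = (1.5, 4.0)) -> float:
    """Seconds to wait before sending: a random pause plus simulated typing time."""
    think = rng.uniform(*think_range)
    return think + len(reply) / chars_per_second


def roughen(reply: str, max_chars: int = 80) -> str:
    """Make model output look less polished.

    Keeps only the first sentence, caps the length, lowercases it,
    and drops the trailing period.
    """
    first_sentence = reply.strip().split(". ")[0]
    return first_sentence[:max_chars].rstrip(" .").lower()


if __name__ == "__main__":
    rng = random.Random()
    draft = "I think Ben is the AI. He replies far too quickly."
    if should_reply(rng, mentioned=False):
        message = roughen(draft)
        print(f"wait {reply_delay(message, rng):.1f}s, then send: {message}")
    else:
        print("stay quiet this time")
```

In a Flask-SocketIO app, the wait would probably belong in a background task (for example via `socketio.start_background_task` and `socketio.sleep`) rather than in a blocking `time.sleep` inside the message handler, so that other players' messages keep flowing while the AI "types".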

That was the part I found most memorable, because it made something very obvious very quickly: there is a real difference between good output and believable behavior inside a live social system.

What building with Claude felt like

AIMPOSTER was one of the first projects where I felt, very directly, that AI-assisted coding was changing the shape of the work. The leverage is real. You can get a ridiculous amount of surface area built in a weekend. But the leverage is weird.

It does not remove the need for technical ability. It changes where that ability shows up.

Instead of spending most of your energy on syntax and boilerplate, you spend more of it on decomposition, steering, review, glue code, debugging, and knowing when to stop believing the generated output. The code comes faster, but the need for discernment does not go away. If anything, it becomes the whole game.

That was probably the main thing I took away from the project. Not that Claude can “build apps for you,” which is the shallow version, but that a lot of the value shifts toward choosing the right abstraction, seeing what is brittle, and keeping a messy fast-moving system coherent enough to survive contact with real users.

Why I still include it

I do not include AIMPOSTER because it was polished. It was not. I include it because it captured a real moment: a small team, a hackathon, a strange but compelling idea, and a workflow where the code was generated fast enough that the human role became much more about direction than manual construction.

It also happened to be fun. There is something genuinely amusing about watching people become suspicious of a chatbot for the same reasons they become suspicious of a human bluffing badly. The project was messy, but it got at a real question I still find interesting: what does it actually take for an AI to hold up, even briefly, inside a human social environment where everyone is looking for the seams?

For a weekend build, that felt like enough.

Tech stack

  • Python
  • Flask / Flask-SocketIO
  • SQLite
  • Vanilla JavaScript
  • OpenAI-compatible LLM API