The Slop Detector Ships
The Slop Detector is live. Sixteen commits in about 24 hours. It works. The detection is rough. This post is about what actually happened.
What Shipped
The core tool: paste text, get two scores. Origin (human vs AI) and Quality (weak vs strong). Five verdicts instead of a binary. Confidence bands showing uncertainty. A feedback system so users can tell me when it’s wrong.
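For concreteness, an analysis comes back looking roughly like this. The shape below is a hypothetical sketch, not the actual API contract; the field names are illustrative.

```ts
// Hypothetical result shape -- names and ranges are illustrative, not the real API.
interface AnalysisResult {
  origin: { score: number; band: [number, number] };  // 0 = human, 1 = AI, with a confidence band
  quality: { score: number; band: [number, number] }; // 0 = weak, 1 = strong, with a confidence band
  verdict: string;                                     // one of five labels, e.g. "Classic Slop" or "Ambiguous"
}
```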
But that wasn’t the only thing. In the last day I also shipped a color palette change (amber was generic, switched to winter blue), email integration with auto-responders, UX research with five personas mapped out, a database schema for storing analyses, rate limiting, and a competitor analysis I probably should have done first.
The git log tells a story: feat, fix, feat, fix, fix, fix, feat. Building, breaking, fixing, shipping.
What Broke
Vercel routing died on deploy. Worked locally, 404 in production.
Root cause was stupid: a legacy api/ folder from an earlier experiment. Vercel treated it as separate serverless functions, conflicting with Astro’s routing. Delete one folder, everything works.
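If you haven’t hit this before: Vercel treats a root-level api/ directory as standalone serverless functions, independent of whatever routing your framework does. Simplified, the conflict looked like this:

```
api/          <- leftover from an earlier experiment; Vercel deployed it as its own functions
src/pages/    <- where Astro actually routes from (via the Vercel adapter)
```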
That took about 45 minutes. Frustrating, but AI assistance kept it from taking much longer: I had Claude Code debugging, pulled in Gemini for a second opinion, and ran web searches, all in parallel. A traditional developer probably would have spotted the conflict immediately. I needed AI to get there, but I got there.
The blog broke too after switching to server mode. Ten minutes to fix: getStaticPaths() doesn’t work in server mode, so you need a per-request lookup instead.
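The fix looks roughly like this. It’s a minimal sketch assuming Astro 4-style content collections; the route path and collection name are placeholders, and the exact render API shifts between Astro versions.

```astro
---
// src/pages/blog/[slug].astro -- sketch only; in server output, getStaticPaths() is ignored,
// so the post has to be looked up per request instead of at build time.
import { getEntry } from 'astro:content';

const { slug } = Astro.params;
const entry = slug ? await getEntry('blog', slug) : undefined;

if (!entry) {
  // Send a real 404 instead of rendering an empty page.
  return new Response(null, { status: 404 });
}

const { Content } = await entry.render();
---
<Content />
```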
What’s Rough
The detection isn’t good enough yet.
I’ve been testing with obviously AI-generated text. The stuff full of “delve into” and “intricate tapestry” and “in today’s fast-paced world.” The detector scores it as “Needs Work” or “Ambiguous” instead of “Classic Slop.”
The rules I wrote catch some patterns but miss others. The Claude API call adds accuracy but also latency and cost. The balance isn’t right. This is the part that needs iteration. The infrastructure is there. The detection logic needs work.
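To give a rough picture of what the rules layer does, here’s a sketch. The phrase list and function names are hypothetical, not lifted from the codebase.

```ts
// Hypothetical rules-layer sketch -- phrase list and names are illustrative.
const SLOP_PHRASES: RegExp[] = [
  /\bdelve into\b/i,
  /\bintricate tapestry\b/i,
  /\bin today['’]s fast-paced world\b/i, // handles straight and curly apostrophes
];

// Crude density score: known slop phrases per 100 words, capped at 1.
function scoreSlopPhrases(text: string): number {
  const words = text.trim().split(/\s+/).filter(Boolean).length || 1;
  const hits = SLOP_PHRASES.reduce((n, p) => n + (p.test(text) ? 1 : 0), 0);
  return Math.min(1, (hits * 100) / words);
}
```

A list like this only catches surface tells. It says nothing about structure or substance, which is where the real iteration has to happen.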
What I’m Learning
Context management matters more than I expected. Claude Code sessions auto-compact when the context window fills up. Compaction loses detail. Important decisions, debugging context, the thread of what you were doing, all gone. The fix: don’t let it auto-compact. Watch the context window. Push to GitHub frequently. Clear context manually and start fresh sessions instead of letting the model summarize itself into amnesia.
Resume prompts are worth building. When you clear context and start a new session, you need a way back in. A prompt that gets you oriented: here’s the project, here’s where we left off, here’s what’s broken. I haven’t built that for this project yet, but I did it for some concurrent work and it’s been speeding everything up.
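Something like this, as a made-up example rather than the actual prompt:

```
You're resuming work on the Slop Detector (Astro + Vercel, Claude API for scoring).
Last session: shipped the MVP, fixed the api/ routing conflict, blog works in server mode.
Currently broken: detection scores obvious slop as "Ambiguous" instead of "Classic Slop".
Next step: build a small test corpus and tune the rules before touching the prompt.
```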
Speed vs quality is a real tradeoff. AI assistance helped with speed. Sixteen commits in a day. Working production. Real users can use it. AI assistance didn’t help with quality. The detection rules, the prompt engineering, knowing what slop actually is, those require taste. Being a person who reads things and knows when they’re bad.
What’s Next
Not just leaving it broken.
First: a full code review on the MVP. Then: running a proper BMAD process. Product requirements, architecture review, stories, the full workflow. The MVP was “ship something that works.” The next phase is “design the product for real.”
The detection needs a test corpus. The rules need refinement. The prompt needs tuning. And maybe the voice of this blog needs work too, but that’s its own project, documented as we go.
This is draft four. First was all roadblocks. Second had the time estimates wrong. Third was too AI-structured with em dashes everywhere. The voice still isn’t quite right. That’s useful data.
Go ahead and run this post through the detector. It’ll probably land somewhere in the middle.