June 16, 2026

A fleet of agents you can't read is a fleet you can't trust

Another viral thread says the future is thousands of Claude agents working overnight while you sleep. I built one on my own site this morning and it found a real problem in 10 seconds. Here's why one loop you can read beats a fleet you can't, and the three costs the pitch leaves out.

A practice owner sent me another one this week. The pitch: the person who built Claude Code hasn't written code all year, runs it from his phone, and has a few thousand agents working overnight while he sleeps. The future, the thread says, is loops. Schedule the work, stack up dozens of them, close the laptop, wake up to a team's worth of output.

The mechanism is real. I schedule agents too. The part to watch is the number. The thread sells plurality: thousands of agents, a fleet, the volume of a team from one person. The skill it's actually describing is narrower and quieter, building one loop you can trust. And trust, while you're asleep, comes down to one question. Can you read what it did?

I built one this morning

So I did, on this site, while writing this. One agent, one job: every morning, tell me the state of my open pull requests (the code changes waiting to go live). Passing tests or not, ready to merge or conflicted, how old, what to do about each. Read-only. It reports and recommends. It never merges, pushes, or touches a line.

First run, it found something I'd half-forgotten. 5 pull requests sitting open for a week, all green, all waiting on me. And the live site was running 2 commits behind what I'd already shipped to the main branch. Nothing broken, but what I thought was live wasn't quite what visitors were getting. A real gap, surfaced in the time it takes to read 6 lines. One loop. One job I could check at a glance.

The cost nobody puts in the thread

Here's the first thing the pitch skips. It never says what this costs.

A loop that "watches your CI every few minutes" runs every few minutes whether or not anything changed. Most of those runs find nothing and still cost a full run. Now picture a fleet of them, all night. You're paying, over and over, mostly to confirm that nothing happened.

My digest runs once, in the morning, because my pull requests don't change at 3am. That's the actual craft in scheduling: match how often a loop runs to how fast the thing it watches really moves. A fast feed gets a tight loop. A slow one gets a nightly pass. Run everything every minute and the only number that grows reliably is the bill.

A loop with no memory accumulates

The thread praises one loop as a model of a good one: find every function over 50 lines and open an issue for each. Tight, bounded, clear. It sounds right.

Run it on a schedule and watch. Night one, it opens the issues. Night two, it opens them again, because it has no memory of night one. By Friday your tracker is landfill and you trust none of it.

The fix is dull, which is why the thread skips it. Each run has to either regenerate one thing fresh or remember what it already did. My digest overwrites a single file every morning. Nothing piles up, because tomorrow's report replaces today's instead of stacking on it. Leave a loop running without that and you don't get more output. You get more noise to clean.

"While you sleep" has fine print

The phrase doing the most work in every one of these threads is "while you sleep." It pays to know what that actually buys you.

The loop I set up runs while my app is open, or the next time I open it. That covers a lot. It does not cover the laptop being shut on the counter all night. The truly-closed-laptop version is a different setup, a server running it for you, with its own configuration and its own bill.

Both are useful. They're just different promises. "It runs at 3am with everything off" and "it runs next time I sit down" are not the same sentence, and the pitch blurs them on purpose. Know which one you've got before you lean on it.

Why a practice should want one, not a fleet

For me, a bad digest is a boring morning. The stakes are mine and they're small.

For a practice, the loops worth wanting touch the things that matter: bookings, intake, billing, the messages patients send you. A fleet of those running overnight, unread, is a stack of decisions about real people that nobody looked at. The morning something goes sideways, "I had 30 agents running" is not an answer you want to give a patient.

One loop, one narrow job, an output you can read in 10 seconds before coffee. That's the version you can stand behind when someone asks what it did and why. Build that one. Live with it for a week. Add the second only once the first has earned it.

The skill that actually pays

The thread says the new skill is spotting which parts of your work should run without you. Close. The skill that pays is building one thing you can leave running and still trust at a glance, then doing it again. Two pieces keep that honest, and I've written both: how to make a single loop fail loud instead of lying to you quietly, and how to read the thread that sold you the fleet in the first place.

A thousand agents you can't read is risk you can't see. Start with one you can.

Shaun