When the limits hit — eight hours with Pi and qwen3.6

Sunday, May 10, 2026

Saturday night I hit my weekly Claude Code Max limit. Reset wasn't until 7am Sunday. About eight hours offline.

I had Pi and qwen3.6 running locally on the Mac. Dropped in with no expectation it'd carry the work. It mostly did.

Slow — noticeably slow. But the output was pretty good. The kind of code I'd accept on review. Quality was there. Speed was the tax.

A few things from the gap:

Two terminals locked it up. I tried running two Pi sessions in parallel. Within a few minutes, both stalled. One terminal at a time worked: qwen3.6 took the tasks one by one, slowly, and finished them.
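A rough sketch of how I could enforce the one-session rule instead of just remembering it. This is my own guard, not a feature of Pi or qwen3.6; the lock path and the names single_session and SessionBusy are made up for illustration.

```python
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock location; any path both terminals can see would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "local-model.lock")


class SessionBusy(RuntimeError):
    """Raised when another session already holds the lock."""


@contextmanager
def single_session(lock_path=LOCK_PATH):
    """Allow one local-model session at a time, using an exclusive lock file."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise SessionBusy(f"another session holds {lock_path}") from None
    try:
        # Record the owner so a stale lock can be traced by hand.
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

Wrapping whatever launches the session in "with single_session():" means a second terminal fails fast instead of stalling both. If the process is killed hard, the lock file stays behind and has to be deleted manually.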

Small chunks beat big asks. When I tried to one-shot a larger refactor, the model wandered. When I broke it into spec → step → step, it executed each piece cleanly. Being disciplined about what to build matters more with a local model than with a frontier model that can paper over a loose spec.

The shape of a workflow that might actually work: Claude Code (or any frontier model) for specs and requirements. Pi + qwen3.6 for execution. Spec-driven dev with a frontier model upstream and a local model downstream. The local model doesn't have to be smart — it has to be obedient to a clear spec.
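Here is a minimal sketch of that split, assuming the spec arrives as a goal plus an ordered list of small steps. Everything in it is hypothetical: Spec, Step, build_prompt and run_spec are names I made up, and execute stands in for however you hand a prompt to the local model. It does not call Pi or qwen3.6 directly.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    title: str
    instructions: str
    done: bool = False


@dataclass
class Spec:
    goal: str  # written upstream, e.g. with a frontier model
    steps: List[Step] = field(default_factory=list)


def build_prompt(spec: Spec, step: Step) -> str:
    """Give the local model the overall goal plus exactly one step."""
    return (
        f"Goal: {spec.goal}\n"
        f"Step: {step.title}\n"
        f"{step.instructions}\n"
        "Do only this step."
    )


def run_spec(spec: Spec, execute: Callable[[str], bool]) -> int:
    """Run steps in order, one at a time; stop at the first failure.

    `execute` is a placeholder: it receives a prompt and returns True
    if the step succeeded. Returns the number of completed steps.
    """
    completed = 0
    for step in spec.steps:
        if step.done:
            completed += 1  # finished in an earlier run, skip it
            continue
        if not execute(build_prompt(spec, step)):
            break  # fix the step or the spec, then rerun
        step.done = True
        completed += 1
    return completed
```

Because finished steps are marked done, rerunning after a failure picks up at the step that broke rather than starting over, which matters when every step is slow.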

This isn't a primary focus of mine. I'm not racing to replace Claude Code with a local stack. The math doesn't favor it yet — $200 a month for Max is cheap relative to the time it'd take to build a parity setup. But after Sunday morning, I have a backup plan, tested.

If you're really good at staying focused and breaking projects into small pieces, you can be effective with Pi and qwen3.6 running locally today. Most people aren't, and most workflows aren't broken down that finely. Most of what you pay for with the bigger model is forgiveness for loose specs.

Could a local stack be the default by end of year? Maybe. Local models keep getting better. The hardware can run them. The patterns for breaking work into chunks the model can hold are figure-out-able.

We shall see.