Smith: the coding agent learning to own my SDLC

Thursday, May 28, 2026

PROJECT LIFECYCLE · CURRENT: BUILD

Build — the worker daemon and admin shell are the active surface. Smith advises today; he ships tomorrow.

The goal of this project is not "AI that helps me code." That bar is too low. The goal is an agent that owns the development lifecycle — discovery, design, implementation, testing, review, ship — with as little of me in the loop as I can responsibly remove. Mastery is the prerequisite. Automation is the point.

This agent is called Smith. Blacksmith. He forges — designs, hammers, refactors, ships. The name is the discipline.

What Smith is today

Smith is the engineering domain of my personal operating system. He has ten operating modes, a couple dozen runnable procedures, sixteen declarative reasoning modules, and four non-negotiables he will not violate no matter how the conversation goes.

The non-negotiables are the spine:

SPINE — ALWAYS

TDD — tests first; no production code without a failing test.
Clean code — names reveal intent; functions do one thing; Boy Scout rule.
DB logic in the database — stored procs are the data-layer API; app code calls them.
DSPy-first — every LLM call goes through a typed module signature, never a freeform string.

WHAT THIS BUYS

An agent that follows the same rules I would enforce in code review.
Diffs that are reviewable because the contracts are explicit.
Reasoning that compiles, evaluates, and improves over time.
An agent I can trust to do my job, not impersonate it.

The modes are the verbs — Audit, ProductDiscovery, Requirements, Architect, Review, TestPlan, StackAdvise, TechRadar, Postmortem, ObsAudit. Each one is bound to a playbook that Smith reads before he acts. Acting without reading the playbook is a self-finding in the next code review. The discipline is recursive.
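One way to make the playbook binding enforceable rather than aspirational is to have the mode runner refuse to act until the playbook has been read. The sketch below is a minimal illustration under stated assumptions, not Smith's actual implementation: the `Mode` enum mirrors the verbs listed above, while `PlaybookRegistry`, `PlaybookNotRead`, and `run_mode` are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    """The operating modes named in the post."""
    AUDIT = "Audit"
    PRODUCT_DISCOVERY = "ProductDiscovery"
    REQUIREMENTS = "Requirements"
    ARCHITECT = "Architect"
    REVIEW = "Review"
    TEST_PLAN = "TestPlan"
    STACK_ADVISE = "StackAdvise"
    TECH_RADAR = "TechRadar"
    POSTMORTEM = "Postmortem"
    OBS_AUDIT = "ObsAudit"


class PlaybookNotRead(RuntimeError):
    """Raised when a mode is asked to act before its playbook was read."""


@dataclass
class PlaybookRegistry:
    # Hypothetical: maps each mode to the text of its playbook.
    playbooks: dict[Mode, str]
    _read: set[Mode] = field(default_factory=set)

    def read(self, mode: Mode) -> str:
        """Return the playbook text and record that it was read."""
        text = self.playbooks[mode]
        self._read.add(mode)
        return text

    def require_read(self, mode: Mode) -> None:
        """Fail loudly if the playbook for this mode has not been read yet."""
        if mode not in self._read:
            raise PlaybookNotRead(f"{mode.value}: playbook not read")


def run_mode(registry: PlaybookRegistry, mode: Mode, request: str) -> str:
    """Act on a request only after the mode's playbook has been read."""
    registry.require_read(mode)
    return f"[{mode.value}] handling: {request}"
```

Raising an exception, rather than silently loading the playbook on demand, keeps the skipped step visible, which is closer in spirit to treating it as a finding in review.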

What's automated, what's still on me

The hardest part of building a coding agent is being honest about what it can do and what it can't. The map below is the current state.

[Figure: map of which lifecycle stages are automated today and which are still on me]

The pattern is clear. The judgment surfaces (design, review, observability, retrospective) are already automated to my standards. The mechanical surfaces, implement and ship, are the gap. That gap is the last thing built on purpose, because automating implementation before automating judgment produces a code generator, not an agent.

How Smith runs today vs. where it's going

Today Smith runs inside my editor. I open Claude Code, type a request, and Smith responds in the same conversation thread. Useful, but the loop is bounded by my attention — every step needs my eyes.

The target is a Kanban-driven runtime. I file a ticket in Plane, move it from Backlog to Todo, and a small dispatcher reads it, gates it on spec quality, and enqueues a job for Smith. Smith claims the job, runs the ReAct loop, commits, and opens the PR; the dispatcher then moves the card to Review and writes a status comment. I get one place, Plane, to file work, watch progress, and approve.
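The hand-offs in that flow can be sketched as a small state machine. This is an illustrative sketch under stated assumptions, not the real dispatcher: the `Ticket` shape, the `spec_is_good_enough` gate (here a crude check for description length and a mention of acceptance criteria), the intermediate "In Progress" column, and the in-memory queue are all hypothetical stand-ins. The Plane API and the ReAct loop are deliberately left out.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Ticket:
    # Hypothetical ticket shape; a real Plane issue carries many more fields.
    id: str
    title: str
    description: str
    column: str = "Backlog"
    comments: list[str] = field(default_factory=list)


def spec_is_good_enough(ticket: Ticket) -> bool:
    """Placeholder spec-quality gate (assumption): require a non-trivial
    description that mentions acceptance criteria."""
    text = ticket.description.lower()
    return len(text) >= 40 and "acceptance" in text


def dispatch(ticket: Ticket, jobs: deque[str]) -> bool:
    """Dispatcher step: gate a Todo ticket and enqueue a job for the worker."""
    if ticket.column != "Todo":
        return False  # only cards moved to Todo are picked up
    if not spec_is_good_enough(ticket):
        ticket.comments.append("Spec rejected: add acceptance criteria.")
        return False
    jobs.append(ticket.id)
    ticket.column = "In Progress"  # assumed intermediate column
    return True


def complete(ticket: Ticket, pr_url: str) -> None:
    """Dispatcher step after the worker opens a PR: move the card to Review."""
    ticket.column = "Review"
    ticket.comments.append(f"PR opened: {pr_url}")
```

Keeping the gate in the dispatcher means a weak spec is bounced with a comment on the card before any executor time is spent on it.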

Two agents, two brains, two jobs. The dispatcher is small and cheap: it runs on qwen3:8b locally, makes routing and gating decisions in milliseconds, and never touches code. Smith is the executor: he runs on qwen3.6:27b-coding, does the heavy reasoning, and owns the SDLC work. Cost discipline says: don't put a 27B coding model on a job a fast classifier handles.


The pieces are coming together. The jobs-table contract is live — a Postgres database with stored-procedure-only access, two roles (one for the API, one for the worker), no inline SQL anywhere. The next two pieces are the dispatcher — Plane’s translator, the spec-quality gate, the small brain — and the Smith worker daemon that actually claims jobs and ships code. Two agents, one Kanban, one lifecycle.
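To illustrate what "stored-procedure-only access, two roles" can look like from the application side, here is a minimal sketch. Everything named in it is an assumption made for the example: the role names `api` and `worker`, the procedure names (`enqueue_job`, `claim_next_job`, and so on), and the `call_proc` helper. It assumes the procedures are Postgres functions that return rows and that the connection follows the Python DB-API with `%s` placeholders, as psycopg does. In a real deployment the database's own grants, not this allowlist, would be the enforcement point.

```python
import re

# Hypothetical per-role allowlists; the post does not give the real procedure names.
ALLOWED_PROCS = {
    "api": {"enqueue_job", "get_job_status"},
    "worker": {"claim_next_job", "complete_job", "fail_job"},
}

# Procedure names must be plain lowercase identifiers, never SQL fragments.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def call_proc(conn, role: str, proc: str, *args):
    """Call an allowlisted function by name; raw SQL is never accepted."""
    if not _IDENT.match(proc) or proc not in ALLOWED_PROCS.get(role, set()):
        raise PermissionError(f"role {role!r} may not call {proc!r}")
    # Arguments are always passed as bound parameters, never interpolated.
    placeholders = ", ".join(["%s"] * len(args))
    sql = f"SELECT * FROM {proc}({placeholders})"
    cur = conn.cursor()
    try:
        cur.execute(sql, args)
        return cur.fetchall()
    finally:
        cur.close()
```

With a helper like this, application code has a single choke point for data access, so a grep for any other `execute` call is enough to spot inline SQL creeping in.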

Why this matters

I am one person. Every minute I spend writing boilerplate, running test suites, opening PRs, watching CI go green — that is a minute I am not spending on the things only I can do. Strategy. Sales. The decisions that an agent shouldn’t make on my behalf.

Smith’s job is to take the executable surface of my work and run it. My job is to be the orchestrator — the strategic layer that no agent can replace because no agent has my reputation, my client relationships, or my taste.

This is what the lifecycle automation buys: not less work, but the right work, done by the operator who should do it.

There is more to say — about the brain layer (the DSPy modules that compile and evaluate over time), about how cost discipline runs locally first and spills to cloud only when the task earns it, about the playbook system that keeps Smith honest. Future posts will get to those. This one is the map. The agent is the territory.

