Specs in, deploys out, no keyboard
This week I shipped two production websites, ericcaskey.com and caskeycoding.com, and moved two personal projects forward: Marathon Coach and Finance Reviewer. All from a phone. Not prototypes. Real sites with CI/CD pipelines, CloudFront distributions, GitHub Actions, and spec packages that drive every session. I wrote no code. I wrote specs.
The pipeline:
voice dictation → Perplexity Computer → spec → Claude Code → PR → GitHub Actions → deploy
The chain runs like this. Voice dictation into Perplexity Computer opens a session. That session writes a spec and opens a PR on GitHub against a private specs repo. A second session in Claude Code on the web, also voice-driven, reads the spec and opens a PR on the code repo with the implementation. GitHub Actions validates. I review. Merge. CloudFront invalidates. The change is live.
The tools earn naming. Perplexity Computer handles planning, research, and spec writing. It has a GitHub connector that reads repos and opens PRs directly, and it has memory across sessions: the one I start tomorrow morning knows what last night's session decided. That persistence is what lets specs evolve instead of starting from scratch every conversation. Perplexity is also particularly strong on finance and stocks research, which is why the specs behind Finance Reviewer were written there specifically. Pick the tool for the domain.
Claude Code on the web handles implementation. It reads a spec and opens a PR with working code. I review asynchronously, typically from the same phone, in a commute or between calls. When CI is green and the diff matches what the spec asked for, I merge. The two tools are complementary by design: one stateful and deliberative, one stateless and fast.
No IDE, no laptop, no desk. No keyboard. A session is a conversation, literally, via voice dictation. That is not a stunt. It is what these tools make possible once the specs are good enough to carry the intent. Last November in my review of the Enterprise Vibe Coding Playbook, I wrote that I had tried voice dictation and felt self-conscious about it. Five months later it is the entry point of the entire workflow.
Things break, of course. A CloudFront function was pinned to trailingSlash: false while the Next.js config flipped to true, and for a few hours every sub-route on caskeycoding.com served the home page. The fix landed in the right order: a spec change first, amending the architecture decision in the infra and frontend specs, and then one-line code changes in both repos behind those specs. The discipline is that the spec leads. The code follows.
The framing matters. This workflow is not "AI did it for me." It is "I made the decisions, wrote them down, and AI implemented them reliably." Restraint is the value. Context architecture is the method. The decisions are mine. The implementation is reproducible because the spec makes it so.
The two personal projects, Marathon Coach and Finance Reviewer, run on the same workflow, specs kept in private development. The interaction model is the portable part.
The methodology under this is spec-driven development and the folder architecture that makes it work, which I wrote up last month. The next post in this series is the follow-up: what a month of running that system in production surfaced that the March essay did not cover. Alongside both, browse caskeycoding-specs-demo to see the shape of the system: two example spec packages, the two ADRs that govern the methodology, and a sample CLAUDE.md.
Keep reading
Watch the agent write
A polish agent drafts an essay against a pre-approved topic.
Multi-Region Workflow Orchestration Platform
Platform supporting millions of executions across multiple global regions, expanding adoption across Amazon.
An orchestration mode is only as good as its backlog
Anthropic published a guide on building a session-level orchestration mode. I built it two ways, on the CLI and on the API, and then hit the part the guide does not cover: an orchestrator that fans out is useless without a backlog of real work to fan out over.
One week of SDD in production: the numbers
The previous two posts made claims. Here is what a week of the workflow looks like as a data trail, PRs, deploys, CI runs, specs merged, pulled from GitHub.
SDD isn't about managing AI agents, it's about managing context
Spec-driven development reads like a methodology for controlling AI agents. It isn't. It's a methodology for managing context across stateless sessions. The spec is the persistent memory.
Context architecture beats documentation dumps
Dumping the whole corpus into an AI agent makes it worse, not better. The fix is architectural: each task loads a curated slice, not everything you have. Here is the method, and the same move at three different layers: specs, sensor data, and evaluation lenses.