Tell Me Everything That's Wrong: Validation as a Batch Operation
There is a particular kind of slow torture that software inflicts on its users, and most of us have stopped noticing it. You submit a form, a config file, a deploy. It rejects the first thing it finds. You fix that one thing and submit again. It rejects the second thing. You fix that. It rejects the third. Each round trip costs you a context switch, and if the feedback loop runs over a network or a build, each one costs you minutes you will not get back.
The system knew about all three problems the first time. It just didn't tell you.
I keep coming back to CloudFormation as the counter-example. When you submit a template with ten mistakes, it does not stop at the first missing property. It validates the whole thing and hands you back the full set. You fix ten things in one pass and move on. That is not a cosmetic nicety. It is a design decision about whose time matters, and it is the difference between a tool people trust and a tool people dread.
The Cost You're Actually Optimizing
The instinct to fail on the first error comes from a good place. In a hot request path, fail-fast is correct: stop early, shed load, don't waste cycles on work you're going to throw away. But validation is not a hot path, and the resource you're spending is not CPU. It's human attention.
The cost model is the whole argument. When you fail fast, you are optimizing for the machine's time on the unhappy path. When you collect every error, you are optimizing for the human's time across the entire fix cycle. Given a submission with ten mistakes, a first-error-only validator turns what could be a single review pass into ten separate submit-wait-read-fix loops, and the wait in the middle is where trust goes to die. People start to experience your system as flaky even when it is behaving deterministically. The real problem is partial reporting, but that is not how it feels from the outside.
This is the same principle that makes fail-fast the wrong default for safety guardrails: the goal is not to exit quickly, it is to surface the full set of blocking issues so a person can fix them in one pass. Validation is that idea pointed at the user instead of the operator.
Collect, Then Decide
The structural change is small and it is always the same shape. Instead of returning the moment a check fails, you run every independent check, accumulate the failures, and make the proceed-or-reject decision once, at the end, against the complete set.
    from datetime import date

    # Error(field, code, message) and ValidationError(errors) are assumed
    # to be defined elsewhere; ship_date is assumed to be a datetime.date.
    def validate_order(order):
        errors = []
        if not order.get("customer_id"):
            errors.append(Error("customer_id", "required", "Customer ID is missing"))
        if order.get("quantity", 0) <= 0:
            errors.append(Error("quantity", "out_of_range",
                                "Quantity must be greater than 0"))
        if order.get("ship_date") and order["ship_date"] < date.today():
            errors.append(Error("ship_date", "in_past",
                                "Ship date cannot be in the past"))
        if errors:
            raise ValidationError(errors)  # all of them, not the first
        return order
The trap to avoid is the early return or raise buried inside each check. The instant one check can short-circuit the function, you are back to first-error-only and you have quietly recreated the torture. Independent checks must not be allowed to serialize the user's time.
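One way to make that trap harder to fall into is to give each check its own function and let a single loop own the accumulation. What follows is a minimal, self-contained sketch rather than a prescribed design: the Error and ValidationError classes, the check_* functions, and the CHECKS list are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Error:
    # Hypothetical error record: which field, a stable code, a message.
    field: str
    code: str
    message: str


class ValidationError(Exception):
    # Hypothetical exception that carries the complete list of errors.
    def __init__(self, errors):
        super().__init__(f"{len(errors)} validation error(s)")
        self.errors = errors


# Each check looks at the order on its own and returns an Error or None.
def check_customer(order):
    if not order.get("customer_id"):
        return Error("customer_id", "required", "Customer ID is missing")


def check_quantity(order):
    if order.get("quantity", 0) <= 0:
        return Error("quantity", "out_of_range", "Quantity must be greater than 0")


def check_ship_date(order):
    ship_date = order.get("ship_date")
    if ship_date and ship_date < date.today():
        return Error("ship_date", "in_past", "Ship date cannot be in the past")


CHECKS = [check_customer, check_quantity, check_ship_date]


def validate_order(order):
    # Every check runs; a failing check cannot stop the ones after it.
    errors = [e for e in (check(order) for check in CHECKS) if e is not None]
    if errors:
        raise ValidationError(errors)  # one decision, against the full set
    return order


# An order with three independent mistakes reports all three in one pass.
bad_order = {"customer_id": "", "quantity": 0, "ship_date": date(2000, 1, 1)}
try:
    validate_order(bad_order)
except ValidationError as exc:
    reported = [e.field for e in exc.errors]
    print(reported)
```

Because a check can only return a value, the only place a raise can live is the loop at the end, which is where the proceed-or-reject decision belongs.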
Don't Drown Them in Cascades
There is a failure mode on the other side, and it's worth naming because it scares people away from batching in the first place. If you naively run every check against malformed input, one root cause can spawn fifty downstream errors. A single missing closing brace in a config file, reported as forty "unexpected token" errors, is worse than failing fast. You have traded a short loop for a wall of noise that buries the one thing that matters.
The answer is phased validation. You validate in layers, and you only advance to the next layer once the current one is clean. Parse first. If the syntax is broken, report the syntax errors and stop, because nothing downstream is trustworthy yet. Once it parses, run the structural and type checks as a batch. Once the shape is valid, run the semantic and cross-reference checks as a batch. Within each phase you report everything; between phases you gate, because errors in a later phase are only meaningful when the earlier phase held.
Compilers have done this for decades, and it's why a good one gives you a screen of real errors instead of one cryptic line or a thousand garbage ones. The skill is knowing which checks are independent, so they can be batched, and which are derived, so they should be suppressed until their precondition holds.
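The layering can be sketched in a few dozen lines. This is an illustration under stated assumptions, not a reference implementation: the config shape (name, replicas, depends_on), the known_services set, and all three phase functions are invented for the example. Only json.loads and json.JSONDecodeError come from the standard library.

```python
import json


def parse_phase(text):
    # Phase 1: syntax. Nothing downstream is trustworthy if this fails.
    try:
        return json.loads(text), []
    except json.JSONDecodeError as exc:
        return None, [f"syntax: {exc.msg} (line {exc.lineno}, column {exc.colno})"]


def structure_phase(doc):
    # Phase 2: shape and types, reported as a batch.
    if not isinstance(doc, dict):
        return ["structure: top level must be an object"]
    errors = []
    if not isinstance(doc.get("name"), str):
        errors.append("structure: 'name' must be a string")
    if not isinstance(doc.get("replicas"), int):
        errors.append("structure: 'replicas' must be an integer")
    deps = doc.get("depends_on", [])
    if not isinstance(deps, list) or not all(isinstance(d, str) for d in deps):
        errors.append("structure: 'depends_on' must be a list of strings")
    return errors


def semantic_phase(doc, known_services):
    # Phase 3: meaning and cross-references. Safe to index directly,
    # because phase 2 already guaranteed the shape.
    errors = []
    if doc["replicas"] < 1:
        errors.append("semantic: 'replicas' must be at least 1")
    for dep in doc.get("depends_on", []):
        if dep not in known_services:
            errors.append(f"semantic: unknown dependency '{dep}'")
    return errors


def validate(text, known_services):
    # Batch within a phase, gate between phases.
    doc, errors = parse_phase(text)
    if errors:
        return errors
    errors = structure_phase(doc)
    if errors:
        return errors
    return semantic_phase(doc, known_services)


syntax_broken = validate('{"name": "api", "replicas": 2', {"db"})
shape_broken = validate('{"name": 7, "replicas": "two"}', {"db"})
meaning_broken = validate(
    '{"name": "api", "replicas": 0, "depends_on": ["db", "cache", "queue"]}',
    {"db"},
)
```

The truncated document yields a single syntax error and nothing else, the badly typed one yields both type errors together, and the well-formed one yields all three semantic errors at once.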
Make Each Error Worth Reading
Batching only pays off if the batch is legible. Ten errors that each say "invalid input" are not ten times more useful than one. A useful error answers three questions: what is wrong, where it is, and what would make it pass. If a check can't answer those, it isn't a validation result yet, it's a boolean, and booleans are how you lose trust faster than you lose data.
That means structured errors, not concatenated strings. Give each one a stable machine-readable code, a path or pointer to the exact location, and a human-readable message. The structure is what lets a UI highlight all the bad fields at once, a CLI print a tidy table, and a CI job fail with a diagnostic someone can act on without re-running anything.
    {
      "valid": false,
      "errors": [
        { "path": "items[2].quantity", "code": "out_of_range",
          "message": "Quantity must be greater than 0", "got": -1 },
        { "path": "ship_date", "code": "in_past",
          "message": "Ship date cannot be in the past", "got": "2026-06-01" }
      ]
    }
A path like items[2].quantity is the difference between "your order is invalid" and a cursor the user can jump straight to. Multiply that across a batch and you've turned a guessing game into a checklist.
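A report in that shape is cheap to produce if the validator builds the path as it walks the data. Here is a minimal sketch, assuming a hypothetical FieldError record and hypothetical validate_items and to_report helpers; the field names mirror the JSON above, and everything else is illustrative.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any


@dataclass(frozen=True)
class FieldError:
    path: str      # where it is
    code: str      # stable, machine-readable
    message: str   # what would make it pass
    got: Any       # the offending value


def validate_items(items):
    errors = []
    for index, item in enumerate(items):
        quantity = item.get("quantity")
        if not isinstance(quantity, int) or quantity <= 0:
            errors.append(FieldError(
                path=f"items[{index}].quantity",  # built from the loop position
                code="out_of_range",
                message="Quantity must be greater than 0",
                got=quantity,
            ))
    return errors


def to_report(errors):
    # One envelope for every consumer: UI, CLI, or CI job.
    return {"valid": not errors, "errors": [asdict(e) for e in errors]}


items = [{"quantity": 3}, {"quantity": 1}, {"quantity": -1}, {"quantity": 0}]
report = to_report(validate_items(items))
print(json.dumps(report, indent=2))
```

Keeping the error as a record until the last moment means the same list can be rendered as JSON for an API, grouped by path for a form, or printed as a table, without any consumer having to parse a message string.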
Where the Line Is
Batching is the right default for validation, and it is genuinely wrong in a few places, so hold it honestly. Authentication should fail fast and say little; enumerating everything wrong with a login attempt is a gift to an attacker. Anything with side effects that compound, or any check that is genuinely expensive and gated behind a cheap one, belongs in a later phase or behind fail-fast on purpose. And when checks are truly dependent, suppress the derived ones rather than reporting noise.
But for the ordinary case, the form, the config, the API payload, the deploy, the bar is simple. Before you ship a validator, submit something with three mistakes in it. If it only tells you about one, you haven't finished building it. You've just moved the rest of the work onto whoever has to use it.
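That three-mistake check is easy to keep around as a test. The sketch below assumes a validator is any callable that returns a list of (path, code) pairs. The unreported helper and both sample validators are hypothetical and exist only to show the test telling the two behaviors apart.

```python
def unreported(validate, payload, seeded):
    # Seeded mistakes the validator did not report; empty means it passed the bar.
    return sorted(set(seeded) - set(validate(payload)))


def first_error_only(payload):
    # Returns as soon as one check fails.
    if not payload.get("customer_id"):
        return [("customer_id", "required")]
    if payload.get("quantity", 0) <= 0:
        return [("quantity", "out_of_range")]
    if not payload.get("address"):
        return [("address", "required")]
    return []


def collect_all(payload):
    # Runs every check and returns the full set.
    errors = []
    if not payload.get("customer_id"):
        errors.append(("customer_id", "required"))
    if payload.get("quantity", 0) <= 0:
        errors.append(("quantity", "out_of_range"))
    if not payload.get("address"):
        errors.append(("address", "required"))
    return errors


three_mistakes = {"customer_id": "", "quantity": 0, "address": ""}
seeded = [
    ("customer_id", "required"),
    ("quantity", "out_of_range"),
    ("address", "required"),
]

hidden_by_fail_fast = unreported(first_error_only, three_mistakes, seeded)
hidden_by_batch = unreported(collect_all, three_mistakes, seeded)
print(hidden_by_fail_fast)  # the two mistakes the user would find out about later
print(hidden_by_batch)      # nothing left over
```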
Related
- Designing Safety Guardrails for Distributed Workflow Orchestration: why fail-fast is the wrong default for safety systems, and parallel evaluation with aggregation