I build platform primitives within a System Development Organization at Amazon whose mandate is to be a force multiplier for service teams across Amazon. I own the monitoring platform that standardizes infrastructure monitoring across the Amazon fleet, and I built a workflow orchestration control plane that now supports millions of executions for development teams in Dublin, Seattle, San Jose, and New York. That platform is on a path to General Availability for all of Amazon, enabling engineers to build safety-adherent workflows that meet Reliability org requirements.
What I Build
Safety Systems
Pre-execution guardrails that run 11+ concurrent safety checks before every automated infrastructure change.
Platform Monitoring
Fleet-wide observability with 200,000+ monitors, standardized across multiple global locations.
AI-Augmented Engineering
Spec-driven development powering 7 specialist AI agents from 137+ structured specifications.
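As a rough illustration of the Safety Systems idea above (several safety checks run concurrently before an automated change is allowed to proceed), here is a minimal sketch in Python. It is not the production implementation: the check names, the `change` dictionary fields, and the fail-closed policy are all assumptions made for the example.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List


@dataclass
class CheckResult:
    name: str
    passed: bool
    reason: str = ""


# Hypothetical check signature: takes a change description, returns a CheckResult.
SafetyCheck = Callable[[dict], Awaitable[CheckResult]]


async def check_change_window(change: dict) -> CheckResult:
    # Illustrative rule (assumed): block changes flagged as inside a freeze window.
    ok = not change.get("in_freeze_window", False)
    return CheckResult("change_window", ok, "" if ok else "freeze window active")


async def check_blast_radius(change: dict) -> CheckResult:
    # Illustrative rule (assumed): cap the number of hosts touched at once.
    ok = change.get("host_count", 0) <= change.get("max_hosts", 100)
    return CheckResult("blast_radius", ok, "" if ok else "too many hosts")


async def run_guardrails(change: dict, checks: List[SafetyCheck]) -> List[CheckResult]:
    # Run every check concurrently. A check that raises is recorded as a
    # failure (fail closed) instead of aborting the other checks.
    raw = await asyncio.gather(*(c(change) for c in checks), return_exceptions=True)
    results = []
    for check, res in zip(checks, raw):
        if isinstance(res, BaseException):
            results.append(CheckResult(check.__name__, False, f"check errored: {res!r}"))
        else:
            results.append(res)
    return results


def is_safe(results: List[CheckResult]) -> bool:
    # The change may proceed only if every check passed.
    return all(r.passed for r in results)
```

A caller would gate the change on the combined result, for example `is_safe(asyncio.run(run_guardrails(change, [check_change_window, check_blast_radius])))`. Treating a crashed check as a failure is a design choice in this sketch: a guardrail that cannot answer should not be read as approval.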
Designing Safety Guardrails for Distributed Workflow Orchestration
If you automate changes across a large fleet, you will eventually face the same uncomfortable question: what happens if we run this at the wrong time, ag…