Caskey

Eric Caskey

Safety-Critical Platform Engineering at Scale

I build platform primitives in a platform engineering org whose mandate is to be a force multiplier for service teams. I own the monitoring platform that standardizes infrastructure monitoring across the fleet, and I built a workflow orchestration control plane that now supports millions of executions for development teams in Dublin, Seattle, San Jose, and New York. The orchestration control plane is on a path to General Availability for all of Amazon, enabling engineers to build safety-adherent workflows that meet Amazon reliability standards.

What I Build

Safety Systems

Pre-execution guardrails that run 11+ concurrent safety checks before every automated infrastructure change.

Platform Monitoring

Fleet-wide observability, standardized across multiple global locations and spanning 200,000+ monitors.

AI-Augmented Engineering

Spec-driven development powering 7 specialist AI agents from 137+ structured specifications.


Tools

Latest post