Eric Caskey

Safety-Critical Platform Engineering at Scale

I build platform primitives within a platform engineering org whose mandate is to be a force multiplier for service teams. I own the monitoring platform that standardizes infrastructure monitoring across thousands of services. I also built a workflow orchestration control plane that now supports millions of executions for development teams in multiple global regions, enabling engineers to build safety-adherent workflows that meet Amazon reliability standards.

What I Build

Safety Systems

Platform-enforced safety for automated infrastructure operations. Teams build workflows; the platform ensures they meet reliability standards before anything executes.

Platform Monitoring

Infrastructure observability standardized at enterprise scale — 400,000+ monitors at Prudential, and monitoring standards adopted across thousands of services at Amazon.

AI-Augmented Engineering

A spec architecture where AI agents inherit curated context rather than full codebases — the methodology behind caskeycoding.com, applied to production platform infrastructure.

View Case Studies →

Tools

Latest post