The caskeycoding.com tech stack at a glance

By Eric Caskey · May 25, 2026 · 3 min read

This site is a Next.js static export on CloudFront, a small Python service on Lambda, a single DynamoDB table, an S3 content bucket, and two AI agents calling Anthropic directly. Everything is wired up with AWS CDK across four stacks split by deploy cadence: auth, agents, backend, frontend.

Here is the stack, layer by layer.

Frontend

Next.js with static export, TypeScript, trailingSlash aligned with the edge rewrite
Playwright for end-to-end tests
Lighthouse CI for performance budgets

Edge / CDN

CloudFront distribution fronting a private S3 site bucket via origin access, HTTPS-only
CloudFront Functions at viewer-request for extensionless URI rewrites and one retired-slug 301
Route53 hosted zone with apex / www aliases plus Google Workspace MX, SPF, and DMARC (p=quarantine)
ACM for the TLS certificate
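
A rough model of the viewer-request logic described above, written in Python purely for illustration: CloudFront Functions themselves run JavaScript, and the deployed rules are not shown in this post. The sketch assumes the export places each page at `<path>/index.html` (the usual layout with trailingSlash enabled). `RETIRED_SLUGS`, `handle_viewer_request`, and the example slug are hypothetical names.

```python
from typing import Tuple

# Hypothetical retired slug and its replacement; the real slug is not named in the post.
RETIRED_SLUGS = {"/old-post/": "/new-post/"}


def handle_viewer_request(uri: str) -> Tuple[str, str]:
    """Return ("redirect", location) for a 301, or ("rewrite", uri) for the origin request."""
    # Retired slugs get a permanent redirect before any rewriting happens.
    if uri in RETIRED_SLUGS:
        return "redirect", RETIRED_SLUGS[uri]

    # Directory-style URIs map to the index document inside that directory.
    if uri.endswith("/"):
        return "rewrite", uri + "index.html"

    # Extensionless URIs are treated as pages; anything with a dot in the
    # last path segment (app.js, favicon.ico) is passed through unchanged.
    last_segment = uri.rsplit("/", 1)[-1]
    if "." not in last_segment:
        return "rewrite", uri + "/index.html"

    return "rewrite", uri
```

Under these assumptions, `/blog/my-post` and `/blog/my-post/` both resolve to the same object in the site bucket, which is the sense in which the export setting and the edge rewrite need to agree.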

API & compute

Amazon API Gateway for the dynamic surface
AWS Lambda (Python) for the blog handler, the public demo handlers, the agent API, and the long-running orchestrator
AWS WAF with regional, per-route rate rules on the unauthenticated /public/* endpoints

Data

Amazon DynamoDB: single-table design, postId + type keys, post and agent_task items
Amazon S3: transparent content offload at 2KB so DynamoDB items stay small and read-cheap
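
One way the content offload could work, as a minimal sketch rather than the site's actual code. It assumes "2KB" means 2048 bytes of UTF-8 and that bodies strictly larger than that are offloaded. `InMemoryBucket` stands in for S3 so the example runs without AWS; the `contentS3Key` attribute and the key layout are hypothetical.

```python
from typing import Dict

# "2KB" comes from the post; the exact byte count and the ">" comparison are assumptions.
OFFLOAD_THRESHOLD_BYTES = 2 * 1024


class InMemoryBucket:
    """Stand-in for the S3 content bucket so the sketch runs locally."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def put(self, key: str, body: bytes) -> None:
        self.objects[key] = body

    def get(self, key: str) -> bytes:
        return self.objects[key]


def to_item(post_id: str, item_type: str, content: str, bucket: InMemoryBucket) -> dict:
    """Build a table item, moving large content out to the bucket."""
    body = content.encode("utf-8")
    item = {"postId": post_id, "type": item_type}
    if len(body) > OFFLOAD_THRESHOLD_BYTES:
        key = f"content/{post_id}/{item_type}"  # hypothetical key layout
        bucket.put(key, body)
        item["contentS3Key"] = key  # hypothetical pointer attribute
    else:
        item["content"] = content
    return item


def read_content(item: dict, bucket: InMemoryBucket) -> str:
    """Return the content whether it lives inline or in the bucket."""
    if "contentS3Key" in item:
        return bucket.get(item["contentS3Key"]).decode("utf-8")
    return item["content"]
```

Callers only ever use `to_item` and `read_content`, so where the bytes live stays invisible to them. That is the "transparent" part.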

Auth

Amazon Cognito user pool, shared via cross-stack reference to the backend

AI / LLM

Anthropic API (direct, official Python SDK) as the primary path
- Sonnet 4.6, workhorse: generation, polish, tool loops
- Opus 4.7: synthesis and multi-source reasoning
- Haiku 4.5: routing, classification, eval-judge
Amazon Bedrock: fallback only, via a cross-region inference profile, with a Discord webhook alert when it engages
Shared client (anthropic_client.py) owns retries, fallback, secrets, and cost accounting; agents stay thin
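
The retry-then-fallback shape of a shared client can be sketched as below. This is not the contents of anthropic_client.py. The provider calls are plain callables so no SDK signatures are assumed, and the function names, retry count, backoff values, task keys, and the default tier for unknown tasks are all hypothetical.

```python
import time
from typing import Callable, Optional

# Task-to-tier routing taken from the list above; the task key spellings are assumptions.
MODEL_TIER_BY_TASK = {
    "generation": "sonnet",
    "polish": "sonnet",
    "tool_loop": "sonnet",
    "synthesis": "opus",
    "routing": "haiku",
    "classification": "haiku",
    "eval_judge": "haiku",
}


def pick_tier(task: str) -> str:
    """Unknown tasks fall back to the workhorse tier (an assumption)."""
    return MODEL_TIER_BY_TASK.get(task, "sonnet")


def complete_with_fallback(
    prompt: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
    on_fallback: Callable[[Optional[Exception]], None],
    max_attempts: int = 3,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try the primary provider with exponential backoff, then fall back once."""
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        try:
            return primary(prompt)
        except Exception as exc:  # real code would catch only retriable errors
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(base_delay_s * 2 ** attempt)
    # The primary is exhausted: raise the alert (a Discord webhook in the post), then fall back.
    on_fallback(last_error)
    return fallback(prompt)
```

Keeping this logic in one place is what lets the agents stay thin: an agent picks a tier and passes a prompt, and never has to know which provider answered.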

Secrets & config

AWS Secrets Manager for the Anthropic API key: no env-var secrets in prod
AWS Systems Manager / CDK context for non-secret config

Observability

Amazon CloudWatch logs, metrics, and alarms (5xx on the static path → SNS topic)
Amazon SNS for paging
Discord webhooks for human-in-the-loop alerts (LLM fallback, eval drift)
Structured llm_call JSON logs: full payloads only in non-prod; prod carries metadata plus a prompt_sha256
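
A minimal sketch of what an llm_call record might look like under the rule above. Only the `llm_call` name and `prompt_sha256` come from this post; the other field names, the `env` values, and the choice to include the hash in non-prod as well are assumptions.

```python
import hashlib
import json


def build_llm_call_log(
    env: str,
    model: str,
    prompt: str,
    completion: str,
    input_tokens: int,
    output_tokens: int,
) -> str:
    """Serialize one llm_call record; payloads are dropped when env is "prod"."""
    record = {
        "event": "llm_call",
        "env": env,
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        # The hash lets two calls be matched up without storing the prompt itself.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
    }
    if env != "prod":
        # Full payloads are kept only outside production.
        record["prompt"] = prompt
        record["completion"] = completion
    return json.dumps(record, sort_keys=True)
```

With the hash in place, a prod log line can still answer "was this the same prompt as last time?" without the prompt text ever reaching the log group.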

Testing & CI

pytest for the Python service
Replay / eval harness: YAML cases with cached completions, runs on every PR with no API spend; a --live mode for capturing new fixtures
CI pricing check: fails the build if the in-repo per-model pricing table is older than 90 days
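
The pricing staleness gate can be as small as the sketch below. It assumes the pricing table records an ISO date for when it was last verified, and that "older than 90 days" means strictly more than 90 days. The function names and the exit-code convention are hypothetical.

```python
from datetime import date, timedelta

# The 90-day limit is from the post; treating exactly 90 days as still fresh is an assumption.
MAX_PRICING_AGE = timedelta(days=90)


def pricing_is_stale(last_verified: date, today: date) -> bool:
    """True when the pricing table was last verified more than 90 days ago."""
    return today - last_verified > MAX_PRICING_AGE


def check_pricing(last_verified_iso: str, today: date) -> int:
    """Return a process exit code: 0 passes the build, 1 fails it."""
    last_verified = date.fromisoformat(last_verified_iso)
    if pricing_is_stale(last_verified, today):
        age_days = (today - last_verified).days
        print(f"pricing table is {age_days} days old; re-verify per-model prices")
        return 1
    return 0
```

A CI step would call `check_pricing` with `date.today()` and pass the result to `sys.exit`, so stale prices block the merge rather than silently skewing the cost accounting.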

Infrastructure as code

AWS CDK (TypeScript), four stacks split by blast radius:
- AuthStack: Cognito
- AgentStack: agent and orchestrator Lambdas, Bedrock IAM
- BackendStack: API Gateway, blog Lambda, DynamoDB, S3 content bucket
- FrontendStack: CloudFront, site bucket, Route53, ACM

The specs that drive this site, including the architecture decisions behind each of these choices, are public in the specs demo repo.

Written by Eric Caskey. I build AI tools you can actually use. Explore the Tools or see the case studies.