Observability Hub
Systems do not speak English—they speak in metrics, logs, and alerts. This experiment builds a small "control room" that lets you watch those signals dance in sync. Everything you see is synthetic, but the rules are honest:
- Metrics drift, spike, and cool down using the same traffic pattern that powers the Scaling Simulator.
- Logs are structured JSON snapshots of those metrics.
- Alerts fire from simple rules (p95 > 400ms, error rate > 5%, sustained CPU > 80%).
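The three alert rules can be sketched as a small pure function. This is a minimal illustration, not the lab's actual rules engine: the `MetricsSample` shape, the `evaluateAlerts` name, and the choice of three consecutive samples as the meaning of "sustained" are all assumptions. Only the thresholds come from the text.

```typescript
// Hypothetical sample shape; field names are assumptions, not the lab's schema.
interface MetricsSample {
  timestamp: number;    // position in the time series
  cpuPct: number;       // 0-100
  p95Ms: number;        // p95 latency in milliseconds
  errorRatePct: number; // 0-100
}

interface Alert {
  rule: string;
  timestamp: number;
}

// Thresholds from the text. The "sustained" window length is an assumption.
const P95_LIMIT_MS = 400;
const ERROR_RATE_LIMIT_PCT = 5;
const CPU_LIMIT_PCT = 80;
const CPU_SUSTAIN_SAMPLES = 3;

function evaluateAlerts(samples: MetricsSample[]): Alert[] {
  const alerts: Alert[] = [];
  let cpuStreak = 0; // consecutive samples with CPU above the limit

  for (const s of samples) {
    // Instantaneous rules: fire on every sample that breaches the threshold.
    if (s.p95Ms > P95_LIMIT_MS) {
      alerts.push({ rule: "p95 > 400ms", timestamp: s.timestamp });
    }
    if (s.errorRatePct > ERROR_RATE_LIMIT_PCT) {
      alerts.push({ rule: "error rate > 5%", timestamp: s.timestamp });
    }

    // Sustained rule: fire once, when the streak first reaches the window.
    cpuStreak = s.cpuPct > CPU_LIMIT_PCT ? cpuStreak + 1 : 0;
    if (cpuStreak === CPU_SUSTAIN_SAMPLES) {
      alerts.push({ rule: "sustained CPU > 80%", timestamp: s.timestamp });
    }
  }
  return alerts;
}
```

Treating "sustained" as a streak counter keeps a single noisy CPU sample from paging anyone, which is the usual reason to separate it from the instantaneous latency and error rules.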
LAB-57 • Observability Hub
A single control room view: metrics, logs, and alerts derived from the same synthetic traffic spike. No black boxes — everything here is generated in-place so the story stays honest.
[Interactive dashboard: live readouts for CPU (%), p95 latency (ms), and error rate (%), with charts for CPU & Memory, Throughput & Error Rate, and p95 Latency.]
Setup
No external services are involved. A few Next.js API routes generate the data in memory, so you can read (and modify) every line:
- /api/observability/metrics returns CPU, memory, RPS, error rate, and p95 latency.
- /api/observability/logs synthesizes structured logs derived from the same metrics snapshot.
- /api/observability/alerts applies the tiny rules engine and returns active + resolved alerts.
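To make "structured logs derived from the same metrics snapshot" concrete, here is one way the derivation might look. It is a sketch under assumptions: the `MetricsSnapshot` and `LogEntry` shapes, the helper names, the messages, and the mapping from metrics to log level are all hypothetical. The real routes may structure this differently.

```typescript
// Hypothetical snapshot shape covering the five metrics the route returns.
interface MetricsSnapshot {
  timestamp: string; // ISO 8601
  cpuPct: number;
  memoryPct: number;
  rps: number;
  errorRatePct: number;
  p95Ms: number;
}

type LogLevel = "INFO" | "WARN" | "ERROR";

interface LogEntry {
  timestamp: string;
  level: LogLevel;
  message: string;
  metrics: MetricsSnapshot;
}

// Assumed mapping: user-facing symptoms are ERROR, resource pressure is WARN.
function levelFor(s: MetricsSnapshot): LogLevel {
  if (s.errorRatePct > 5 || s.p95Ms > 400) return "ERROR";
  if (s.cpuPct > 80) return "WARN";
  return "INFO";
}

function toLogEntry(s: MetricsSnapshot): LogEntry {
  const level = levelFor(s);
  const message =
    level === "ERROR" ? "user-facing degradation"
    : level === "WARN" ? "CPU pressure"
    : "service healthy";
  // Embedding the snapshot keeps every log line traceable to its metrics.
  return { timestamp: s.timestamp, level, message, metrics: s };
}

// One JSON object per line, as a structured logger would emit it.
function toLogLine(s: MetricsSnapshot): string {
  return JSON.stringify(toLogEntry(s));
}
```

Because each log entry is computed from the snapshot rather than generated separately, a WARN or ERROR line cannot disagree with the charts.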
Scenario
Midway through the time series, a synthetic traffic spike hits:
- throughput jumps
- CPU + memory climb
- p95 latency stretches
- error rate creeps upward
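A spike like this can be produced by driving every metric from one shared curve. The sketch below uses a Gaussian bump centered on the midpoint of the series. The curve shape, the baseline and peak values, and the names `spikeFactor` and `generateSeries` are illustrative assumptions, not the lab's actual traffic pattern, and the random drift the real metrics show is left out to keep the output deterministic.

```typescript
interface Point {
  t: number;
  rps: number;
  cpuPct: number;
  memoryPct: number;
  p95Ms: number;
  errorRatePct: number;
}

// Returns a value near 0 at baseline and exactly 1 at the midpoint of the series.
function spikeFactor(t: number, n: number): number {
  const center = (n - 1) / 2;
  const width = n / 10; // assumed spike width: a tenth of the series
  const d = t - center;
  return Math.exp(-(d * d) / (2 * width * width));
}

// Every metric is baseline + gain * spikeFactor, so they rise and fall together.
// All baselines and gains below are made-up illustrative numbers.
function generateSeries(n: number): Point[] {
  return Array.from({ length: n }, (_, t) => {
    const k = spikeFactor(t, n);
    return {
      t,
      rps: 100 + 400 * k,         // throughput jumps
      cpuPct: 35 + 55 * k,        // CPU climbs
      memoryPct: 40 + 30 * k,     // memory climbs
      p95Ms: 120 + 480 * k,       // p95 latency stretches
      errorRatePct: 0.5 + 7 * k,  // error rate creeps upward
    };
  });
}
```

With an odd series length such as `generateSeries(61)`, the peak lands exactly on the middle sample, and the values at both ends sit at their baselines.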
This is the same incident explored from other angles: the Bottleneck Dashboard covers query behavior, the Scaling Simulator explores capacity response, and the Chaos Room injects the failure. Observability Hub is how you see it.
What to watch
- Replica & capacity – how tightly (or loosely) does capacity follow load?
- CPU/memory – look for volatility vs. calm regions.
- p95 latency – the easiest place to see user experience degrade.
- Logs – every WARN/ERROR lines up with the metric story.
- Alerts – the rules are transparent, so you can discuss tradeoffs without black boxes.
Reflection
Observability isn't about knowing everything; it's about reducing the surface area of confusion when something goes sideways. This hub completes the story you kicked off with the earlier labs: latency, chaos, bottlenecks, scaling—and now, observability.