A client's PostgreSQL writes were getting slower every quarter. The table had 57 indexes. Only 14 of them were ever used. Every INSERT and UPDATE was paying a tax nobody had thought to audit.
A client's API was getting measurably slower every week. The dashboards were green, the alerts were silent, and the database looked healthy. The problem was hiding in plain sight — on the container's local disk.
A client's API started throwing 500s every weekday afternoon like clockwork. The database was fine. The queries were fast. The problem was a reporting job that quietly hogged every available connection during peak traffic.
A client's dashboard took 11 seconds to render. Everyone blamed the database. The real problem was an ORM doing exactly what we told it to — we just never looked at what that meant.
We ran load tests before a big product launch, got green across the board, and watched the system buckle under real traffic two days later. The tests weren't wrong — they just weren't testing reality.
We added Redis to fix slow API responses. Instead, we got stale data, thundering herds, and a system that was harder to debug than the original problem.