A client's dashboard took 11 seconds to render. Everyone blamed the database. The real problem was an ORM doing exactly what we told it to — we just never looked at what that meant.
We ran load tests before a big product launch, got green across the board, and watched the system buckle under real traffic two days later. The tests weren't wrong — they just weren't testing reality.
We added Redis to fix slow API responses. Instead, we got stale data, thundering herds, and a system that was harder to debug than the original problem.