CASE STUDY
Imagine:
United.com crashes on Thanksgiving. Engineers open Splunk.
Dashboards to check
Tabs open at once
Lost per minute
When I inherited this dashboard, I saw the same panic every time.
Engineers weren't slow because they were bad at their jobs. They were slow because the system made them hunt for answers.
BEFORE: The Scavenger Hunt

"Check Tab 1... not here. Tab 2... maybe? Tab 3... wait, go back to Tab 1 and cross-reference..."
⏱️ Average time to start fixing: 12 minutes

15 metrics. — engineers had to already know what to look for
8 panels — nothing said "start here"
And this was just one tab.
The old RUM was a library — everything was there, but nothing said "start here."
I connected the dots for them — correlated signals in one view so engineers can see, judge, and act without memorizing 10 tabs or holding context in their head.
I led design on this project, partnering with a principal designer from another team — we shared a layout system across products. No time for large-scale research. So I mined 2 years of support tickets and talked to engineers who'd lived through real outages.
The pattern was clear: they weren't slow because they lacked information. They were slow because the system didn't tell them where to start.
The question wasn't "how do we show more?" It was "how do we show less — but the right things first?"
AFTER: The Control Tower

⏱️ Time to first action: 3 minutes

Health check at the top — Page views and Duration tell you immediately if something's wrong
URLs ranked by impact — see which pages are affected
Browser/OS breakdown grouped together — no more tab switching
I cut the default view down to what actually drives decisions under pressure. The top of the screen answers: is something wrong? The middle answers: where? The bottom lets you drill in if you need to.
Clarify first. Expand later.
The Dropdown Reorganization Diagram
The dashboard wasn't the only problem. Even finding the right metric was a guessing game.

BEFORE Left: "Which of these 15 is right?" / AFTER Right: "What's your role? Start there."
Before, the dropdown was a flat list of 15 technical metrics. Engineers had to already know the answer to pick the right one.
I reorganized it by role: UX, Frontend, Backend, Network. Now engineers start from what they know — their job — and narrow from there.
faster diagnosis time
estimated annual savings from shorter outages
Engineers found problems 30% faster. For companies losing $100K+ per minute, that translates to ~$1.4M in annual savings from shorter outages.
But the real win wasn't the number. It was the confidence. Engineers stopped opening five tabs and guessing. They opened one screen and acted.
"Engineers are rarely slow because they lack data. They're slow because systems fail to surface the right signal first."
I didn't add features. I removed everything that didn't help someone act in the first 30 seconds.