Describe times you exceeded and missed expectations
Company: Amazon
Role: Software Engineer
Category: Behavioral & Leadership
Difficulty: medium
Interview Round: Technical Screen
Describe a time you exceeded expectations. Then, describe a time you fell below expectations. For each example, explain the context, your specific actions, stakeholders involved, measurable results (with numbers if possible), feedback you received, root causes, lessons learned, and what you would do differently next time. Include how you followed up afterward to solidify improvements.
Quick Answer: This question evaluates behavioral and leadership competencies for a Software Engineer role, focusing on ownership, self-awareness, stakeholder communication, root-cause analysis, and the ability to quantify impact when describing times you exceeded or missed expectations.
Solution
## How to Approach
Structure each story with STAR (Situation, Task, Actions, Results), then extend it with Root causes, Lessons learned, Feedback received, and Follow-up. Quantify impact (latency, error rate, throughput, cost, developer velocity, on-call pages). Call out stakeholders (PM, SRE, QA, data/analytics, customer). Close with concrete follow-ups (dashboards, SOPs, tests, design changes).
Quick metric ideas for SWE examples:
- Performance: P50/P90 latency, CPU/memory, cache hit rate
- Reliability: error rate, availability (nines), MTTR/MTTD, on-call pages/week
- Delivery: scope, timeline, story points, PR cycle time, deployments/week
- Cost: compute/storage $, request cost, data transfer
---
## Example 1 — Exceeded Expectations: Cut API Latency and Cost Ahead of Schedule
- Situation: Our checkout service’s P90 latency was 480 ms (SLO ≤ 300 ms). A seasonal traffic spike was due in 6 weeks. Target: reach ≤ 300 ms P90.
- Task: Lead performance improvements while maintaining feature velocity and not increasing infrastructure cost.
- Actions:
1. Instrumented distributed tracing (OpenTelemetry) to profile hotspots across the request path.
2. Identified an N+1 query in the promotions lookup and a low cache hit rate (61%).
3. Implemented a read-through Redis cache with versioned keys; added a two-tier cache (local LRU + Redis) and request coalescing to avoid stampedes (the caching pattern is sketched after this example).
4. Rewrote the promotions query with proper indexing and batched fetches; added async pre-warming during deployment.
5. Partnered with SRE to run canary releases (10% traffic), added SLO-based auto-rollback, and set alerts on latency/error budgets.
6. Coordinated with QA for load tests at 1.5× projected peak using production traffic replays.
- Stakeholders: PM (timeline/priorities), SRE (rollouts/monitoring), QA (test plan), Data/Analytics (traffic projections), Support (customer comms).
- Results (measured):
- P90 latency: 480 ms → 190 ms (−60%) and P50: 180 ms → 95 ms (−47%).
- Cache hit rate: 61% → 92%.
- Infra cost: −18% per request (fewer DB calls); throughput: +25% with the same node count.
- Delivered 2 weeks ahead of the 6-week deadline; zero SLO violations during peak.
- Feedback: Manager highlighted proactive tracing/observability; PM praised early delivery and low risk; SRE commended safe rollout and alert quality.
- Root Causes: Lack of end-to-end visibility masked the N+1 query; the absence of a local cache caused redundant DB hits; missing pre-warming led to cold-start penalties.
- Lessons Learned: Invest early in observability; combine incremental wins (DB + caching + rollout safety); validate with realistic load.
- What I’d Do Differently: Involve DBAs earlier for index planning; add automated regression checks for cache key versioning in CI.
- Follow-up: Documented a performance playbook; created Grafana dashboards and SLO/error-budget alerts; added a CI load-test smoke (k6) for top endpoints; ran a guild talk so other teams replicated the pattern.
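For reference, here is a minimal sketch of the caching pattern from Action 3 above, assuming a dict-backed stand-in for the remote (Redis) tier and a hypothetical loader in place of the batched promotions query; names such as `TwoTierCache` and `versioned_key` are illustrative, not the production code.

```python
"""Sketch: read-through two-tier cache (local LRU + remote) with
versioned keys and per-key request coalescing to avoid stampedes."""
import threading
from collections import OrderedDict
from typing import Callable, Dict, MutableMapping, Optional

CACHE_VERSION = "v2"  # bump to invalidate all keys after a schema change


def versioned_key(entity: str, entity_id: str) -> str:
    # Versioned keys let a deploy roll keys forward instead of issuing deletes.
    return f"{CACHE_VERSION}:{entity}:{entity_id}"


class LocalLRU:
    """Small in-process LRU so hot keys skip the network hop entirely."""

    def __init__(self, capacity: int = 1024) -> None:
        self._data: "OrderedDict[str, str]" = OrderedDict()
        self._capacity = capacity
        self._lock = threading.Lock()

    def get(self, key: str) -> Optional[str]:
        with self._lock:
            if key not in self._data:
                return None
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]

    def put(self, key: str, value: str) -> None:
        with self._lock:
            self._data[key] = value
            self._data.move_to_end(key)
            if len(self._data) > self._capacity:
                self._data.popitem(last=False)  # evict least recently used


class TwoTierCache:
    """Local LRU -> remote cache -> loader, with per-key coalescing."""

    def __init__(self, remote: MutableMapping[str, str],
                 loader: Callable[[str], str]) -> None:
        self._local = LocalLRU()
        self._remote = remote   # stand-in for a Redis client in production
        self._loader = loader   # e.g. the batched promotions query
        self._inflight: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()

    def get(self, key: str) -> str:
        value = self._local.get(key)
        if value is not None:
            return value
        with self._guard:
            lock = self._inflight.setdefault(key, threading.Lock())
        # Concurrent misses for the same key queue on one lock, so only the
        # first caller pays for the loader (stampede protection).
        with lock:
            value = self._local.get(key)
            if value is None:
                value = self._remote.get(key)
            if value is None:
                value = self._loader(key)
                self._remote[key] = value
            self._local.put(key, value)
            return value


if __name__ == "__main__":
    cache = TwoTierCache(remote={}, loader=lambda k: f"promotions-for-{k}")
    key = versioned_key("promo", "cart-123")
    print(cache.get(key))  # loads once, then serves from the local tier
    print(cache.get(key))
```

In the real service the remote tier was Redis (redis-py) with TTLs rather than a plain dict, but the read path and the coalescing logic follow the same shape.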
---
## Example 2 — Fell Below Expectations: Feature Flag Mishap Caused a Partial Outage
- Situation: I owned a new recommendations widget. We planned a gradual rollout via feature flags. I aimed to roll out to 25% of traffic with zero impact on error rates.
- Task: Implement the feature, design the rollout, and ensure safe fallback.
- Actions:
1. Shipped the widget behind a flag; added client-side telemetry and server logs.
2. Assumed default-off for all services but missed that one edge service interpreted a null flag as true.
3. Began a 10% canary. Within minutes, 5XX errors rose from 0.2% → 2.6% on a specific path; I paused rollout and initiated incident response.
4. Hotfixed the flag evaluation to default to false and added a kill switch at the CDN layer.
5. Ran a quick rollback and purged the stale CDN cache entries that held the incorrect default.
- Stakeholders: On-call SRE (incident commander), PM (customer impact/comm), Frontend team (CDN rules), Support (status page), QA (post-fix validation).
- Results (measured):
- Impact window: 37 minutes; elevated error rate 2.6% at peak; ~11,400 failed requests; MTTR 24 minutes after detection.
- Customer tickets: +38 during window; no data integrity issues.
- Feedback: Manager appreciated rapid containment and clear comms, but flagged insufficient pre-checks and assumptions about defaults across services.
- Root Causes:
- Technical: Inconsistent flag default semantics across services; missing kill-switch; CDN cached stale config.
- Process: Canary plan didn’t include synthetic tests for flag-off semantics; no contract test for default values; alerting threshold for 5XX was too high to detect earlier.
- Lessons Learned: Treat feature flags as critical infrastructure; standardize default semantics; include synthetic traffic and contract tests; place kill switches at multiple layers.
- What I’d Do Differently: Add a preflight checklist (flag defaults, kill switch verified, canary + synthetic tests, CDN cache TTLs); ratchet alerts to detect 3× baseline spikes within 2 minutes; stage an explicit rollback drill.
- Follow-up: Authored a post-incident review with 5-Whys; introduced a shared feature-flag SDK with explicit defaults and typed configs (default semantics and a contract test are sketched below); added contract tests in CI; created a canary runbook and a CDN kill-switch SOP; instrumented dashboards with SLO-based error budget alerts; trained the team with a 30-minute tabletop exercise.
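As a reference, a minimal sketch of the default-off flag semantics and the kind of contract test described in the follow-up; `FlagClient`, its method names, and the flag names are hypothetical stand-ins, not a real SDK.

```python
"""Sketch: feature-flag client with explicit default-off semantics and a
contract test that every consuming service must pass in CI."""
from typing import Mapping, Optional


class FlagClient:
    def __init__(self, raw_flags: Mapping[str, Optional[object]]) -> None:
        self._raw = raw_flags  # e.g. parsed config from the flag service

    def is_enabled(self, name: str) -> bool:
        # Missing, null, or malformed values all evaluate to disabled; only an
        # explicit boolean True turns the flag on.
        return self._raw.get(name) is True


# Contract test (pytest): null and missing flags must never enable a feature.
def test_null_and_missing_flags_default_to_off() -> None:
    client = FlagClient({"recs_widget": None, "search_v2": "yes"})
    assert client.is_enabled("recs_widget") is False   # null stays off
    assert client.is_enabled("search_v2") is False     # non-boolean stays off
    assert client.is_enabled("unknown_flag") is False  # missing stays off
```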
---
## Validation and Guardrails You Can Mention in Interviews
- Pre-deploy: Load tests with production traffic replays; contract tests for cross-service assumptions; feature-flag checklists.
- Deploy: Small-step canaries, automatic rollback on SLO breach (a minimal rollback check is sketched after this list), fast kill switches.
- Post-deploy: Dashboards with P50/P90/P99, error budgets, anomaly detection; blameless reviews with concrete action items and owners.
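A minimal sketch of the "automatic rollback on SLO breach" gate, assuming you can query 5xx rates for the baseline and canary fleets from your metrics backend; the numbers mirror the 3× baseline threshold from Example 2, and the function name is illustrative.

```python
"""Sketch: canary gate that requests rollback when the canary's 5xx rate
exceeds a multiple of the baseline (or an absolute floor)."""


def should_roll_back(canary_5xx_rate: float,
                     baseline_5xx_rate: float,
                     multiplier: float = 3.0,
                     absolute_floor: float = 0.01) -> bool:
    # Guard against a near-zero baseline making the ratio meaningless.
    threshold = max(baseline_5xx_rate * multiplier, absolute_floor)
    return canary_5xx_rate > threshold


if __name__ == "__main__":
    # Values mirror the incident in Example 2: 0.2% baseline, 2.6% canary.
    print(should_roll_back(canary_5xx_rate=0.026, baseline_5xx_rate=0.002))  # True
```

In practice this check would run on a short evaluation window during the canary and trigger the deployment tool's rollback, with the kill switch kept as the manual backstop.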
These two stories demonstrate the Amazon Leadership Principles of Ownership, Bias for Action, Deliver Results, Insist on the Highest Standards, and Earn Trust, reinforced by transparent communication and durable follow-ups.