PracHub

Describe services you built and lessons learned

Last updated: Mar 29, 2026

Quick Overview

This question evaluates end-to-end service ownership, system architecture and scalability reasoning, incident response and reliability practices, measurement of technical and business impact, and cross-functional leadership in the Behavioral & Leadership category.

Company: eBay

Role: Software Engineer

Category: Behavioral & Leadership

Difficulty: medium

Interview Round: Onsite

Walk me through a service you owned end-to-end: the business goal, your role, the architecture, and key decisions. Describe the hardest trade-off you made, a significant incident you handled (what happened, your actions, outcome), how you measured impact, and what you would do differently next time.

Solution

# How to structure a strong answer (5–7 minutes)

Use a concise, technical narrative that ties architecture and operations back to business impact.

- Situation and Goal: Why this service? What KPI or SLO were you targeting?
- Role and Team: Your scope, decisions you owned, collaborators.
- Architecture: Request flow, storage, caches, async jobs, SLOs/SLIs, scaling, observability.
- Decisions and Trade-offs: Options, data/experiments, chosen path.
- Incident: Timeline, diagnosis, mitigation, resolution, and prevention.
- Impact: Before/after metrics; experiment or counterfactual; ROI.
- Retrospective: What you’d change next time and why.

Below is a model example you can adapt. Replace with your own service as needed.

## Model example: Real-time Search Autosuggest Service

1) Business goal

- Objective: Increase search engagement and reduce search latency for typed queries.
- Baseline: p95 latency ~48 ms, CTR on suggestions 12.5%.
- Targets: p95 < 35 ms (99% < 50 ms SLO) and +1.0 pp absolute CTR lift without increasing 5xx error rate above 0.1%.

2) My role and scope

- Role: Tech lead and primary backend owner. Team of 4 engineers; partnered with PM, data scientist, and SRE.
- Ownership: Architecture, data modeling, performance tuning, rollout strategy, incident response, and postmortems.

3) Architecture overview

- Request path: Client → API Gateway → Autosuggest service (gRPC) → In-memory prefix index (FST/trie) with L1 in-process cache and L2 Redis cache.
- Data/compute:
  - Offline signals via Kafka (query logs, click signals) processed in Flink to build suggestion candidates and scores.
  - Micro-batch updates every 1 minute; index segments stored in object storage; service hot-reloads new segments.
- Storage: Read-only compressed index in memory (mmap), metadata in MySQL, L2 cache in Redis.
- Scaling: Horizontal pods with consistent hashing by locale; HPA on CPU/QPS; PodDisruptionBudget and maxUnavailable=1 to avoid mass restarts.
- Observability: Prometheus/Grafana dashboards, OpenTelemetry traces, synthetic probes, SLOs (99% < 50 ms, 99.9% < 150 ms), error budget burn alerts.

4) Key decisions and rationale

- Data structure: FST/trie vs B-tree. Chose FST for compact prefix compression, reducing memory ~35% and improving cache locality.
- Protocol: gRPC vs REST. Chose gRPC; reduced serialization overhead (~3–5 ms per call at p95) and standardized IDL.
- Freshness approach: Real-time write-through vs 1-minute micro-batch. Chose micro-batch to control tail latency and cost while maintaining acceptable freshness.

5) Hardest trade-off: Freshness vs latency/cost

- Option A (real-time updates):
  - Pros: Near-instant reflection of new queries.
  - Cons (measured in canary): +23 ms p95 latency; +35% CPU; +$40k/month compute; negligible CTR lift (+0.1 pp).
- Option B (1-minute micro-batch):
  - Pros: Stable p95 latency; good cost profile; simple failure isolation.
  - Cons: Up to 60 seconds of staleness.
- Decision: Option B. Business analysis showed the CTR lift from real-time wasn’t material; tail-latency and cost risk outweighed benefits.

6) Significant incident handled

- What happened: During a routine morning index segment roll, several pods restarted simultaneously due to a misconfigured memory limit. This caused an L1 cache cold start and a thundering herd to L2 Redis, saturating network I/O. p99 latency spiked to 1.5 s; 5xx peaked at 2.1% for 11 minutes.
- Actions (timeline highlights):
  - 0–5 min: Froze rollouts, enacted feature flag to allow stale-while-revalidate (serve stale suggestions while warming), elevated L2 TTL.
  - 5–12 min: Raised autoscaling limits, enabled per-key request coalescing and concurrency limits, diverted 10% traffic to a healthy cell.
  - 12–17 min: Restored SLOs; applied hotfix to memory limits; staged staggered pod restarts with maxUnavailable=1 and pre-warm.
- Outcome: Full recovery in 17 minutes; no data loss; user-visible errors curtailed quickly.
- Prevention: Added pod warm-up hooks, circuit breakers to Redis, admission control, chaos drills for cache warm-up, and a runbook with golden signals and one-click mitigation.

7) Measuring impact

- Technical before/after (regional average over 2 weeks):
  - p50 latency: 22 → 14 ms
  - p95 latency: 48 → 32 ms
  - 5xx rate: 0.22% → 0.05%
- Business A/B test (2-week, holdout; guardrails: p95 < 50 ms, 5xx < 0.1%, cancel if breached):
  - CTR: 12.5% → 13.8% (absolute +1.3 pp; p < 0.01)
  - Downstream search-to-order conversion: +0.2 pp (p < 0.05)
- Estimating value (example):
  - additional_clicks = sessions × ΔCTR
  - additional_orders = additional_clicks × conversion_rate_to_order
  - incremental_profit = additional_orders × avg_contribution_margin
  - Example: 5,000,000 sessions/day × 0.013 = 65,000 additional clicks; 4% convert to orders → 2,600 orders; $8 margin → ~$20.8k/day incremental margin.
- Validations: CUPED-adjusted analysis to reduce variance; monitored cannibalization of organic clicks; no negative impact on search latency or availability.

8) What I’d do differently

- Bake in guardrails early: auto-rollback on error budget burn; red/black deploys with canary analysis (e.g., automated metrics compare) and config typed-schemas.
- Better cache warm-up: pre-warm index segments and traffic shadowing before taking pods live.
- SLO governance: per-locale SLOs and tighter p99 objectives; run chaos experiments on cache/Redis failure modes quarterly.
- Experiment discipline: pre-registered hypotheses, MDE-based sample sizing, and sequential monitoring to shorten test durations safely.

## Pitfalls to avoid

- Over-indexing on micro-benchmarks: validate with production-like load and tail latency.
- Ignoring cost of tail: p99 and p99.9 often drive user perception and incident risk.
- Insufficient blast-radius control: staggered deploys, cell-based routing, and feature flags are essential.

## Plug-and-play template

- Business goal: [KPI], baseline [value], target [value]. Why it mattered.
- Role/scope: Team size, cross-functional partners, decisions you owned.
- Architecture: Client → [Gateway] → [Service] → [Caches/Storage]; async pipelines; SLOs; scaling; observability.
- Key decisions: Option A vs B; criteria (latency, cost, complexity, risk, team fit); chosen option + rationale.
- Hardest trade-off: Options, quantified impact, decision.
- Incident: What, when, root cause, actions, outcome, prevention.
- Impact: Technical metrics and business KPIs; experiment design and guardrails; ROI estimate.
- Retrospective: What you’d change and why.

Use this structure with your own service details to deliver a crisp, leadership-focused narrative that ties engineering rigor to business outcomes.
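
The value estimate in the model example's "Measuring impact" step is simple enough to check with a few lines of code, which can help when you prepare your own numbers. The sketch below is only an illustration: the function name `estimate_daily_margin` and its parameter names are made up for this example, and the inputs are the illustrative figures from the model answer, not real data.

```python
def estimate_daily_margin(sessions_per_day, ctr_lift_abs,
                          click_to_order_rate, margin_per_order):
    """Chain the three formulas from the model example.

    All parameter names are hypothetical; ctr_lift_abs is the absolute
    CTR lift expressed as a fraction (1.3 pp -> 0.013).
    """
    # additional_clicks = sessions x delta-CTR
    additional_clicks = sessions_per_day * ctr_lift_abs
    # additional_orders = additional_clicks x conversion_rate_to_order
    additional_orders = additional_clicks * click_to_order_rate
    # incremental_profit = additional_orders x avg_contribution_margin
    incremental_margin = additional_orders * margin_per_order
    return additional_clicks, additional_orders, incremental_margin


# Inputs taken from the worked example in the model answer.
clicks, orders, margin = estimate_daily_margin(
    sessions_per_day=5_000_000,
    ctr_lift_abs=0.013,
    click_to_order_rate=0.04,
    margin_per_order=8.0,
)
print(f"additional clicks/day: {clicks:,.0f}")
print(f"additional orders/day: {orders:,.0f}")
print(f"incremental margin/day: ${margin:,.0f}")
```

Keeping each step as a named intermediate makes it easy to state your assumptions out loud in the interview (for example, that every additional suggestion click converts at the same rate as existing clicks).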

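
The incident in the model example mentions per-key request coalescing as a mitigation for the thundering herd on the L2 cache. A minimal sketch of that idea follows, assuming a threaded Python service. The class name `SingleFlight`, the `run_demo` helper, and the fake backend are all hypothetical and exist only to illustrate the technique; they are not taken from the service described above.

```python
import threading
import time
from concurrent.futures import Future


class SingleFlight:
    """Coalesce concurrent loads of the same key into one backend call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> Future shared by all current waiters

    def do(self, key, loader):
        with self._lock:
            fut = self._in_flight.get(key)
            leader = fut is None
            if leader:
                # First caller for this key becomes the leader.
                fut = Future()
                self._in_flight[key] = fut
        if leader:
            try:
                fut.set_result(loader(key))
            except BaseException as exc:
                # Propagate the failure to every waiter as well.
                fut.set_exception(exc)
            finally:
                with self._lock:
                    self._in_flight.pop(key, None)
        # Followers block here until the leader finishes.
        return fut.result()


def run_demo(n_callers=8):
    """Fire n_callers concurrent requests for one prefix; count backend hits."""
    sf = SingleFlight()
    calls = 0
    calls_lock = threading.Lock()
    barrier = threading.Barrier(n_callers)
    results = [None] * n_callers

    def slow_backend(prefix):
        # Stand-in for a cold-cache lookup (hypothetical).
        nonlocal calls
        with calls_lock:
            calls += 1
        time.sleep(0.2)
        return f"suggestions for {prefix!r}"

    def caller(i):
        barrier.wait()  # release all callers at roughly the same time
        results[i] = sf.do("iph", slow_backend)

    threads = [threading.Thread(target=caller, args=(i,)) for i in range(n_callers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return calls, results


calls, results = run_demo()
print(f"backend calls: {calls}, callers served: {len(results)}")
```

Because the callers arrive while the first lookup is still in flight, they are expected to share a single backend call instead of issuing one each. A production version would typically add timeouts and combine this with the stale-while-revalidate and concurrency limits mentioned in the timeline.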
eBay · Software Engineer · Onsite · Behavioral & Leadership · Sep 6, 2025

Behavioral: End-to-End Service Ownership (Onsite)

You are interviewing for a Software Engineer role. Describe one production service you owned end-to-end. Cover the following:

  1. Business goal and problem context
    • What business KPI or user problem motivated the service?
    • Baseline metrics and target (e.g., latency SLOs, conversion/engagement goals).
  2. Your role and scope
    • Team size, your responsibilities, decision ownership, cross-functional partners.
  3. Architecture overview
    • High-level data flow and dependencies (clients, API, compute, storage, streaming, caches).
    • SLOs/SLIs (latency, availability), scaling approach, observability.
  4. Key technical decisions and rationale
    • Alternatives considered, criteria, and why you chose your approach.
  5. Hardest trade-off
    • Options, technical/business criteria, risks, and the decision you made.
  6. Significant incident you handled
    • What happened, timeline, your actions, outcome, and postmortem improvements.
  7. How you measured impact
    • Technical and business metrics, experiment design (A/B, guardrails), and results.
  8. What you would do differently
    • Process, architecture, or org improvements you’d apply with hindsight.

Notes:

  • Keep proprietary details anonymized.
  • Aim for a 5–7 minute, structured walkthrough.
