Explain gap and project contributions
Company: TikTok
Role: Software Engineer
Category: Behavioral & Leadership
Difficulty: medium
Interview Round: Technical Screen
Walk me through what you've been doing over the past few months since leaving your last role. Are you currently employed full-time? If not, why? Describe your most recent project: its scope, your specific responsibilities, key technical decisions, and measurable results. Clarify what you personally delivered versus what the team handled. What did you learn and what would you do differently next time?
Quick Answer: This question tests whether you can explain an employment gap candidly and then give a project deep-dive that shows ownership, technical decision-making, and measurable impact, with a clear line between your own contributions and the team's. It evaluates communication, accountability, and system design skills.
Solution
Below is a structured way to answer, plus a sample response you can adapt.
A. How to structure your answer
1) Timeline opener (20–40 seconds)
- Employed: state team, scope, and the most relevant ongoing project.
- Not employed: one-line reason + what you’ve been doing productively (contracting/freelance, open-source, certifications, interview preparation, travel/family care if applicable). Show momentum and intent.
2) Project deep-dive using STAR + Metrics + I/We framing
- Situation: 1–2 lines on the problem and constraints.
- Task: your explicit ownership and success criteria.
- Actions: 3–5 high-leverage actions/decisions, each with rationale and trade-offs.
- Results: 3–4 measurable outcomes; include baselines and % change when possible.
- Lessons/Next time: 1–2 concrete improvements; show systems thinking.
- I vs. We: use “I” for what you owned; “we” for team outcomes.
3) Quantify whenever possible
- Latency: p95/p99 (e.g., reduced p95 from 450 ms to 180 ms).
- Throughput/QPS: (e.g., sustained 25k RPS with <1% error rate).
- Reliability: SLO/SLA, error budgets, availability (e.g., 99.95%).
- Cost: infra spend, per-request cost, storage/compute savings.
- Engagement/conversion: A/B delta, retention lift, click-through rate.
4) Guardrails and validation
- Mention A/B tests, canaries, on-call metrics, dashboards, incident reviews.
- Note trade-offs: consistency vs. availability, latency vs. cost, batch vs. streaming.
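If you want to double-check the numbers you plan to quote, the latency percentiles and percentage deltas from item 3 can be computed with a few lines of Python. This is a minimal sketch, not a prescribed method: the function names `percentile` and `percent_change` are illustrative, and it assumes the nearest-rank definition of a percentile (monitoring tools may interpolate and report slightly different values).

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # 1-based rank; multiply before dividing to limit float rounding.
    rank = max(math.ceil(pct * len(ordered) / 100), 1)
    return ordered[rank - 1]


def percent_change(baseline, current):
    """Relative change from baseline in percent (negative = reduction)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) * 100 / baseline


if __name__ == "__main__":
    # Hypothetical latency samples in milliseconds, for illustration only.
    latencies_ms = [120, 150, 180, 200, 450]
    print(percentile(latencies_ms, 95))   # 450
    print(percent_change(450, 180))       # -60.0
```

Stating the baseline alongside the delta ("p95 from 450 ms to 180 ms, a 60% reduction") lets the interviewer verify the arithmetic on the spot.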
B. Sample answer (tailor to your own experience)
1) Recent months
- After leaving my last role in May, I took a short break to recharge and then focused full-time on two areas: (a) a contract to refactor a low-latency feed service and (b) upskilling on distributed systems. I completed a hands-on course in performance tuning for gRPC services and contributed to an open-source metrics exporter. I’m now actively interviewing and open to full-time roles.
2) Most recent project: Low-latency feed service refactor
- Situation: The feed service struggled during traffic spikes: p95 latency ~450 ms, occasional 503s at ~12k RPS, and expensive cache misses. We targeted a <200 ms p95 at 20k RPS with a 99.95% availability SLO.
- Task (my role): I owned end-to-end request path optimization and cache strategy: API contracts, read path profiling, cache layer redesign, and rollout safety. Success criteria: p95 <200 ms, error rate <0.5%, and a 20% cost reduction.
- Actions:
1) Profiling: instrumented critical paths with tracing. Found N+1 calls to the user-graph service and inefficient JSON marshalling hot spots.
2) API and batching: designed a batched gRPC endpoint for user-graph lookups; added a protobuf schema for lighter payloads. Trade-off: more complex client-side batching logic in exchange for lower latency.
3) Caching strategy: introduced a two-tier cache (local LRU + Redis) with request coalescing and TTLs tuned by content churn. Added negative caching to suppress repeated 404s.
4) Data access: replaced synchronous fan-out with a bounded parallel fetch and circuit breakers; configured timeouts and fallbacks to cached partial results.
5) Rollout safety: added feature flags, per-endpoint canaries, and SLO-based auto-rollback. Built Grafana dashboards for p95/p99, saturation, and error budgets.
- Results:
- Latency: p95 dropped from 450 ms to 180 ms (60% improvement); p99 from 900 ms to 320 ms.
- Throughput/reliability: sustained 25k RPS with 0.35% error rate; availability improved to 99.96% over 30 days.
- Cost: Redis hit rate increased from 62% to 88%, cutting egress and DB reads; infra cost per 1k requests down ~28%.
- User impact: A/B test showed +3.1% session length and -12% feed timeout errors.
- Ownership clarity:
- I personally: designed the batched gRPC contract and client library; implemented the two-tier cache and request coalescing; added tracing, dashboards, and SLO-driven rollout automation; led the canary and on-call playbook updates.
- Team: platform team provisioned Redis cluster upgrades; another engineer optimized DB indices; PM and data science ran the A/B experiment and analysis.
- Lessons and what I’d do differently:
1) Start with structured load modeling earlier. Our first load test under-modeled burstiness; I’d adopt a heavier-tailed traffic generator from day 1.
2) Error budget policy upfront. We added it mid-project; earlier adoption would have sped up decisions on timeouts vs. retries.
3) Formalize backpressure at the client earlier to prevent queue buildup under partial outages.
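Interviewers often probe the caching action from the sample answer, so it helps to be able to sketch it. The following is one possible shape of a two-tier cache with request coalescing, under stated assumptions: the class name `TwoTierCache`, the `loader` callback, and the plain dict standing in for Redis are all hypothetical, and TTLs and negative caching are omitted for brevity.

```python
import asyncio
from collections import OrderedDict


class TwoTierCache:
    """Local LRU in front of a shared tier; concurrent misses for the
    same key are coalesced into a single origin call."""

    def __init__(self, remote, loader, local_capacity=1024):
        self._local = OrderedDict()   # tier 1: in-process LRU
        self._remote = remote         # tier 2: stand-in for Redis (a dict here)
        self._loader = loader         # async callable that hits the origin
        self._capacity = local_capacity
        self._inflight = {}           # key -> Task for a load in progress

    def _put_local(self, key, value):
        self._local[key] = value
        self._local.move_to_end(key)
        if len(self._local) > self._capacity:
            self._local.popitem(last=False)  # evict least recently used

    async def get(self, key):
        if key in self._local:
            self._local.move_to_end(key)
            return self._local[key]
        if key in self._remote:
            value = self._remote[key]
            self._put_local(key, value)
            return value
        # Coalescing: the first miss starts the load, later misses await it.
        task = self._inflight.get(key)
        if task is None:
            task = asyncio.create_task(self._load(key))
            self._inflight[key] = task
        return await task

    async def _load(self, key):
        try:
            value = await self._loader(key)
            self._remote[key] = value
            self._put_local(key, value)
            return value
        finally:
            # Clear the marker even on failure so later calls can retry.
            self._inflight.pop(key, None)
```

The design choice worth calling out is the in-flight map: without it, a burst of requests for a cold key would each hit the origin, which is the kind of load amplification that hurts most during traffic spikes.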
C. Templates you can reuse
1) Gap/employment template
- Employed: “I’m currently a [role] working on [team/scope]. Most recently I’ve been focused on [project] where I [your ownership] and achieved [metric].”
- Not employed: “I wrapped my last role in [month]. Since then I’ve [1–2 productive activities]. I’ve completed [course/cert/project] and contributed to [open-source/contract]. I’m now focused on finding a full-time role aligned with [your focus].”
2) Project deep-dive template
- Situation: “We needed to [goal] for [users], constrained by [latency/scale/cost/compliance]. Baseline was [metric]. Target was [metric].”
- Task/Ownership: “I owned [areas] with success defined as [KPIs].”
- Actions: “I [action 1 + rationale/trade-off], [action 2], [action 3]…”
- Results: “We achieved [metric deltas], validated via [A/B/canary/benchmark]. Secondary outcomes: [cost/reliability].”
- I vs. We: “I delivered [X, Y, Z]; the team handled [A, B].”
- Lessons/Next time: “Next time, I’d [improvement], because [reason].”
D. Common pitfalls to avoid
- Vague outcomes: avoid “faster” or “better” without baselines and deltas.
- Ownership blur: be explicit about what you owned.
- Tech name-drops without trade-offs: always state why you chose X over Y.
- Missing validation: mention tests, canaries, or A/B to show rigor.
- Over-indexing on failure or blame: focus on learning and system improvements.
E. Quick metric examples if you can’t share absolute numbers
- Latency: “p95 improved ~60%”
- Reliability: “availability from ~99.9% to ~99.95%”
- Cost: “compute spend reduced ~25%”
- Engagement: “session length +3%”
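Availability percentages become more concrete when translated into allowed downtime. The sketch below shows that conversion; the function name `error_budget_minutes` and the 30-day default window are assumptions chosen for illustration, and real SLOs are often defined over requests rather than wall-clock time.

```python
def error_budget_minutes(availability_pct, window_days=30):
    """Allowed downtime in minutes for an availability target over a window."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - availability_pct) / 100


if __name__ == "__main__":
    print(round(error_budget_minutes(99.9), 1))    # 43.2
    print(round(error_budget_minutes(99.95), 1))   # 21.6
```

Framed this way, moving from ~99.9% to ~99.95% roughly halves the monthly downtime allowance, which is an easier claim for an interviewer to weigh than the raw percentages.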
Use the structure above to deliver a concise 2–3 minute narrative, then go deeper if the interviewer probes specific design or implementation details.