Design an ML keyword recommendation pipeline

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's ability to design an end-to-end, production-scale ML keyword recommendation pipeline. It covers retrieval and ranking architectures, taxonomy-grounded typed suggestions, data sourcing and labeling, and operational concerns such as latency, scalability, privacy, and policy compliance.

  • hard
  • Apple
  • ML System Design
  • Machine Learning Engineer

Company: Apple

Role: Machine Learning Engineer

Category: ML System Design

Difficulty: hard

Interview Round: Technical Screen

Design an ML pipeline that generates search keyword recommendations for an app marketplace. Given a query like "games," produce diverse, typed suggestions (e.g., genres such as puzzle, RPG, racing) with high relevance and coverage. Specify objectives and constraints (relevance, diversity, freshness, latency, privacy). Detail data sources (query/search logs, clicks, installs, uninstalls, app metadata and taxonomy, reviews, co-search/co-click graphs, embeddings, locale signals) and labeling/feedback strategies. Propose the system architecture: candidate generation and ranking stages, feature store, offline training, online serving, cache, and retrieval. Describe features (text/semantic embeddings, popularity/recency, user/context signals, co-occurrence, graph features, quality/spam signals). Compare model options (BM25/ANN retrieval, two-tower retrieval, gradient-boosted trees, pairwise/listwise rankers, sequence models, graph models) and justify choices. Define evaluation metrics and experimentation (CTR, install rate, coverage, diversity, precision/recall, latency/errors; A/B testing and guardrails). Explain online/continual training after launch (streaming feedback ingestion, feature freshness, update cadence, warm-starting, drift detection, rollback). Discuss handling cold start, multilingual/locale variants, spam/abuse, and fairness.

Jul 26, 2025, 12:00 AM

ML System Design: Typed Search Keyword Recommendations for an App Marketplace

Goal

Design an end-to-end ML pipeline that, given a user query (e.g., "games"), generates diverse, typed keyword suggestions (e.g., "puzzle games", "RPG games", "racing games") with high relevance and coverage.

Assume you are designing for a large-scale app marketplace with millions of users and tens of thousands of queries per second during peak. Typed suggestions are grounded in a controlled taxonomy (e.g., Genre, Feature, Price, Age, Mode) and must be compliant with marketplace policies.
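As a rough illustration of what "typed" and "taxonomy-grounded" could mean in practice, the sketch below keeps only candidates whose (facet, value) pair exists in a controlled taxonomy, then picks a top-k list with a per-facet cap so that one facet cannot crowd out the others. The taxonomy contents, the `TypedSuggestion` fields, the `ground` and `select_diverse` helpers, and the cap of two per facet are all assumptions made for this example; the scores stand in for whatever an upstream ranker would produce.

```python
from dataclasses import dataclass

# Hypothetical taxonomy: facet type -> allowed values (illustrative only).
TAXONOMY = {
    "Genre": {"puzzle", "RPG", "racing"},
    "Price": {"free"},
    "Mode": {"offline", "multiplayer"},
}


@dataclass(frozen=True)
class TypedSuggestion:
    text: str     # surface form shown to the user, e.g. "puzzle games"
    facet: str    # taxonomy type, e.g. "Genre"
    value: str    # taxonomy value, e.g. "puzzle"
    score: float  # relevance score from an upstream ranker (assumed given)


def ground(query, scored_values, blocked=frozenset()):
    """Keep only (facet, value) pairs that are in the taxonomy and not policy-blocked."""
    out = []
    for facet, value, score in scored_values:
        if value in TAXONOMY.get(facet, ()) and value not in blocked:
            out.append(TypedSuggestion(f"{value} {query}", facet, value, score))
    return out


def select_diverse(cands, k, max_per_facet=2):
    """Greedy top-k by score, skipping a candidate once its facet hits the cap."""
    picked, per_facet = [], {}
    for c in sorted(cands, key=lambda c: c.score, reverse=True):
        if per_facet.get(c.facet, 0) < max_per_facet:
            picked.append(c)
            per_facet[c.facet] = per_facet.get(c.facet, 0) + 1
        if len(picked) == k:
            break
    return picked


raw = [
    ("Genre", "casino", 0.95),  # not in the taxonomy, so it is dropped
    ("Genre", "puzzle", 0.90),
    ("Genre", "RPG", 0.85),
    ("Genre", "racing", 0.80),  # skipped: Genre already has two picks
    ("Mode", "offline", 0.60),
    ("Price", "free", 0.50),
]
top = select_diverse(ground("games", raw), k=4)
print([s.text for s in top])
# ['puzzle games', 'RPG games', 'offline games', 'free games']
```

A hard per-facet cap is only one way to trade relevance for diversity; a production system might prefer a soft penalty or a learned re-ranker, which this sketch does not attempt.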

Requirements

  1. Objectives and constraints
     • Relevance, diversity, coverage
     • Freshness/trends, multilingual/locale correctness
     • Latency and availability SLOs
     • Privacy and policy compliance
  2. Data sources and labeling/feedback
     • Query/search logs, clicks, installs, uninstalls
     • App metadata and taxonomy
     • Reviews text, co-search/co-click graphs
     • Embeddings, locale signals
     • Labeling: implicit feedback (CTR/installs), counterfactual debiasing, editorial seeds
  3. System architecture
     • Candidate generation: lexical, semantic (ANN), taxonomy, graph/co-occurrence, trending
     • Ranking: multi-stage (LTR + neural), diversity-aware re-rank
     • Feature store (offline/online), offline training, online serving, cache, retrieval indices
  4. Features
     • Text/semantic embeddings, lexical features
     • Popularity/recency/trending signals
     • User/context signals (locale, device)
     • Co-occurrence/graph features (PMI, P(s|q))
     • Quality/spam trust signals
  5. Models and choices
     • Retrieval: BM25, two-tower ANN, graph-based expansion
     • Ranking: GBDT, pairwise/listwise LTR, cross-encoder re-ranker, optional sequence/graph models
  6. Evaluation and experimentation
     • Metrics: CTR, install rate, NDCG, recall@K, coverage/diversity, latency/errors, calibration
     • A/B testing with guardrails and statistical rigor
  7. Continual training/ops
     • Streaming feedback ingestion, feature freshness, update cadence
     • Warm-starting, drift detection, rollback
  8. Special cases
     • Cold start, multilingual/locale variants, spam/abuse, fairness and policy
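
The co-occurrence features named in the list (PMI and P(s|q)) can be made concrete with a small sketch. It assumes the input is a list of (query, suggestion) events, for example a query followed by a refinement in the same session; that event definition and the `cooccurrence_features` helper are assumptions for illustration, not part of the question. Here P(s|q) is count(q, s) / count(q), and PMI(q, s) is log(P(q, s) / (P(q) P(s))), both estimated from raw counts without smoothing.

```python
import math
from collections import Counter


def cooccurrence_features(pairs):
    """Compute P(s|q) and PMI(q, s) from (query, suggestion) events.

    Unsmoothed maximum-likelihood estimates; the event definition is an
    assumption of this sketch.
    """
    pairs = list(pairs)
    n = len(pairs)
    pair_counts = Counter(pairs)
    q_counts = Counter(q for q, _ in pairs)
    s_counts = Counter(s for _, s in pairs)

    feats = {}
    for (q, s), c in pair_counts.items():
        feats[(q, s)] = {
            # P(s|q) = count(q, s) / count(q)
            "p_s_given_q": c / q_counts[q],
            # PMI = log( (c/n) / ((count(q)/n) * (count(s)/n)) )
            "pmi": math.log((c * n) / (q_counts[q] * s_counts[s])),
        }
    return feats


events = [
    ("games", "puzzle games"),
    ("games", "puzzle games"),
    ("games", "racing games"),
    ("music", "radio"),
]
feats = cooccurrence_features(events)
print(feats[("games", "puzzle games")]["p_s_given_q"])  # 2 of 3 "games" events
```

Unsmoothed PMI tends to overweight rare pairs (the single "music" event above gets the highest PMI), so a real pipeline would likely add a minimum-count threshold or smoothing before using it as a ranking feature.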
