Critique a product interface and propose fixes
Company: Capital One
Role: Data Scientist
Category: Behavioral & Leadership
Difficulty: medium
Interview Round: Technical Screen
Pick a digital product you use daily. Identify one specific interface you love and one function you dislike. Using usability heuristics (e.g., Nielsen) and accessibility standards (e.g., WCAG), analyze why each succeeds or fails. Propose a redesign for the disliked function, define success metrics and guardrails, and outline an experiment or telemetry plan to validate the change while avoiding novelty effects.
Quick Answer: This question evaluates a candidate's product thinking and usability analysis: applying heuristics and accessibility standards, proposing a concrete redesign, defining success metrics and guardrails, and planning an experiment and telemetry strategy to validate the change.
Solution
# Example Solution (Product: Gmail on Web)
## 1) Interfaces Chosen
- Loved interface: Smart Compose (inline autocomplete while typing emails).
- Disliked function: Creating filters from the search box (advanced search → Create filter → choose actions).
## 2) Why the loved interface succeeds
- Nielsen heuristics
  - Recognition rather than recall: Suggestions appear inline; users don’t have to recall full phrases.
  - Flexibility and efficiency of use: Power users accept with Tab; novices can ignore.
  - User control and freedom: Easy to dismiss; nothing is committed until accepted.
  - Aesthetic and minimalist design: Subtle, unobtrusive gray suggestions.
  - Visibility of system status: Immediate feedback as you type.
- Accessibility (WCAG)
  - 2.1.1 Keyboard: Accept/dismiss via keyboard (Tab/Esc).
  - 4.1.3 Status Messages: Suggestion appearance should be announced non-disruptively to screen readers (aria-live="polite").
  - 1.4.1 Use of Color: The suggestion shouldn’t rely on color alone; the style change is distinct.
  - Potential gap: 1.4.3 Contrast (Minimum) may fail if the suggestion gray has too little contrast for some users; ensure focus/accept affordances are perceivable.
## 3) Why the disliked function fails
- Problem summary: The path to create a filter is hidden behind a small icon; advanced syntax requires recall; high risk of over-broad rules; limited preview.
- Nielsen heuristics
  - Recognition rather than recall (violated): Users must remember operators (from:, has:attachment, older_than:).
  - Visibility of system status (weak): Little live preview or impact estimate (e.g., how many emails will match?).
  - Error prevention (weak): Easy to create broad filters that auto-archive too much.
  - Match between system and the real world (weak): Terminology and actions are system-centric ("skip inbox") rather than plain-language outcomes.
  - Help and documentation: Hints exist but are not integrated contextually.
- Accessibility (WCAG)
  - 2.5.8 Target Size (Minimum, WCAG 2.2): Small click targets (the search-options icon) can be hard to hit.
  - 2.1.1 Keyboard: All steps must be fully operable via keyboard; focus order can be non-obvious.
  - 2.4.7 Focus Visible: Focus rings may be subtle in dense forms.
  - 3.3.1 Error Identification / 3.3.3 Error Suggestion: Poor feedback for invalid or over-broad queries.
  - 1.3.1 Info and Relationships / 4.1.2 Name, Role, Value: Form fields and toggle actions need proper labels and roles.
## 4) Redesign proposal for filter creation
- Design goals: Make the task discoverable, previewable, safe-by-default, and fully accessible.
- Core changes
  1) Tokenized query builder in the search bar
     - As you type "from: ali", show selectable chips: From, To, Subject, Has attachment, List-id, etc.
     - Replace free-form recall with guided, labeled tokens (autocomplete + examples).
  2) Live preview panel
     - Show the match count and the top 3 example emails with matched fields highlighted.
     - Impact banner: "This rule would affect ~2% of your new mail (about 15/day)."
  3) Safe action wizard
     - Step 1: Define conditions. Step 2: Choose actions in plain language ("Apply label Receipts", "Skip the inbox"), each with an explanation.
     - Default to Test Mode for 7 days (labels only, no destructive actions), with a clear toggle to activate full actions later.
  4) Risk guardrails (a sketch of the warning logic follows this list)
     - Warnings when conditions are broad: "Over 1,000 emails match. Consider adding Subject or Sender."
     - Confirmation with a summary and an undo link.
  5) Accessibility baked in
     - Keyboard-first flow; logical tab order; visible focus states.
     - Proper labels and instructions (3.3.2), clear error text (3.3.1/3.3.3), status updates via aria-live (4.1.3).
     - Sufficient contrast (1.4.3) and adequate target sizes (2.5.8 where applicable).
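To make the guardrail idea concrete, here is a minimal sketch of the broad-rule warning logic. Everything in it is hypothetical (the function name, thresholds, and condition format); it illustrates the safe-by-default behavior, not an existing Gmail API.

```python
def assess_filter_risk(match_count: int, daily_matches: float, daily_volume: float,
                       conditions: list[dict], broad_threshold: int = 1000) -> dict:
    """Hypothetical guardrail check for the filter wizard.

    match_count: historical emails the proposed rule matches.
    daily_matches / daily_volume: estimated matched vs. total new mail per day.
    conditions: e.g., [{"field": "from", "value": "ali@example.com"}].
    """
    warnings = []
    if match_count > broad_threshold:
        warnings.append(f"Over {broad_threshold:,} emails match. "
                        "Consider adding Subject or Sender.")
    # A single catch-all operator (e.g., has:attachment alone) is a common over-broad pattern.
    if len(conditions) == 1 and conditions[0]["field"] in {"has", "older_than"}:
        warnings.append("A single broad condition may catch unrelated mail.")
    impact_share = daily_matches / daily_volume if daily_volume else 0.0
    return {
        "warnings": warnings,
        "impact_banner": (f"This rule would affect ~{impact_share:.0%} of your new mail "
                          f"(about {daily_matches:.0f}/day)."),
        "default_mode": "test" if warnings else "active",  # safe-by-default
    }
```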
## 5) Success metrics and guardrails
- Primary metrics
  - Filter Creation Success Rate: Completed filters / filter flows started.
  - Time to Create: Median seconds from filter-flow start to confirmation.
  - Misfilter Rate: (Messages automatically acted on that are undone or moved back to the Inbox within 24h) / (Messages auto-acted on by filters).
- Secondary metrics
  - Adoption: % of active users creating their first filter within 30 days.
  - Precision proxy: % of filter edits within 72h that tighten scope (adding conditions).
  - Preview utilization: % of creators using preview or suggestions before creating.
- Guardrails
  - Misfilter Rate must not increase by more than 0.5 per 1,000 auto-acted messages.
  - No regression in page performance (e.g., p95 search-to-first-paint may grow by at most 50ms).
  - Accessibility: Automated checks (e.g., axe/pa11y) pass; manual keyboard/screen-reader smoke tests pass against WCAG 2.1 AA.
  - User support: No significant increase in filter-related help tickets per MAU.
- Small numeric examples (see the metric-computation sketch below)
  - Baseline success 30%; target +5pp to 35%, and median time-to-create from 45s to 30s.
  - Baseline misfilter 2/1,000; guardrail cap at 2.5/1,000.
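As a small illustration, the sketch below computes the primary metrics and checks the misfilter guardrail from the telemetry defined in section 6. The record fields (`start`, `end`, `undone_within_24h`) are hypothetical stand-ins for the logged events.

```python
from statistics import median

def filter_metrics(flows: list[dict], auto_actions: list[dict]) -> dict:
    """flows: one record per filter flow, with datetime 'start' and 'end'
    ('end' is None if abandoned). auto_actions: one record per
    auto_action_applied, with 'undone_within_24h' True if the message was
    undone or moved back to the Inbox within 24h."""
    completed = [f for f in flows if f["end"] is not None]
    success_rate = len(completed) / len(flows) if flows else 0.0
    median_seconds = (median((f["end"] - f["start"]).total_seconds()
                             for f in completed) if completed else None)
    undone = sum(a["undone_within_24h"] for a in auto_actions)
    misfilter_per_1k = 1000 * undone / len(auto_actions) if auto_actions else 0.0
    return {
        "success_rate": success_rate,               # target: 30% -> 35%
        "median_time_to_create_s": median_seconds,  # target: 45s -> 30s
        "misfilter_per_1k": misfilter_per_1k,       # guardrail cap: 2.5
        "guardrail_ok": misfilter_per_1k <= 2.5,
    }
```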
## 6) Experiment and telemetry plan
- Experiment design
  - Randomization unit: User-level A/B.
  - Segmentation/stratification: Prior filter users vs. first-timers; low/medium/high email volume; desktop vs. mobile web.
  - Ramp plan: 5% → 20% → 50% → 100% with automated guardrail checks at each step; maintain a 10% long-term holdout for several weeks.
  - Duration: 4–6 weeks, long enough for Test Mode filters to act on new mail and for novelty effects to fade.
- Novelty-effect mitigation
  - Learning period: Exclude the first 2–3 days from the primary analysis, or report stabilized metrics separately.
  - Repeat-exposure metrics: Evaluate outcomes after users have created at least one filter and had 7 days of filter activity.
  - Difference-in-differences: Compare pre/post per-user changes in both arms to control for seasonality.
  - CUPED/covariate adjustment: Use prior search usage and email volume to reduce variance (a minimal sketch follows this list).
- Telemetry/events to log
  - search_opened, filter_builder_opened, token_added/removed, suggestion_clicked, preview_viewed, risk_warning_shown, test_mode_enabled, filter_created, filter_edited, filter_disabled.
  - auto_action_applied (label/archive), undo_invoked, message_moved_to_inbox_after_auto_action.
  - Timers: start/end timestamps for time-on-task; counts of query reformulations.
- Power check (rough)
  - With baseline success = 30% and target = 35% (Δ = 5pp), a two-sided two-proportion test at α = 0.05 and power = 0.8 needs roughly 1,400 filter-start sessions per arm (~2,800 total); the calculation is sketched after this list.
- Analysis
  - Primary: Intent-to-treat estimates; report medians for time-to-create.
  - Sensitivity: Per-segment results; check for Simpson’s paradox across volume cohorts.
  - Quality: Inspect misfilter patterns via aggregate signals only (undo/move-back events, no content inspection).
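The power check can be reproduced with statsmodels; this sketch assumes a two-sided two-proportion z-test with equal-size arms.

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Cohen's h for lifting filter-creation success from 30% to 35%.
h = proportion_effectsize(0.35, 0.30)
n_per_arm = NormalIndPower().solve_power(
    effect_size=h, alpha=0.05, power=0.8, ratio=1.0, alternative="two-sided"
)
print(round(n_per_arm))  # ~1,376 filter-start sessions per arm (~2,800 total)
```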
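The CUPED adjustment mentioned under novelty-effect mitigation is equally compact; this is a minimal sketch assuming a single pre-experiment covariate such as prior email volume (variable names are illustrative).

```python
import numpy as np

def cuped_adjust(y: np.ndarray, x: np.ndarray) -> np.ndarray:
    """CUPED: remove the part of the outcome y explained by the pre-experiment
    covariate x, which reduces variance without biasing the treatment effect.
    theta is the OLS slope of y on x."""
    theta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    return y - theta * (x - x.mean())
```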
## 7) Risks and mitigations
- Risk: Users create over-broad rules quickly.
  - Mitigation: Default Test Mode + prominent impact preview + warnings.
- Risk: Added UI increases cognitive load.
  - Mitigation: Progressive disclosure; sensible defaults; concise copy; keyboard shortcuts.
- Risk: Accessibility regressions.
  - Mitigation: WCAG 2.1 AA checklist in CI; manual screen-reader/keyboard QA on key flows.
This approach ties product sense (heuristics, accessibility) to a measurable redesign with a robust experiment plan that accounts for novelty and protects user experience with clear guardrails.