PracHub

Find linked user records by weighted similarity

Last updated: Mar 29, 2026

Quick Overview

This question tests similarity-based record linkage in the Coding & Algorithms domain: scoring pairs of records with weighted field similarities, applying a threshold to decide which pairs are linked, and treating linked pairs as graph edges to find both direct and indirect matches. It also examines whether the matching strategy scales beyond comparing every pair of records.



Company: TikTok

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen



Jan 6, 2026, 12:00 AM

You are given a list of user records. Each record has fields:

  • id (unique)
  • name
  • email
  • company

You are also given:

  • weights: a map from field name to weight (e.g., name: 0.2, email: 0.5, company: 0.3)
  • threshold: a float
  • target_user_id

Similarity scoring

Define similarity(recordA, recordB) as the sum over fields of:

  • weights[field] * field_similarity(field_valueA, field_valueB)

where field_similarity returns a value in [0, 1]. The exact function is provided or agreed on in the interview: for example, exact match (1 if the two values are equal, otherwise 0) or a graded string similarity.

Two records are considered linked if their total similarity score is >= threshold.
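
The scoring rule above can be sketched in Python as follows. This is a minimal sketch, not a reference solution: the Record type, the function names, and the exact_match comparison (case-insensitive, with missing or empty values never matching) are assumptions made for illustration, since the problem leaves field_similarity open.

```python
from typing import Callable, Dict

Record = Dict[str, str]
FieldSimilarity = Callable[[str, str], float]

def exact_match(a: str, b: str) -> float:
    """Assumed field_similarity: 1.0 on a case-insensitive exact match, else 0.0."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        return 0.0  # assumption: a missing or empty value never counts as a match
    return 1.0 if a == b else 0.0

def similarity(rec_a: Record, rec_b: Record, weights: Dict[str, float],
               field_similarity: FieldSimilarity = exact_match) -> float:
    """Sum of weights[field] * field_similarity(valueA, valueB) over weighted fields."""
    return sum(w * field_similarity(rec_a.get(f, ""), rec_b.get(f, ""))
               for f, w in weights.items())

def is_linked(rec_a: Record, rec_b: Record, weights: Dict[str, float],
              threshold: float,
              field_similarity: FieldSimilarity = exact_match) -> bool:
    """Two records are linked when their total similarity is >= threshold."""
    return similarity(rec_a, rec_b, weights, field_similarity) >= threshold
```

With the example weights (name: 0.2, email: 0.5, company: 0.3), two records that agree on name and company but not on email score 0.2 + 0.3 = 0.5, so they are linked at a threshold of 0.5 and not at 0.6. Because the scores are floating-point sums, it may be worth asking whether a small tolerance should apply when a score lands exactly on the threshold.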

Task

Return all record IDs that should be considered the same user as target_user_id.
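
For the base task, one straightforward reading is a single pass that compares the target against every other record. The sketch below assumes that reading; direct_links, the linked predicate, and the sample records are hypothetical names and data, and the target's own ID is excluded from the result (a choice the notes suggest confirming with the interviewer).

```python
from typing import Callable, Dict, List

Record = Dict[str, str]

def direct_links(records: List[Record], target_id: str,
                 linked: Callable[[Record, Record], bool]) -> List[str]:
    """IDs of records directly linked to the target (target itself excluded)."""
    by_id = {r["id"]: r for r in records}
    target = by_id[target_id]  # raises KeyError for an unknown target
    return [r["id"] for r in records
            if r["id"] != target_id and linked(target, r)]

# Hypothetical weights, predicate, and data, for illustration only.
WEIGHTS = {"name": 0.2, "email": 0.5, "company": 0.3}

def linked(a: Record, b: Record, threshold: float = 0.5) -> bool:
    # Exact-match field similarity: a field adds its weight only when equal.
    return sum(w for f, w in WEIGHTS.items() if a[f] == b[f]) >= threshold

records = [
    {"id": "u1", "name": "Ann Lee", "email": "ann@x.com", "company": "Acme"},
    {"id": "u2", "name": "Ann Lee", "email": "ann@x.com", "company": "Beta"},
    {"id": "u3", "name": "A. Lee", "email": "alee@y.com", "company": "Beta"},
]
print(direct_links(records, "u1", linked))  # ['u2']
```

This costs one similarity computation per record, so it is linear in the number of records for a single target.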

Follow-up 1: include 1-hop indirect links

Include not only records directly linked to the target but also records linked to any of those direct matches, that is, every record at most two links away from the target, even if it is not directly linked to the target itself.
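
One way to handle this follow-up is a breadth-first search that stops expanding once a record is two links from the target. The sketch below is an assumption about how that could look: within_hops and max_hops are hypothetical names, and linked stands in for the thresholded similarity check.

```python
from collections import deque
from typing import Callable, Dict, List

Record = Dict[str, str]

def within_hops(records: List[Record], target_id: str,
                linked: Callable[[Record, Record], bool],
                max_hops: int = 2) -> List[str]:
    """IDs reachable from the target in at most max_hops links (target excluded)."""
    by_id = {r["id"]: r for r in records}
    if target_id not in by_id:
        raise KeyError(target_id)
    dist = {target_id: 0}          # record ID -> number of links from the target
    queue = deque([target_id])
    while queue:
        cur = queue.popleft()
        if dist[cur] == max_hops:  # do not expand past the hop limit
            continue
        for r in records:
            rid = r["id"]
            if rid not in dist and linked(by_id[cur], r):
                dist[rid] = dist[cur] + 1
                queue.append(rid)
    return [rid for rid in dist if rid != target_id]
```

Setting max_hops to 1 reduces this to the direct-link task, and similarity is only computed from records that were actually reached, not for every pair.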

Follow-up 2: include all indirect links (connected component)

Return all record IDs in the entire connected component containing target_user_id, where edges connect pairs of records whose similarity is >= threshold.
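
For the full connected component, a graph traversal with no hop limit is one natural fit when only a single target is queried. The sketch below assumes that approach; connected_component and linked are hypothetical names, and this version includes the target ID in the result. It keeps a set of unvisited IDs so that each expansion only compares against records not yet placed in the component.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]

def connected_component(records: List[Record], target_id: str,
                        linked: Callable[[Record, Record], bool]) -> List[str]:
    """All IDs in the target's connected component, target included (order not fixed)."""
    by_id = {r["id"]: r for r in records}
    if target_id not in by_id:
        raise KeyError(target_id)
    unvisited = set(by_id) - {target_id}
    component = [target_id]
    stack = [target_id]
    while stack:
        cur = by_id[stack.pop()]
        # Compare only against records that are not yet in the component.
        newly = [rid for rid in unvisited if linked(cur, by_id[rid])]
        unvisited.difference_update(newly)
        component.extend(newly)
        stack.extend(newly)
    return component
```

If components were needed for many targets at once, a union-find structure over all linked pairs would be a reasonable alternative to discuss, at the cost of evaluating more pairs up front.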

Notes

  • Clarify whether the output includes the target ID itself.
  • Aim for an approach that avoids unnecessary pairwise comparisons when possible (discuss indexing/blocking if relevant).
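
A simple form of the blocking mentioned in the last note is an inverted index from (field, value) to record IDs, so that a record is only scored against records sharing at least one field value. The sketch below is illustrative only: build_blocks and candidates are hypothetical names, and values are normalized by trimming and lowercasing as an assumption. It is lossless only under exact-match field similarity with a positive threshold, where a pair sharing no value scores 0; with a fuzzy string similarity, this kind of blocking can miss true links.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Record = Dict[str, str]
BlockIndex = Dict[Tuple[str, str], Set[str]]

def _key(record: Record, field: str) -> str:
    # Assumed normalization: trim whitespace and lowercase.
    return record.get(field, "").strip().lower()

def build_blocks(records: List[Record], fields: Iterable[str]) -> BlockIndex:
    """Map each (field, normalized value) to the IDs of records holding it."""
    index: BlockIndex = defaultdict(set)
    for r in records:
        for f in fields:
            value = _key(r, f)
            if value:  # skip missing or empty values
                index[(f, value)].add(r["id"])
    return index

def candidates(record: Record, index: BlockIndex,
               fields: Iterable[str]) -> Set[str]:
    """IDs sharing at least one field value with record (record itself excluded)."""
    found: Set[str] = set()
    for f in fields:
        value = _key(record, f)
        if value:
            found |= index.get((f, value), set())
    found.discard(record["id"])
    return found
```

Any of the traversals for the task and its follow-ups could then score only candidates(current_record, ...) instead of scanning every record, which helps most when field values are rarely shared. A very common value, such as a large company name, would still produce a large block.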
