PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates a candidate's ability to implement linear-time deduplication of primitive arrays, understand hash-based membership tracking, and preserve first-occurrence ordering without mutating the input.

  • Okta
  • Coding & Algorithms
  • Frontend Engineer

Deduplicate an Array in Linear Time

Company: Okta

Role: Frontend Engineer

Category: Coding & Algorithms

Interview Round: Technical Screen

Given an array of primitive values such as numbers or strings, implement a function that returns a new array with duplicates removed while preserving the order of each value's first occurrence. Requirements: - Run in O(n) time. - Use O(n) additional space. - Do not mutate the input array. Example: - Input: `[3, 1, 3, 2, 1, 4]` - Output: `[3, 1, 2, 4]` Follow-ups: 1. Reimplement the solution using object-based inline computation, for example by using a plain object as a lookup table instead of a `Set` or `Map`. 2. Explain the memory overhead risks of the object-based approach when processing very large datasets.

Quick Answer: This question evaluates a candidate's ability to implement linear-time deduplication of primitive arrays, understand hash-based membership tracking, and preserve first-occurrence ordering without mutating the input.

Part 1: Deduplicate an Integer Array in Linear Time

Given an integer array arr, return a new array containing each distinct value exactly once, in the order of its first appearance. Your algorithm must run in O(n) time, use O(n) additional space, and must not mutate the input array.

Constraints

  • 0 <= len(arr) <= 200000
  • -10^9 <= arr[i] <= 10^9
  • Do not mutate arr
  • Target time complexity: O(n)

Examples

Input: ([3, 1, 3, 2, 1, 4],)

Expected Output: [3, 1, 2, 4]

Explanation: Keep each number the first time it appears.

Input: ([5, 5, 5, 5],)

Expected Output: [5]

Explanation: All later copies of 5 are duplicates.

Input: ([-1, 0, -1, 2, 0, 3],)

Expected Output: [-1, 0, 2, 3]

Explanation: Negative numbers and zero are handled the same way as positive numbers.

Input: ([],)

Expected Output: []

Explanation: An empty array has no duplicates to remove.

Input: ([7],)

Expected Output: [7]

Explanation: A single-element array is already deduplicated.

Hints

  1. Use a hash-based structure to remember which values have already appeared.
  2. Build a separate result list and append a value only the first time you see it.

Part 2: Deduplicate Using an Object-Style Lookup Table

Given an integer array arr, return a new array with duplicates removed while preserving first-occurrence order. This time, do not use a set. Instead, use an object-style lookup table: store each seen value as a key and a dummy boolean as its value.

Constraints

  • 0 <= len(arr) <= 200000
  • -10^9 <= arr[i] <= 10^9
  • Do not mutate arr
  • Do not use a set in your implementation
  • Target time complexity: O(n)

Examples

Input: ([4, 2, 4, 4, 3, 2, 1],)

Expected Output: [4, 2, 3, 1]

Explanation: Only the first occurrence of each value is kept.

Input: ([0, -1, 0, -1, 2],)

Expected Output: [0, -1, 2]

Explanation: The lookup table should work for zero and negative numbers too.

Input: ([],)

Expected Output: []

Explanation: The empty array stays empty.

Input: ([9],)

Expected Output: [9]

Explanation: A single value is already unique.

Input: ([1, 2, 3, 4],)

Expected Output: [1, 2, 3, 4]

Explanation: If every value is unique, the output matches the input order.

Hints

  1. A dictionary can act like a lookup table even if you only care whether a key exists.
  2. When a value is not already in the lookup table, mark it as seen and append it to the answer.

Part 3: Detect Memory Risk for Object-Based Deduplication

A plain object-based deduplication strategy stores each distinct string as a key. To model the memory overhead risk on very large datasets, use this simplified rule: every distinct string key costs 16 bytes of metadata plus 1 byte per character. Given a list of ASCII strings arr and an integer memory_budget, return True if the estimated extra memory required by the lookup table would exceed memory_budget. Count each distinct string only once.

Constraints

  • 0 <= len(arr) <= 200000
  • 0 <= memory_budget <= 10^12
  • Each arr[i] is an ASCII string with length between 0 and 1000
  • The estimated cost of one distinct string s is 16 + len(s)
  • Target time complexity: O(n + total_characters)

Examples

Input: (['cat', 'dog', 'cat'], 40)

Expected Output: False

Explanation: Distinct strings are 'cat' and 'dog'. Estimated memory is 19 + 19 = 38, which does not exceed 40.

Input: (['apple', 'banana', 'apple', 'pear'], 45)

Expected Output: True

Explanation: Distinct strings are 'apple', 'banana', and 'pear'. Estimated memory is 21 + 22 + 20 = 63, which exceeds 45.

Input: ([], 0)

Expected Output: False

Explanation: No strings means no extra lookup-table memory.

Input: (['a'], 17)

Expected Output: False

Explanation: One distinct string costs exactly 16 + 1 = 17 bytes, which does not exceed the budget.

Input: (['a', 'bb'], 35)

Expected Output: False

Explanation: Estimated memory is 17 + 18 = 35, which matches the budget but does not exceed it.

Hints

  1. You only pay the memory cost once per distinct string, not once per occurrence.
  2. Keep a running total and return early if it already exceeds the budget.
Last updated: Jun 6, 2026

Related Coding Questions

  • Validate IPv4 Address List - Okta (medium)

Loading coding console...

PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.