PracHub
QuestionsPremiumLearningGuidesInterview PrepNEWCoaches

Quick Overview

This question evaluates understanding of string data structures, memory layout, copy complexity, move semantics, and language-specific behavior such as Rust's move model.

  • Medium
  • Character.AI
  • Coding & Algorithms
  • Software Engineer

Explain strings and copy complexity

Company: Character.AI

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Technical Screen

##### Question What is a string? What fields does a struct string contain, and how would you implement it yourself? What is the time complexity of copying a string? How can move operations be made more efficient? In Rust, is move just modifying a reference?

Quick Answer: This question evaluates understanding of string data structures, memory layout, copy complexity, move semantics, and language-specific behavior such as Rust's move model.

You are given a sequence of commands that operate on named string variables using a toy string library. Each variable may refer to a heap-allocated buffer that stores its content. You must simulate three kinds of data ownership operations—deep copy (copy), move (move), and copy-on-write clone (cowcopy)—plus creation (new) and append (append). Track and return outputs for query commands. Cost is defined as the total number of bytes physically copied due to allocations or mutations. Rules: 1) new v s: allocate a new buffer with contents s and bind it to v; cost increases by len(s). 2) copy a b: deep copy contents of a into a new buffer and bind to b; cost increases by len(a). If a is unbound, b becomes unbound and no cost. 3) cowcopy a b: bind b to the same buffer as a (increase reference count); no immediate cost. If a is unbound, b becomes unbound. 4) move a b: transfer binding from a to b in O(1); a becomes unbound; no cost. If a is unbound, b becomes unbound. 5) append v t: append string t to v. If v is unbound, treat as new v t. If v's buffer has refcount > 1, detach (copy-on-write): allocate new buffer with old contents plus t; cost increases by old_len + len(t). If refcount == 1, grow in place; cost increases by len(t). 6) len v: output the current length of v (0 if unbound). 7) cost: output the cumulative cost so far. Rebinding a variable first releases its previous buffer reference; buffers are freed when their reference count reaches 0. Return a list of integers corresponding to outputs of len and cost commands in order.

Constraints

  • 1 <= len(ops) <= 100000
  • Each command is one of: new v s | copy a b | cowcopy a b | move a b | append v t | len v | cost
  • Variable names v, a, b: [A-Za-z0-9_]{1,20}
  • Strings s, t: lowercase ASCII letters only, 1 <= len(s), len(t) <= 100000
  • Sum of lengths of all provided s and t over ops <= 1000000
  • All operations must be processed in order; outputs are only for len and cost commands
  • Cumulative cost fits in 64-bit signed integer

Solution

from typing import List

def simulate_string_ops(ops: List[str]) -> List[int]:
    buffers: dict[int, list[int]] = {}  # id -> [length, refcount]
    var2buf: dict[str, int | None] = {}
    next_id = 1
    total_cost = 0
    outputs: List[int] = []

    def alloc(length: int) -> int:
        nonlocal next_id
        bid = next_id
        next_id += 1
        buffers[bid] = [length, 0]
        return bid

    def rebind(var: str, new_id: int | None) -> None:
        old_id = var2buf.get(var)
        if old_id == new_id:
            # Ensure var is recorded
            var2buf[var] = new_id
            return
        if old_id is not None:
            if old_id in buffers:
                buffers[old_id][1] -= 1
                if buffers[old_id][1] == 0:
                    del buffers[old_id]
        var2buf[var] = new_id
        if new_id is not None:
            buffers[new_id][1] += 1

    for line in ops:
        parts = line.split()
        cmd = parts[0]

        if cmd == "new":
            v, s = parts[1], parts[2]
            bid = alloc(len(s))
            total_cost += len(s)
            rebind(v, bid)

        elif cmd == "copy":
            a, b = parts[1], parts[2]
            src_id = var2buf.get(a)
            if src_id is None:
                rebind(b, None)
            else:
                length = buffers[src_id][0]
                nb = alloc(length)
                total_cost += length
                rebind(b, nb)

        elif cmd == "cowcopy":
            a, b = parts[1], parts[2]
            src_id = var2buf.get(a)
            rebind(b, src_id if src_id is not None else None)

        elif cmd == "move":
            a, b = parts[1], parts[2]
            if a == b:
                continue
            src_id = var2buf.get(a)
            if src_id is None:
                rebind(b, None)
                # ensure a is present and unbound
                var2buf[a] = None
            else:
                rebind(b, src_id)
                rebind(a, None)

        elif cmd == "append":
            v, t = parts[1], parts[2]
            bid = var2buf.get(v)
            add_len = len(t)
            if bid is None:
                nb = alloc(add_len)
                total_cost += add_len
                rebind(v, nb)
            else:
                length, rc = buffers[bid]
                if rc == 1:
                    buffers[bid][0] = length + add_len
                    total_cost += add_len
                else:
                    nb = alloc(length + add_len)
                    total_cost += length + add_len
                    rebind(v, nb)

        elif cmd == "len":
            v = parts[1]
            bid = var2buf.get(v)
            outputs.append(0 if bid is None else buffers[bid][0])

        elif cmd == "cost":
            outputs.append(total_cost)

        else:
            # Invalid command per constraints will not occur
            pass

    return outputs
Explanation
Maintain a mapping from variable names to buffer IDs and a table of buffers with (length, refcount). new and deep copy allocate new buffers and increase cost by the number of bytes created. cowcopy shares the buffer by increasing its refcount without copying. move transfers a binding in O(1) by reassigning the variable; it decrements the destination's previous buffer and unbinds the source, leaving the moved buffer's refcount unchanged overall. append either grows in place when refcount == 1 (cost equals appended length) or performs copy-on-write: allocate a new buffer containing old + appended bytes (cost equals old_len + appended length) and rebind the variable to it. len and cost produce outputs. This directly models pointer-length-refcount behavior and the cost of copying bytes.

Time complexity: O(Q + B) where Q is the number of operations and B is the total number of bytes physically copied due to new/copy/detach/append. Space complexity: O(V + U) where V is the number of live variables and U is the total size of currently allocated unique buffers.

Hints

  1. Track buffers by IDs with fields: length and reference count.
  2. Map variable names to buffer IDs. Rebinding should decrement the old buffer's refcount and free it if it becomes zero.
  3. Deep copy allocates a new buffer; cowcopy just shares the buffer (increment refcount).
  4. On append, if refcount > 1, detach (copy old content) before mutating.
Last updated: Mar 29, 2026

Related Coding Questions

  • Design timestamped key-value map - Character.AI (Medium)

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.