PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates a candidate's ability to design efficient data structures and implement algorithms for sparse matrix storage and operations, emphasizing space-efficient representations, matrix arithmetic, and computational complexity reasoning.

  • Medium
  • Pinterest
  • Coding & Algorithms
  • Data Scientist

Design Data Structure for Sparse Matrices Operations

Company: Pinterest

Role: Data Scientist

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Onsite

##### Scenario Analytics engine stores extremely sparse numeric matrices. ##### Question Design a data structure to store two sparse matrices and implement print(), add(A,B) and multiply(A,B). Discuss complexity for each operation. ##### Hints Use dictionary-of-keys or CSR; pre-index rows and columns to speed multiplication.

Quick Answer: This question evaluates a candidate's ability to design efficient data structures and implement algorithms for sparse matrix storage and operations, emphasizing space-efficient representations, matrix arithmetic, and computational complexity reasoning.

You are given two sparse integer matrices A and B. Each matrix is specified by its dimensions and a list of nonzero entries as [row, col, val] with 0-based indices. Multiple entries for the same (row, col) may appear and must be summed; any resulting zero is removed. Implement a function that stores both matrices in a dictionary-of-keys structure and supports: (1) add: return the canonical sparse triplet list of A + B (only if dimensions match), (2) multiply: return the canonical sparse triplet list of A × B (only if A.cols == B.rows), and (3) printA / printB: return a newline-separated string of canonical triplets "i j v" for A or B. Canonical triplet order is sorted by row, then column ascending.

Constraints

  • 1 <= dimA[0], dimA[1], dimB[0], dimB[1] <= 100000
  • 0 <= len(A_entries) + len(B_entries) <= 200000
  • Entries use 0-based indices: 0 <= i < rows, 0 <= j < cols
  • Values are 32-bit signed integers; duplicates at the same (i,j) are summed; zeros are omitted
  • For op == 'add': dimA == dimB
  • For op == 'multiply': dimA[1] == dimB[0]
  • Output for add/multiply: list of [i,j,val] sorted by i, then j
  • Output for printA/printB: newline-separated 'i j v' lines in sorted order
  • Do not mutate inputs; raise ValueError on invalid indices or incompatible dimensions

Solution

def sparse_matrix_ops(dimA, dimB, A_entries, B_entries, op):
    def as_pair(x):
        if isinstance(x, (list, tuple)) and len(x) == 2:
            return int(x[0]), int(x[1])
        raise ValueError("dim must be a length-2 list/tuple")

    ra, ca = as_pair(dimA)
    rb, cb = as_pair(dimB)

    def normalize(entries, r, c):
        rows = {}
        if entries is None:
            return rows
        for e in entries:
            if not (isinstance(e, (list, tuple)) and len(e) == 3):
                raise ValueError("each entry must be [i,j,val]")
            i, j, v = e
            i = int(i); j = int(j); v = int(v)
            if not (0 <= i < r and 0 <= j < c):
                raise ValueError("index out of bounds")
            if v == 0:
                continue
            row = rows.get(i)
            if row is None:
                row = {}
                rows[i] = row
            row[j] = row.get(j, 0) + v
            if row[j] == 0:
                del row[j]
        return rows

    Arows = normalize(A_entries, ra, ca)
    Brows = normalize(B_entries, rb, cb)

    def to_triplets(rows):
        trip = []
        for i in sorted(rows):
            row = rows[i]
            for j in sorted(row):
                v = row[j]
                if v != 0:
                    trip.append([i, j, v])
        return trip

    if op == "printA":
        trip = to_triplets(Arows)
        return "\n".join(f"{i} {j} {v}" for i, j, v in trip)
    if op == "printB":
        trip = to_triplets(Brows)
        return "\n".join(f"{i} {j} {v}" for i, j, v in trip)

    if op == "add":
        if ra != rb or ca != cb:
            raise ValueError("dimension mismatch for add")
        res = {}
        def add_from(src):
            for i, row in src.items():
                rrow = res.get(i)
                if rrow is None:
                    rrow = {}
                    res[i] = rrow
                for j, v in row.items():
                    rrow[j] = rrow.get(j, 0) + v
                    if rrow.get(j, 0) == 0:
                        rrow.pop(j, None)
        add_from(Arows)
        add_from(Brows)
        return to_triplets(res)

    if op == "multiply":
        if ca != rb:
            raise ValueError("dimension mismatch for multiply")
        res = {}
        for i, arow in Arows.items():
            rrow = res.get(i)
            if rrow is None:
                rrow = {}
                res[i] = rrow
            for k, av in arow.items():
                brow = Brows.get(k)
                if not brow:
                    continue
                for j, bv in brow.items():
                    rrow[j] = rrow.get(j, 0) + av * bv
                    if rrow.get(j, 0) == 0:
                        rrow.pop(j, None)
        return to_triplets(res)

    raise ValueError("unsupported op")
Explanation
We store each matrix as a dictionary-of-keys mapping row -> {col: value}. During normalization we aggregate duplicates and drop zeros so all operations work on canonical sparse rows. Addition merges the two row dictionaries by key, maintaining sparsity and removing zero sums. Multiplication iterates over nonzeros of a row i in A; for each (i,k) we look up row k in B and accumulate contributions into row i of the result (i.e., C[i,j] += A[i,k] * B[k,j]). Output is converted to a sorted list of triplets for add/multiply or a newline-joined string for printA/printB.

Time complexity: Add: O(nnz(A) + nnz(B)); Multiply: O(sum over k of nnz_in_col_A[k] * nnz_in_row_B[k]). Space complexity: O(nnz(A) + nnz(B) + nnz(result)).

Hints

  1. Use a dictionary-of-keys: map row -> {col: val} for fast aggregation and iteration.
  2. Normalize inputs by summing duplicates and dropping zeros before any operation.
  3. For multiplication, iterate rows of A and, for each (i,k), combine with row k of B.
  4. Build the result sparsely and skip zero contributions to avoid dense blow-up.
  5. Sort by (row, col) once at the end to produce a canonical output.
Last updated: Mar 29, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Maximize Boxes Stored Through One Entrance - Pinterest (medium)
  • Solve Multiple Coding Interview Problems - Pinterest (medium)
  • Implement a Sparse Matrix Class - Pinterest (medium)
  • Assign Pins to Shortest Columns - Pinterest (medium)
  • Design Hierarchical Permission Checks - Pinterest (medium)