PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep
|Home/Coding & Algorithms/Meta

Implement string and basic ML algorithms

Last updated: Mar 29, 2026

Quick Overview

This question evaluates proficiency in algorithmic string processing and implementation of foundational machine learning models, covering skills such as designing an efficient distinct-character substring algorithm, implementing binary logistic regression training and prediction, and building a multinomial Naive Bayes text classifier.

  • medium
  • Meta
  • Coding & Algorithms
  • Software Engineer

Implement string and basic ML algorithms

Company: Meta

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

You are given three implementation tasks that mix algorithms and basic machine learning models. --- ### Task 1: Longest substring with all distinct characters Given a string `s` consisting of ASCII characters, find the length of the longest **contiguous substring** in which **all characters are distinct** (no character appears more than once within that substring). - Input: a single string `s`. - Output: an integer representing the maximum length of any substring of `s` that has all unique characters. **Example:** - `s = abcabcbb` → the answer is `3` (one such substring is `abc`). - `s = bbbbb` → the answer is `1` (any single `b`). - `s = ""` (empty string) → the answer is `0`. **Constraints (typical):** - `0 <= len(s) <= 10^5` - Expected time: `O(n)` where `n = len(s)` - Expected extra space: `O(min(n, alphabet_size))` Implement a function with a signature similar to: - `int longestDistinctSubstringLength(const string& s)` (C++), or - `int longest_distinct_substring_length(string s)` (Python-style pseudocode). --- ### Task 2: Implement binary logistic regression from scratch Implement a **binary logistic regression classifier** without using any machine learning libraries. Design a class `LogisticRegression` that supports: - **Constructor:** takes at least: - `learning_rate: float` - `num_iterations: int` - **fit(X, y):** - `X` is an `n x d` numeric matrix of features (e.g., list of length `n`, each element a length-`d` list of floats). - `y` is a length-`n` list/array of labels, each label is either `0` or `1`. - Initialize weights and bias (e.g., to zeros). - Train using **batch gradient descent** to minimize the **logistic (cross-entropy) loss**: - Model: for each sample `i`, compute `z_i = w · x_i + b`, then probability of class 1 is `p_i = sigmoid(z_i) = 1 / (1 + exp(-z_i))`. - Loss over the dataset: \[ L(w, b) = -\frac{1}{n} \sum_{i=1}^n \Bigl( y_i \log p_i + (1 - y_i) \log(1 - p_i) \Bigr). \] - Update `w` and `b` with gradient descent for `num_iterations` steps. - No regularization is required. - **predict_proba(X):** - Given a feature matrix `X`, return a list/array of predicted probabilities `p_i` for class `1` for each row. - **predict(X):** - Given `X`, first compute probabilities with `predict_proba`, then convert to labels `0` or `1` using threshold `0.5`. **Constraints (typical):** - `n` (number of samples) up to about `10^4`. - `d` (number of features) up to about `10^2`. - Number of iterations up to around `10^4`. You may implement in any language, but do **not** use built-in ML training utilities. --- ### Task 3: Implement a multinomial Naive Bayes text classifier Implement a **multinomial Naive Bayes** classifier for text classification. Assume: - Training data: `docs_train`, a list of documents, where each document is represented as a list of token strings, e.g. `[["this", "movie", "is", "great"], ["awful", "film"], ...]`. - Training labels: `y_train`, a list of integers where `y_train[i]` is the class index for `docs_train[i]` (values in `{0, 1, ..., C-1}`). - Test data: `docs_test`, same format as `docs_train` but without labels. Design a class `NaiveBayes` that supports: - **fit(docs_train, y_train):** - Build a vocabulary of all tokens seen in `docs_train`. - Let `C` be the number of classes. - Estimate class prior probabilities: \[ P(y = c) = \frac{\text{count of training docs with label } c}{\text{total number of docs}}. \] - For each class `c` and token `w` in the vocabulary, estimate conditional probabilities with **Laplace smoothing** (add-one smoothing): \[ P(w \mid y = c) = \frac{\text{total count of token } w \text{ in class } c + 1}{\text{total token count in class } c + V}, \] where `V` is the vocabulary size. - Store log probabilities (`log P(y = c)` and `log P(w | y = c)`) to avoid underflow. - **predict(docs_test):** - For each document, treat it as a bag of words. - For each class `c`, compute the log posterior up to a constant: \[ \log P(y = c \mid \text{doc}) \propto \log P(y = c) + \sum_{w \in \text{doc}} \text{count}(w \text{ in doc}) \cdot \log P(w \mid y = c). \] - Predict the class `c` with the highest log score. - Ignore tokens not in the training vocabulary (or handle them with an appropriate strategy such as skipping them). **Constraints (typical):** - Number of documents up to a few `10^4`. - Vocabulary size up to a few `10^4`. - Must be efficient in both time and memory. Focus on correctly implementing the core training (`fit`) and prediction (`predict`) logic; you do not need to handle file I/O or text preprocessing (assume tokenization is already done).

Quick Answer: This question evaluates proficiency in algorithmic string processing and implementation of foundational machine learning models, covering skills such as designing an efficient distinct-character substring algorithm, implementing binary logistic regression training and prediction, and building a multinomial Naive Bayes text classifier.

Related Interview Questions

  • Solve Tree Columns And Maze Variants - Meta (medium)
  • Solve Tree Diameter and Palindromic Counts - Meta (medium)
  • Simulate Monster Team Battles - Meta (hard)
  • Solve a Key-Door Corridor Maze - Meta (medium)
  • Solve Array Merge and Parentheses Cleanup - Meta (medium)
Meta logo
Meta
Dec 8, 2025, 12:00 AM
Software Engineer
Technical Screen
Coding & Algorithms
3
0

You are given three implementation tasks that mix algorithms and basic machine learning models.

Task 1: Longest substring with all distinct characters

Given a string s consisting of ASCII characters, find the length of the longest contiguous substring in which all characters are distinct (no character appears more than once within that substring).

  • Input: a single string s .
  • Output: an integer representing the maximum length of any substring of s that has all unique characters.

Example:

  • s = abcabcbb → the answer is 3 (one such substring is abc ).
  • s = bbbbb → the answer is 1 (any single b ).
  • s = "" (empty string) → the answer is 0 .

Constraints (typical):

  • 0 <= len(s) <= 10^5
  • Expected time: O(n) where n = len(s)
  • Expected extra space: O(min(n, alphabet_size))

Implement a function with a signature similar to:

  • int longestDistinctSubstringLength(const string& s) (C++), or
  • int longest_distinct_substring_length(string s) (Python-style pseudocode).

Task 2: Implement binary logistic regression from scratch

Implement a binary logistic regression classifier without using any machine learning libraries.

Design a class LogisticRegression that supports:

  • Constructor: takes at least:
    • learning_rate: float
    • num_iterations: int
  • fit(X, y):
    • X is an n x d numeric matrix of features (e.g., list of length n , each element a length- d list of floats).
    • y is a length- n list/array of labels, each label is either 0 or 1 .
    • Initialize weights and bias (e.g., to zeros).
    • Train using batch gradient descent to minimize the logistic (cross-entropy) loss :
      • Model: for each sample i , compute z_i = w · x_i + b , then probability of class 1 is p_i = sigmoid(z_i) = 1 / (1 + exp(-z_i)) .
      • Loss over the dataset:

L(w,b)=−1n∑i=1n(yilog⁡pi+(1−yi)log⁡(1−pi)).L(w, b) = -\frac{1}{n} \sum_{i=1}^n \Bigl( y_i \log p_i + (1 - y_i) \log(1 - p_i) \Bigr).L(w,b)=−n1​∑i=1n​(yi​logpi​+(1−yi​)log(1−pi​)).

- Update `w` and `b` with gradient descent for `num_iterations` steps.
  • No regularization is required.
  • predict_proba(X):
    • Given a feature matrix X , return a list/array of predicted probabilities p_i for class 1 for each row.
  • predict(X):
    • Given X , first compute probabilities with predict_proba , then convert to labels 0 or 1 using threshold 0.5 .

Constraints (typical):

  • n (number of samples) up to about 10^4 .
  • d (number of features) up to about 10^2 .
  • Number of iterations up to around 10^4 .

You may implement in any language, but do not use built-in ML training utilities.

Task 3: Implement a multinomial Naive Bayes text classifier

Implement a multinomial Naive Bayes classifier for text classification.

Assume:

  • Training data: docs_train , a list of documents, where each document is represented as a list of token strings, e.g. [["this", "movie", "is", "great"], ["awful", "film"], ...] .
  • Training labels: y_train , a list of integers where y_train[i] is the class index for docs_train[i] (values in {0, 1, ..., C-1} ).
  • Test data: docs_test , same format as docs_train but without labels.

Design a class NaiveBayes that supports:

  • fit(docs_train, y_train):
    • Build a vocabulary of all tokens seen in docs_train .
    • Let C be the number of classes.
    • Estimate class prior probabilities:

P(y=c)=count of training docs with label ctotal number of docs.P(y = c) = \frac{\text{count of training docs with label } c}{\text{total number of docs}}.P(y=c)=total number of docscount of training docs with label c​.

  • For each class c and token w in the vocabulary, estimate conditional probabilities with Laplace smoothing (add-one smoothing):

P(w∣y=c)=total count of token w in class c+1total token count in class c+V,P(w \mid y = c) = \frac{\text{total count of token } w \text{ in class } c + 1}{\text{total token count in class } c + V},P(w∣y=c)=total token count in class c+Vtotal count of token w in class c+1​,

where `V` is the vocabulary size.
  • Store log probabilities ( log P(y = c) and log P(w | y = c) ) to avoid underflow.
  • predict(docs_test):
    • For each document, treat it as a bag of words.
    • For each class c , compute the log posterior up to a constant:

log⁡P(y=c∣doc)∝log⁡P(y=c)+∑w∈doccount(w in doc)⋅log⁡P(w∣y=c).\log P(y = c \mid \text{doc}) \propto \log P(y = c) + \sum_{w \in \text{doc}} \text{count}(w \text{ in doc}) \cdot \log P(w \mid y = c).logP(y=c∣doc)∝logP(y=c)+∑w∈doc​count(w in doc)⋅logP(w∣y=c).

  • Predict the class c with the highest log score.
  • Ignore tokens not in the training vocabulary (or handle them with an appropriate strategy such as skipping them).

Constraints (typical):

  • Number of documents up to a few 10^4 .
  • Vocabulary size up to a few 10^4 .
  • Must be efficient in both time and memory.

Focus on correctly implementing the core training (fit) and prediction (predict) logic; you do not need to handle file I/O or text preprocessing (assume tokenization is already done).

Submit Your Answer

Sign in to leave a comment

Loading comments...

Browse More Questions

More Coding & Algorithms•More Meta•More Software Engineer•Meta Software Engineer•Meta Coding & Algorithms•Software Engineer Coding & Algorithms
PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.