PracHub

Implement and Tune KNN Classifier

Last updated: May 6, 2026

Quick Overview

This question evaluates a candidate's competency in implementing and tuning the K-nearest neighbors algorithm, covering data preprocessing, distance-based classification, model evaluation, and hyperparameter selection.

Company: Qube

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Take-home Project

Jan 12, 2026, 12:00 AM
You are given two CSV files for a three-class leaf-classification task.

  • train.csv contains three numeric feature columns: feature_0, feature_1, and feature_2, plus a class label column.
  • unlabeled.csv contains the same three feature columns but no labels.
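
As a starting point for loading and inspecting the labeled file, the sketch below reads the CSV with pandas and separates features from labels. The task does not name the label column, so this assumes it is the single column that is not one of the three feature columns. The helper name `load_labeled` and the in-memory demo rows are hypothetical and only stand in for the real `train.csv`.

```python
import io

import pandas as pd

# Feature column names as given in the task description.
FEATURES = ["feature_0", "feature_1", "feature_2"]


def load_labeled(source):
    """Read a labeled CSV and return (X, y, label_column_name).

    Assumption: the label is the only column that is not a feature column.
    `source` may be a file path or any file-like object.
    """
    df = pd.read_csv(source)
    label_cols = [c for c in df.columns if c not in FEATURES]
    if len(label_cols) != 1:
        raise ValueError(f"expected exactly one label column, found {label_cols}")
    label_col = label_cols[0]
    X = df[FEATURES].to_numpy(dtype=float)
    y = df[label_col].to_numpy()
    return X, y, label_col


# Made-up rows standing in for train.csv, so the sketch runs without the file.
demo_csv = io.StringIO(
    "feature_0,feature_1,feature_2,label\n"
    "1.0,2.0,3.0,a\n"
    "1.5,2.5,3.5,b\n"
    "0.5,1.0,2.0,a\n"
)
X, y, label_col = load_labeled(demo_csv)

# Basic inspection: shape, label column name, and class balance.
print(X.shape, label_col)
print(pd.Series(y).value_counts())
```

With the real data, `load_labeled("train.csv")` would replace the in-memory buffer. Checking `df.describe()` and `df.isna().sum()` at this stage is also worthwhile, since feature scale and missing values both affect a distance-based classifier.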

Complete the task in a Jupyter notebook:

  1. Load and inspect the labeled dataset.
  2. Split the labeled data into training and testing sets.
  3. Implement a K-nearest neighbors classifier from scratch. You may use standard data-processing libraries, but the KNN prediction logic should be your own implementation.
  4. Train the classifier and evaluate its accuracy on the held-out test set.
  5. Tune hyperparameters, especially the value of k, and try to improve accuracy. You may use cross-validation.
  6. Use the final model to predict labels for unlabeled.csv.
  7. Write the predictions to an output CSV file.
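
Steps 2 through 7 could be approached along the following lines. This is a minimal sketch under stated assumptions, not a reference solution: it assumes Euclidean distance on z-score standardized features, a plain majority vote (ties go to the first class in sorted order), an 80/20 split, 5-fold cross-validation, and an arbitrary grid of odd `k` values. The names `KNN`, `train_test_split`, `cv_accuracy`, `tune_k`, and `save_predictions`, the `prediction` output column, and the synthetic data are all hypothetical.

```python
import numpy as np
import pandas as pd


def train_test_split(X, y, test_frac=0.2, seed=0):
    """Shuffle once and hold out a fraction of the rows for testing."""
    idx = np.random.default_rng(seed).permutation(len(X))
    n_test = max(1, int(round(test_frac * len(X))))
    test, train = idx[:n_test], idx[n_test:]
    return X[train], X[test], y[train], y[test]


class KNN:
    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        # Standardize with training statistics so no feature dominates the distance.
        self.mu = X.mean(axis=0)
        self.sigma = X.std(axis=0)
        self.sigma[self.sigma == 0] = 1.0
        self.X = (X - self.mu) / self.sigma
        # Encode labels as integers 0..n_classes-1 for vote counting.
        self.classes, self.y = np.unique(y, return_inverse=True)
        return self

    def predict(self, X):
        Z = (X - self.mu) / self.sigma
        # Squared Euclidean distances, shape (n_query, n_train).
        d2 = ((Z[:, None, :] - self.X[None, :, :]) ** 2).sum(axis=2)
        nearest = np.argsort(d2, axis=1, kind="stable")[:, : self.k]
        votes = self.y[nearest]
        counts = np.apply_along_axis(
            np.bincount, 1, votes, minlength=len(self.classes)
        )
        # argmax breaks ties in favor of the lowest class index.
        return self.classes[counts.argmax(axis=1)]


def cv_accuracy(X, y, k, n_folds=5, seed=0):
    """Mean validation accuracy of a k-NN model over n_folds folds."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(X)), n_folds)
    scores = []
    for i, val in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        pred = KNN(k).fit(X[train], y[train]).predict(X[val])
        scores.append(np.mean(pred == y[val]))
    return float(np.mean(scores))


def tune_k(X, y, candidates=(1, 3, 5, 7, 9, 11, 15)):
    """Return the candidate k with the best cross-validated accuracy."""
    scores = {k: cv_accuracy(X, y, k) for k in candidates}
    return max(scores, key=scores.get), scores


def save_predictions(model, X_new, path):
    """Write one predicted label per row of X_new to a CSV file."""
    pd.DataFrame({"prediction": model.predict(X_new)}).to_csv(path, index=False)


# Synthetic three-class data standing in for train.csv (not the real dataset).
rng = np.random.default_rng(42)
centers = np.array([[0, 0, 0], [5, 5, 5], [10, 0, 5]], dtype=float)
X = np.vstack([c + rng.normal(scale=0.5, size=(40, 3)) for c in centers])
y = np.repeat(np.array(["A", "B", "C"]), 40)

X_tr, X_te, y_tr, y_te = train_test_split(X, y)
best_k, cv_scores = tune_k(X_tr, y_tr)      # tune on training rows only
model = KNN(best_k).fit(X_tr, y_tr)
test_acc = float(np.mean(model.predict(X_te) == y_te))
print(best_k, test_acc)
```

Tuning `k` by cross-validation on the training split alone keeps the held-out test set untouched until the final accuracy check. For the last two steps, one option is to refit on all labeled rows with the chosen `k` and call `save_predictions(model, X_unlabeled, "predictions.csv")`, where `X_unlabeled` holds the three feature columns from `unlabeled.csv` and the output filename is an assumption.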
