Explain dataset size, generalization, and U-Net skips
Company: Apple
Role: Machine Learning Engineer
Category: Machine Learning
Difficulty: medium
Interview Round: Technical Screen
You are interviewing for an ML Engineer role in an image/video team. Answer the following conceptual questions clearly and concisely.
## 1) Small vs. large datasets
- What are the typical advantages and disadvantages of training on a **small** dataset vs. a **large** one?
- How do compute, labeling cost, noise, and distribution shift change your strategy?
## 2) Overfitting vs. underfitting
- Define **overfitting** and **underfitting**.
- Give practical signs you would observe in training/validation curves.
- Provide common fixes for each.
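
As a reference point for the "practical signs" part of this question, the following is a minimal sketch of how the two failure modes can be read from per-epoch loss curves. The function name `diagnose_fit` and the thresholds `gap_tol` and `high_loss` are hypothetical and chosen only for illustration; real thresholds depend on the loss scale and the task.

```python
def diagnose_fit(train_loss, val_loss, gap_tol=0.1, high_loss=0.5):
    """Heuristically label a run from per-epoch loss curves.

    gap_tol and high_loss are illustrative thresholds, not standard values.
    """
    final_train, final_val = train_loss[-1], val_loss[-1]
    best_val = min(val_loss)
    # Overfitting: training loss keeps falling while validation loss has
    # risen from its best value and sits well above the training loss.
    if final_val - final_train > gap_tol and final_val > best_val:
        return "overfitting"
    # Underfitting: both losses stay high and close together.
    if final_train > high_loss and abs(final_val - final_train) <= gap_tol:
        return "underfitting"
    return "reasonable fit"


overfit = diagnose_fit([1.0, 0.5, 0.2, 0.05], [1.0, 0.6, 0.5, 0.7])
underfit = diagnose_fit([1.0, 0.9, 0.85, 0.8], [1.0, 0.92, 0.87, 0.83])
print(overfit, underfit)
```

In practice the same reading is usually done by eye on plotted curves; the heuristic only makes the two patterns explicit (a widening train/validation gap versus two curves that plateau high together).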
## 3) How do you approach a new modeling task?
Describe your end-to-end approach when developing a model for a new task:
- Do you start with a small model and small dataset? Why or why not?
- What baselines do you set?
- How do you iterate (data, model, loss, evaluation)?
## 4) Super-resolution losses and blurriness
In single-image super-resolution:
- Compare optimizing **pixel-wise MSE (L2)** loss vs a **perceptual loss** (e.g., feature-space loss using a pretrained network).
- Which objective more commonly produces **blurry** images, and why?
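
One standard way to reason about the blurriness question is sketched below. The notation is an assumption introduced here: $x$ is the low-resolution input, $y$ a plausible high-resolution image, and $\hat{y}$ the model's prediction. Super-resolution is ill-posed, so many sharp $y$ are consistent with the same $x$.

```latex
% Expected pixel-wise L2 loss for a fixed input x:
\mathcal{L}(\hat{y}) = \mathbb{E}\left[\, \lVert y - \hat{y} \rVert_2^2 \;\middle|\; x \right]

% Setting the gradient with respect to \hat{y} to zero:
\nabla_{\hat{y}} \mathcal{L} = -2\, \mathbb{E}\left[\, y - \hat{y} \;\middle|\; x \right] = 0
\quad\Longrightarrow\quad
\hat{y}^{*} = \mathbb{E}\left[\, y \mid x \right]
```

The L2-optimal prediction is therefore the average of all plausible sharp images, and averaging edges and textures that sit at slightly different positions tends to smooth them out. A perceptual loss compares features from a pretrained network rather than raw pixels, so it typically penalizes this averaged, texture-free output more strongly.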
## 5) U-Net architecture
- Explain the U-Net architecture at a high level (encoder/decoder, downsampling/upsampling, skip connections).
- What is the purpose of the **skip connections**?
- What typically happens if you **remove** skip connections in a U-Net used for dense prediction (segmentation), in terms of optimization and output quality?
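
To illustrate what the skip connections carry, here is a toy sketch in plain Python. It is not a real U-Net: the learned convolutions are replaced by fixed 2x2 average pooling and nearest-neighbour upsampling, and the helper names (`avg_pool2`, `upsample2`, `decoder_input`) are hypothetical. It only shows that fine detail lost in the downsampling path is still available to the decoder when a same-resolution encoder feature is concatenated.

```python
def avg_pool2(img):
    """2x2 average pooling (stand-in for encoder downsampling) on a 2D list."""
    h, w = len(img), len(img[0])
    return [[(img[i][j] + img[i][j + 1] + img[i + 1][j] + img[i + 1][j + 1]) / 4
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]


def upsample2(img):
    """Nearest-neighbour 2x upsampling (stand-in for decoder upsampling)."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def decoder_input(encoder_feat, use_skip):
    """Return the list of channels a decoder stage sees at full resolution."""
    coarse = upsample2(avg_pool2(encoder_feat))  # path through the bottleneck
    # A skip connection concatenates the same-resolution encoder feature
    # as an extra channel alongside the upsampled coarse feature.
    return [coarse, encoder_feat] if use_skip else [coarse]


# A 4x4 checkerboard: pure high-frequency detail.
checker = [[(i + j) % 2 for j in range(4)] for i in range(4)]

no_skip = decoder_input(checker, use_skip=False)
with_skip = decoder_input(checker, use_skip=True)
print(no_skip[0][0])    # [0.5, 0.5, 0.5, 0.5]: the pattern is averaged away
print(with_skip[1][0])  # [0, 1, 0, 1]: the detail survives in the skip channel
```

Without the skip channel, the decoder would have to reconstruct boundaries from the coarse feature alone, which is one reason segmentation outputs from skip-free encoder/decoder networks tend to have less precise edges.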
Quick Answer: This question tests core machine learning and computer vision knowledge: dataset size trade-offs, overfitting and underfitting, end-to-end modeling strategy, the effect of the loss function in single-image super-resolution, and the role of U-Net skip connections.