Write queries to compute salary and budget stats
Company: Apple
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: easy
Interview Round: Onsite
You are given the following interview tasks. Write solutions in **SQL and/or Python (pandas)** as appropriate.
## Task 1 — Second highest salary
You have a table:
**employees**
- `employee_id` INT (PK)
- `name` VARCHAR
- `salary` INT
Return the **second highest distinct salary**.
- If there is **no** second distinct salary, return `NULL`.
**Required output**
- `second_highest_salary` (INT or NULL)
## Task 2 — Merge two datasets
You are given two datasets that share a common key:
**users**
- `user_id` INT (PK)
- `country` VARCHAR
- `signup_date` DATE
**user_events**
- `user_id` INT (FK → users.user_id)
- `event_time` TIMESTAMP
- `event_type` VARCHAR
Merge them so that each event row is enriched with user attributes.
- Keep only events that have a matching `user_id` in `users` (inner join).
**Required output**
- `user_id`, `country`, `signup_date`, `event_time`, `event_type`
## Task 3 — “Run out the budget” (maximize hires)
You have a list/table of employees with their salaries and a total hiring budget `B`.
- Each employee costs exactly their `salary`.
- You can hire at most once per employee.
- Goal: **maximize the number of employees hired** without exceeding the budget.
**Input**
- `employees(employee_id, salary)` and an integer `B`
**Required output**
- `max_hires` (INT)
Clarify any assumptions you need (e.g., what to do if `B <= 0`).
Quick Answer: This set of tasks evaluates data manipulation and analytical competencies, specifically the ability to compute order-statistics on numeric columns, perform relational joins to enrich event records, and reason about constrained selection for maximizing hires using SQL and pandas.