Design an image copyright-violation detection system

This is an ML System Design interview question from Meta for Machine Learning Engineer roles.

Question

Design an ML system that detects whether a user-uploaded image violates copyright.

Requirements

Input: an image uploaded by a user (optionally with text/caption and user/account metadata).
Output: a decision such as {allow, block, human_review} plus an explanation signal (e.g., matched copyrighted work ID, similarity score, region of match).
Must scale to large numbers of uploads and a large database of copyrighted/reference images.
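
To make the input/output contract above concrete, here is a minimal sketch of the decision object and a two-threshold routing rule. It is an illustration, not part of the original question: the names `Action`, `Decision`, and `decide` are hypothetical, the similarity score is assumed to lie in [0, 1], and the default thresholds (0.80 and 0.95) are placeholders that would have to be tuned on labeled data.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Action(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    HUMAN_REVIEW = "human_review"


@dataclass(frozen=True)
class Decision:
    """Decision plus the explanation signals listed in the requirements."""
    action: Action
    matched_work_id: Optional[str] = None  # ID of the matched reference work
    similarity: float = 0.0                # assumed to be in [0, 1]
    region: Optional[Tuple[int, int, int, int]] = None  # (left, top, right, bottom)


def decide(
    similarity: float,
    matched_work_id: Optional[str] = None,
    region: Optional[Tuple[int, int, int, int]] = None,
    review_threshold: float = 0.80,  # placeholder value, not a recommendation
    block_threshold: float = 0.95,   # placeholder value, not a recommendation
) -> Decision:
    """Route an upload based on its best similarity to any reference work."""
    if not 0.0 <= review_threshold <= block_threshold <= 1.0:
        raise ValueError("expected 0 <= review_threshold <= block_threshold <= 1")
    if similarity >= block_threshold:
        action = Action.BLOCK
    elif similarity >= review_threshold:
        action = Action.HUMAN_REVIEW
    else:
        # No confident match: do not attach match details to the decision.
        return Decision(Action.ALLOW, similarity=similarity)
    return Decision(action, matched_work_id, similarity, region)
```

For example, `decide(0.88, "work-123")` routes the upload to human review and keeps the matched work ID so a reviewer can see what it was compared against.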

Follow-up questions to address

How would you fine-tune a pretrained vision(-language) model used for embeddings or classification?
If an uploaded image is a 3×3 grid (collage) and only one tile is infringing, how do you detect that?
How do you handle adversarial text overlays or other adversarial manipulations intended to evade detection?
If the original work is copyrighted, does a photo of the work (e.g., taken by a camera from a screen/poster) count as infringement, and how should the system treat such near-duplicates?
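
One possible way to approach the collage follow-up is to score sub-regions as well as the whole image and report the best-matching region. The sketch below assumes that approach; it is not a stated solution. `grid_boxes` and `best_region` are hypothetical helpers, and `score_region` stands in for whatever pipeline crops the region, embeds it, and returns its highest similarity against the reference database. Because the real grid layout is unknown at upload time, several grid sizes are tried.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def grid_boxes(width: int, height: int, rows: int, cols: int) -> List[Box]:
    """Split a width x height image into rows x cols tiles, row by row."""
    boxes = []
    for r in range(rows):
        for c in range(cols):
            left = c * width // cols
            right = (c + 1) * width // cols
            top = r * height // rows
            bottom = (r + 1) * height // rows
            boxes.append((left, top, right, bottom))
    return boxes


def best_region(
    width: int,
    height: int,
    score_region: Callable[[Box], float],  # hypothetical: crop + embed + retrieve
    grids: Sequence[Tuple[int, int]] = ((1, 1), (2, 2), (3, 3)),
) -> Tuple[Optional[Box], float]:
    """Return the tile with the highest score across all candidate grids."""
    best_box, best_score = None, float("-inf")
    for rows, cols in grids:
        for box in grid_boxes(width, height, rows, cols):
            score = score_region(box)
            if score > best_score:
                best_box, best_score = box, score
    return best_box, best_score
```

Taking the maximum over tiles means a single infringing tile can trigger a match even when the whole-image embedding looks unrelated, and the winning box doubles as the "region of match" explanation signal. The cost is one embedding and one retrieval call per tile (14 with the default grids), which is part of the latency/cost tradeoff to justify.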

Assume you can combine retrieval, classification, and human review, and you must justify metrics, thresholds, and latency/cost tradeoffs.
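
As one way to justify a threshold rather than pick it by hand, the block threshold can be chosen from a labeled validation set so that automatic blocks meet a target precision. The sketch below assumes such a set exists (similarity scores with binary infringing/not-infringing labels); `pick_block_threshold` is a hypothetical helper, and the sample numbers in the usage note are made up for illustration only.

```python
from typing import List, Optional


def pick_block_threshold(
    scores: List[float],
    labels: List[int],      # 1 = infringing, 0 = not infringing
    min_precision: float,
) -> Optional[float]:
    """Return the lowest score s such that blocking every item with
    score >= s keeps precision at or above min_precision, or None."""
    pairs = sorted(zip(scores, labels), reverse=True)
    best = None
    true_pos = false_pos = 0
    i = 0
    while i < len(pairs):
        current = pairs[i][0]
        # Consume all items tied at this score before measuring precision.
        while i < len(pairs) and pairs[i][0] == current:
            if pairs[i][1]:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        if true_pos / (true_pos + false_pos) >= min_precision:
            best = current
    return best
```

With illustrative scores `[0.99, 0.97, 0.90, 0.85, 0.60]` and labels `[1, 1, 0, 1, 0]`, a precision target of 1.0 gives a threshold of 0.97, while relaxing it to 0.75 lowers the threshold to 0.85. Items scoring below the block threshold but above a lower review threshold would go to human review, so the gap between the two thresholds directly sets reviewer load and cost.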
