PracHub

Design an LLM Log Parsing Workflow

Last updated: May 14, 2026

Quick Overview

This question evaluates skills in ML-enabled log parsing, schema inference, and structured data extraction, and in designing scalable, reliable production workflows that combine probabilistic LLMs with deterministic parsers and operational engineering.



Company: Cribl

Role: Software Engineer

Category: ML System Design

Difficulty: medium

Interview Round: Technical Screen



Jan 28, 2026, 12:00 AM

Design a production workflow that uses an LLM, optionally combined with deterministic parsers, to convert heterogeneous raw log messages into structured JSON fields.

The system must support multiple log formats whose schemas may be very different.

Example 1: access log input:

192.168.1.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 1024 "http://example.com/start.html" "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)"

Expected structured output:

{
  "src_ip": "192.168.1.1",
  "time": "10/Oct/2023:13:55:36 +0000",
  "http_method": "GET",
  "path": "/index.html",
  "protocol": "HTTP/1.1",
  "response_code": 200,
  "duration": 1024,
  "url": "http://example.com/start.html",
  "userAgent": "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)"
}
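For a well-known format like this one, the deterministic side of the workflow can be a single regular expression. The following is a minimal sketch, not a reference solution: the names `ACCESS_LOG_RE` and `parse_access_log` are hypothetical, and the output keys simply mirror the expected output above. Note that the question labels the number after the status code `duration` and the first quoted URL `url`; in the Apache combined log format those positions conventionally hold the response size in bytes and the referer, so a real schema might rename them.

```python
import json
import re
from typing import Optional

# Hypothetical pattern for the combined access-log layout shown in Example 1.
# Group names mirror the field names in the expected output.
ACCESS_LOG_RE = re.compile(
    r'(?P<src_ip>\S+) \S+ \S+ '                      # client IP, identd, user
    r'\[(?P<time>[^\]]+)\] '                         # bracketed timestamp
    r'"(?P<http_method>[A-Z]+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<response_code>\d{3}) (?P<duration>\d+|-) ' # status, then numeric field
    r'"(?P<url>[^"]*)" "(?P<userAgent>[^"]*)"'       # two quoted trailing fields
)


def parse_access_log(line: str) -> Optional[dict]:
    """Return the structured record, or None if the line does not match."""
    match = ACCESS_LOG_RE.fullmatch(line.strip())
    if match is None:
        return None  # caller can fall back to another parser or an LLM
    fields = match.groupdict()
    # Cast numeric fields so the JSON output carries numbers, not strings.
    fields["response_code"] = int(fields["response_code"])
    fields["duration"] = None if fields["duration"] == "-" else int(fields["duration"])
    return fields


if __name__ == "__main__":
    sample = (
        '192.168.1.1 - - [10/Oct/2023:13:55:36 +0000] '
        '"GET /index.html HTTP/1.1" 200 1024 '
        '"http://example.com/start.html" '
        '"Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)"'
    )
    print(json.dumps(parse_access_log(sample), indent=2))
```

Returning `None` on a non-match, rather than raising, keeps the routing decision (try the next parser, or send the line to the LLM) in the caller.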

Example 2: error log input:

[Tue Oct 10 13:55:36 2023] [error] [pid 12345] [client 192.168.1.1:12345] File does not exist: /var/www/html/favicon.ico

This log should produce a different schema, for example fields such as timestamp, level, pid, client_ip, client_port, and error_message.
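To make the second schema concrete, here is a hedged sketch of a deterministic parser for the error-log line above. The names `ERROR_LOG_RE` and `parse_error_log` are illustrative, the field names are the ones the question suggests, and the pattern assumes an IPv4 client address (an IPv6 address contains colons and would need a different split between address and port).

```python
import re
from typing import Optional

# Hypothetical pattern for the bracketed error-log layout shown in Example 2.
ERROR_LOG_RE = re.compile(
    r'\[(?P<timestamp>[^\]]+)\] '
    r'\[(?P<level>\w+)\] '
    r'\[pid (?P<pid>\d+)\] '
    r'\[client (?P<client_ip>[^\]:]+):(?P<client_port>\d+)\] '  # IPv4 only
    r'(?P<error_message>.*)'
)


def parse_error_log(line: str) -> Optional[dict]:
    """Return the structured record, or None if the line does not match."""
    match = ERROR_LOG_RE.fullmatch(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # pid and port are numeric in the source line, so emit them as integers.
    fields["pid"] = int(fields["pid"])
    fields["client_port"] = int(fields["client_port"])
    return fields


if __name__ == "__main__":
    sample = (
        "[Tue Oct 10 13:55:36 2023] [error] [pid 12345] "
        "[client 192.168.1.1:12345] File does not exist: /var/www/html/favicon.ico"
    )
    print(parse_error_log(sample))
```

The timestamp is kept as the raw string here; normalizing it to a single time format would be a separate step, since this line carries no timezone.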

Discuss the architecture, data flow, schema inference, extraction strategy, validation, scaling, reliability, monitoring, privacy, and how you would evaluate quality.
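One possible shape for the extraction and validation steps is sketched below, as a starting point for discussion rather than a definitive answer. It assumes deterministic parsers are tried first, an LLM is called only as a fallback, and every record is validated before it is accepted. Everything named here is hypothetical: `SCHEMAS`, `ParseResult`, `validate`, `parse_line`, and especially `llm_extract`, which stands in for whatever model call a real system would make and is assumed to return a JSON string with `log_type` and `fields` keys.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical schema registry: log type -> {field name: expected Python type}.
SCHEMAS: Dict[str, Dict[str, type]] = {
    "apache_error": {
        "timestamp": str, "level": str, "pid": int,
        "client_ip": str, "client_port": int, "error_message": str,
    },
}


@dataclass
class ParseResult:
    status: str                    # "ok" or "dead_letter"
    source: str                    # "deterministic" or "llm"
    fields: Optional[dict] = None
    errors: List[str] = field(default_factory=list)


def validate(log_type: str, fields: dict, line: str) -> List[str]:
    """Return a list of problems; an empty list means the record passed."""
    schema = SCHEMAS.get(log_type)
    if schema is None:
        return [f"unknown log type: {log_type!r}"]
    errors = []
    for name, expected in schema.items():
        if name not in fields:
            errors.append(f"missing field: {name}")
        elif not isinstance(fields[name], expected):
            errors.append(f"wrong type for {name}")
    # Grounding check: each extracted value must literally occur in the line,
    # which catches values the model invented instead of copying.
    for name, value in fields.items():
        if str(value) not in line:
            errors.append(f"value for {name} not found in input")
    return errors


def parse_line(
    line: str,
    parsers: Dict[str, Callable[[str], Optional[dict]]],
    llm_extract: Callable[[str], str],
) -> ParseResult:
    # 1. Cheap, predictable path: deterministic parsers keyed by log type.
    for log_type, parser in parsers.items():
        fields = parser(line)
        if fields is not None and not validate(log_type, fields, line):
            return ParseResult("ok", "deterministic", fields)
    # 2. Fallback: ask the (hypothetical) LLM, then treat its reply as untrusted.
    try:
        reply = json.loads(llm_extract(line))
        log_type, fields = reply["log_type"], reply["fields"]
        if not isinstance(fields, dict):
            raise TypeError("fields is not an object")
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        return ParseResult("dead_letter", "llm", errors=[f"unusable LLM reply: {exc!r}"])
    errors = validate(log_type, fields, line)
    if errors:
        return ParseResult("dead_letter", "llm", fields, errors)
    return ParseResult("ok", "llm", fields)
```

Records that end in `dead_letter` are kept alongside their error list, so they can be counted for monitoring and reviewed later as evaluation data. The literal substring check is deliberately strict and would reject normalized values such as a reformatted timestamp, so a real system would likely relax it per field.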
