Decode a JWT string without libraries
Company: Microsoft
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: hard
Interview Round: Onsite
Quick Answer: This question tests knowledge of JWT structure, Base64URL decoding, string manipulation, and JSON parsing by having the candidate decode a token's header and payload by hand. Interviewers use it to assess practical implementation skill and the handling of edge cases and malformed input, rather than purely conceptual reasoning.
Constraints
- 0 <= len(token) <= 20000
- A valid token must contain exactly three dot-separated segments.
- The header and payload segments must be valid Base64URL strings after accounting for optional trailing `=` padding.
- Decoded header and payload bytes must be valid UTF-8 and must parse as JSON objects.
- The signature segment is not decoded or verified.
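The structural constraints above (three segments, strict UTF-8, JSON object at the top level) could be checked along these lines. This is a minimal sketch, not a reference solution. The helper names `split_segments` and `json_object_text` are hypothetical. It also assumes that the "no libraries" restriction targets Base64/JWT helpers and that the standard-library `json` module is acceptable for the parsing step, and that the expected result is the decoded text unchanged, as the examples suggest.

```python
import json
from typing import List, Optional


def split_segments(token: str) -> Optional[List[str]]:
    """Return the three dot-separated segments, or None if the count is wrong."""
    parts = token.split(".")
    return parts if len(parts) == 3 else None


def _reject_constant(name: str) -> None:
    # json.loads accepts NaN/Infinity by default; they are not valid JSON.
    raise ValueError(f"invalid JSON constant: {name}")


def json_object_text(raw: bytes) -> Optional[str]:
    """Return the decoded text if raw is UTF-8 JSON with an object at the top level."""
    try:
        text = raw.decode("utf-8")  # strict decoding: bad bytes raise an error
        parsed = json.loads(text, parse_constant=_reject_constant)
    except (ValueError, RecursionError):
        # UnicodeDecodeError and json.JSONDecodeError both subclass ValueError;
        # RecursionError covers pathologically deep nesting.
        return None
    # Assumption: the decoded text is returned as-is rather than re-serialized.
    return text if isinstance(parsed, dict) else None
```

Note that an empty token splits into a single empty segment, and a token with a trailing dot yields an empty third segment, so both of those cases fall out of the same length check.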
Examples
Input: ('eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9.c2lnbmF0dXJl',)
Expected Output: ['{"alg":"HS256","typ":"JWT"}', '{"sub":"1234567890","name":"John Doe","admin":true}']
Explanation: The first two segments decode to valid UTF-8 JSON objects. The signature is present but ignored.
Input: ('eyJhIjoxfQ.eyJiIjoiaGkifQ.c2ln',)
Expected Output: ['{"a":1}', '{"b":"hi"}']
Explanation: Both header and payload omit Base64 padding, but they still decode correctly.
Input: ('eyJlIjoi8J-ZgiJ9.e30.c2ln',)
Expected Output: ['{"e":"\U0001f642"}', '{}']
Explanation: The header contains a UTF-8 emoji, and its Base64URL representation includes the URL-safe `-` character.
Input: ('e30.e30.',)
Expected Output: ['{}', '{}']
Explanation: There are exactly three segments; the empty signature segment is allowed because signatures are not verified.
Input: ('e30.bm90IGpzb24.c2ln',)
Expected Output: None
Explanation: The payload decodes to the text `not json`, which is not valid JSON.
Input: ('e30!.e30.c2ln',)
Expected Output: None
Explanation: The header segment contains `!`, which is not part of the Base64URL alphabet.
Input: ('abc.def',)
Expected Output: None
Explanation: The token has only two segments instead of three.
Input: ('',)
Expected Output: None
Explanation: An empty string does not contain three JWT segments.
Hints
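- Base64URL characters represent 6-bit values. Process full groups of four characters into three bytes, then handle the remainder: two leftover characters yield one byte, and three leftover characters yield two bytes.
- A Base64URL segment whose unpadded length is congruent to 1 modulo 4 cannot be valid, because a single leftover character carries only 6 bits, which is less than one byte.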
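One way to turn these hints into code is sketched below. The function name `b64url_decode` is hypothetical, and two behaviors are assumptions the problem statement does not settle: padding, when present, must be one or two `=` characters that bring the segment to a multiple of four, and unused low bits in the final character are ignored rather than required to be zero.

```python
from typing import Optional

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"
SIXBIT = {ch: i for i, ch in enumerate(ALPHABET)}  # character -> 6-bit value


def b64url_decode(segment: str) -> Optional[bytes]:
    """Decode one Base64URL segment by hand; return None if it is malformed."""
    body = segment.rstrip("=")
    pad = len(segment) - len(body)
    # Assumption: optional padding must be well formed if it appears at all.
    if pad > 2 or (pad and len(segment) % 4 != 0):
        return None
    # A single leftover character holds only 6 bits, so it cannot form a byte.
    if len(body) % 4 == 1:
        return None

    values = []
    for ch in body:
        v = SIXBIT.get(ch)
        if v is None:  # character outside the Base64URL alphabet
            return None
        values.append(v)

    out = bytearray()
    full = len(values) - len(values) % 4
    # Full groups: four 6-bit values -> 24 bits -> three bytes.
    for i in range(0, full, 4):
        n = (values[i] << 18) | (values[i + 1] << 12) | (values[i + 2] << 6) | values[i + 3]
        out += bytes(((n >> 16) & 0xFF, (n >> 8) & 0xFF, n & 0xFF))

    rest = values[full:]
    if len(rest) == 2:
        # 12 bits -> one byte; the low 4 bits are ignored (assumption).
        out.append((rest[0] << 2) | (rest[1] >> 4))
    elif len(rest) == 3:
        # 18 bits -> two bytes; the low 2 bits are ignored (assumption).
        n = (rest[0] << 12) | (rest[1] << 6) | rest[2]
        out += bytes(((n >> 10) & 0xFF, (n >> 2) & 0xFF))
    return bytes(out)
```

For instance, `b64url_decode("e30")` takes the three-character remainder path and produces the two bytes of `{}`, while `b64url_decode("e30!")` returns `None` because `!` is outside the alphabet.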