Implement a polyfill for JSON.parse
A recursive descent parser: tokenize the JSON string, then parse values by type (object, array, string, number, true/false/null). Handle whitespace, escape sequences, nesting, and throw SyntaxError on malformed input. The realistic answer is explaining the parser structure, not a flawless full implementation.
JSON.parse is a recursive descent parser — and the interview is mostly about whether you can describe the parser structure clearly, not produce a bug-free 200-line implementation.
The structure: tokenize, then parse recursively
You walk the string with a position cursor, and have a function per JSON value type, each recursing into the others:
function jsonParse(str) {
let i = 0; // cursor
function skipWhitespace() {
while (i < str.length && /\s/.test(str[i])) i++;
}
function parseValue() {
skipWhitespace();
const ch = str[i];
if (ch === "{") return parseObject();
if (ch === "[") return parseArray();
if (ch === '"') return parseString();
if (ch === "-" || (ch >= "0" && ch <= "9")) return parseNumber();
if (str.startsWith("true", i)) { i += 4; return true; }
if (str.startsWith("false", i)) { i += 5; return false; }
if (str.startsWith("null", i)) { i += 4; return null; }
throw new SyntaxError(`Unexpected token at position ${i}`);
}
function parseObject() {
const obj = {};
i++; // consume '{'
skipWhitespace();
if (str[i] === "}") { i++; return obj; }
while (true) {
skipWhitespace();
const key = parseString(); // keys are JSON strings
skipWhitespace();
if (str[i] !== ":") throw new SyntaxError("Expected ':'");
i++;
obj[key] = parseValue(); // recurse
skipWhitespace();
if (str[i] === ",") { i++; continue; }
if (str[i] === "}") { i++; return obj; }
throw new SyntaxError("Expected ',' or '}'");
}
}
function parseArray() {
const arr = [];
i++; // consume '['
skipWhitespace();
if (str[i] === "]") { i++; return arr; }
while (true) {
arr.push(parseValue()); // recurse
skipWhitespace();
if (str[i] === ",") { i++; continue; }
if (str[i] === "]") { i++; return arr; }
throw new SyntaxError("Expected ',' or ']'");
}
}
// parseString — handle escape sequences (\", \\, \n, \t, \uXXXX...)
// parseNumber — handle sign, integer, fraction, exponent
const result = parseValue();
skipWhitespace();
if (i < str.length) throw new SyntaxError("Unexpected trailing characters");
return result;
}The details / gotchas
- Recursive descent —
parseValuedispatches by the next char;parseObject/parseArraycallparseValuefor their contents → handles arbitrary nesting. - Whitespace — skip it between every token.
- Strings — the fiddly part: escape sequences (
\",\\,\n,\t,\uXXXX). - Numbers — sign, integer part, optional fraction, optional exponent.
- Strict JSON — no trailing commas, no comments, no single quotes, keys must be double-quoted strings — malformed input must throw
SyntaxError. - Trailing characters after the root value → error.
- The real
JSON.parsealso takes a reviver function — mention it; usually out of scope.
The framing
"It's a recursive descent parser. I walk the string with a cursor and have a parseValue that dispatches on the next character to parseObject, parseArray, parseString, parseNumber, or the literal keywords — and object/array parsers recurse back into parseValue, which is how arbitrary nesting works. The fiddly bits are string escape sequences and number formats, skipping whitespace between tokens, and being strict — no trailing commas or comments, throw SyntaxError on anything malformed, including trailing characters after the root value. I'd describe that structure clearly rather than claim a flawless full implementation."
Follow-up questions
- •What is recursive descent parsing?
- •What makes parsing strings the tricky part?
- •What inputs are valid JSON.stringify output but invalid for a lenient parser?
- •What does the reviver function argument do?
Common mistakes
- •Just calling eval() — unsafe and not really 'parsing'.
- •Not skipping whitespace between tokens.
- •Mishandling string escape sequences.
- •Accepting invalid JSON (trailing commas, comments, single quotes).
- •Not erroring on trailing characters after the root value.
Performance considerations
- •O(n) over the input string. Deeply nested JSON can hit recursion limits; the native JSON.parse is implemented in C++ and far faster — the polyfill is an exercise in understanding parsing, not a production replacement.
Edge cases
- •Deeply nested structures (recursion depth).
- •Escape sequences and unicode \uXXXX in strings.
- •Numbers with exponents and negative signs.
- •Empty objects/arrays.
- •Malformed input — must throw SyntaxError, not return garbage.
Real-world examples
- •Understanding why a trailing comma breaks JSON.parse.
- •Recursive descent is the same structure used for many real parsers.