[recognizers] Automata, a PDA JSON Parser, a Surgical View on LLM Tool Calling
I. The Machines
A recognizer consumes input one token at a time and answers a yes/no question: does this string belong to the language? No output, no transformation - just accept or reject.
Every recognizer has three things:
- Internal state - where the machine is right now
- Transitions - rules for moving between states on input
- Acceptance condition - which states count as "yes"
a. DFA - Deterministic Finite Automaton
The simplest recognizer. A fixed set of states, a fixed set of transitions, no memory beyond which state you're in. The base class dispatches on the current state name - each state is a method:
```python
class DeterministicFiniteAutomaton:
    def consume(self, token):
        return getattr(self, self.internal)(token)
```

A Reginald recognizer accepts "Reg" and "Reggie" - each character advances to the next state, with two acceptance points. A Binary recognizer accepts valid binary numbers: "0", "1", "10", "101". Two states after start - zero only accepts the single "0" (no leading zeros), oneOrMore loops on any digit.
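A runnable sketch of the Binary recognizer described above. The base class is repeated so the sketch stands alone; the `ACCEPTING` set, the `dead` trap state, and the `accepts` driver are this sketch's own framing, not shown in the original:

```python
class DeterministicFiniteAutomaton:
    # base class repeated from above so the sketch runs standalone
    def __init__(self):
        self.internal = 'start'

    def consume(self, token):
        return getattr(self, self.internal)(token)


class Binary(DeterministicFiniteAutomaton):
    ACCEPTING = {'zero', 'oneOrMore'}     # the two acceptance points

    def start(self, token):
        if token == '0':
            self.internal = 'zero'        # lone "0", no leading zeros
        elif token == '1':
            self.internal = 'oneOrMore'
        else:
            self.internal = 'dead'

    def zero(self, token):
        self.internal = 'dead'            # nothing may follow a lone "0"

    def oneOrMore(self, token):
        if token not in '01':
            self.internal = 'dead'        # otherwise loop on any binary digit

    def dead(self, token):
        pass                              # trap state: reject from here on

    def accepts(self, text):
        self.internal = 'start'
        for ch in text:
            self.consume(ch)
        return self.internal in self.ACCEPTING
```

Note there is no counter anywhere - the machine's entire memory is the name of the current state.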
What DFAs cannot do: they have no memory. They can't count. They can't match brackets. They can't recognize a^n b^n because that requires remembering how many a's you saw.
b. DPDA - Deterministic Pushdown Automaton
A DFA plus a stack. The stack gives you memory - push, pop, peek, check if empty.
```python
class DeterministicPushdownAutomaton:
    def __init__(self):
        self.internal = 'start'
        self.external = []  # the stack
```

The classic example: BalancedParentheses. Handles (), [], {} with nesting. See an opener? Push. See a closer? Check the top matches, pop. Reach END? Accept only if the stack is empty. Single state, all logic in the stack.
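A sketch of BalancedParentheses, flattened into a single `accepts` method (the early return on mismatch and the driver loop are this sketch's framing; the original keeps an internal state name as well):

```python
class BalancedParentheses:
    # one state; every decision reads or writes the stack
    PAIRS = {')': '(', ']': '[', '}': '{'}

    def accepts(self, text):
        self.external = []                    # the stack
        for ch in text:
            if ch in '([{':
                self.external.append(ch)      # opener: push
            elif ch in self.PAIRS:
                if not self.external or self.external.pop() != self.PAIRS[ch]:
                    return False              # closer with no matching opener
            else:
                return False                  # not part of the alphabet
        return not self.external              # END: accept iff stack is empty
```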
II. Parser Combinators
A recognizer says yes or no. A parser says yes or no and returns what it matched. The building block is a matcher - a function that takes text and returns the matched portion, or False:
```python
def literal(target, text):
    return target if text.startswith(target) else False
```

Combinators compose matchers. sequence chains them - all must succeed, each picks up where the last left off. longest tries all on the same input and returns the longest match.
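Possible implementations of sequence and longest, consistent with the literal signature above. The convention that a matcher returns the matched prefix string, or False, is assumed throughout:

```python
from functools import partial

def literal(target, text):
    return target if text.startswith(target) else False

def sequence(*matchers):
    # all must succeed; each picks up where the last left off
    def match(text):
        consumed = ""
        for m in matchers:
            result = m(text[len(consumed):])
            if result is False:
                return False
            consumed += result
        return consumed
    return match

def longest(*matchers):
    # try all on the same input, keep the longest successful match
    def match(text):
        results = [r for r in (m(text) for m in matchers) if r is not False]
        return max(results, key=len) if results else False
    return match
```

For example, `sequence(partial(literal, "a"), partial(literal, "b"))("abc")` returns "ab".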
This gets powerful with recursion. A balanced-parentheses grammar:
```python
def balanced(text):
    return longest(
        partial(literal, "()"),
        sequence(partial(literal, "()"), balanced),
        sequence(partial(literal, "("), balanced, partial(literal, ")")),
        sequence(partial(literal, "("), balanced, partial(literal, ")"), balanced),
    )(text)
```

Same job as the DPDA BalancedParentheses class, but through function composition instead of explicit state and stack. Combinators are declarative and mirror the grammar directly. Automata give you explicit control and work better for streaming.
III. The Chomsky Hierarchy
These machines map to formal language classes. Each level strictly includes the one below:
| Machine | Language class | Example |
|---|---|---|
| DFA | Regular | "Reggie", binary numbers |
| DPDA | Context-free | balanced brackets, JSON |
| LBA | Context-sensitive | a^n b^n c^n |
| TM | Recursively enumerable | anything computable |
JSON is context-free - it needs a stack for nesting but nothing more. This is why a PDA (or recursive descent, which implicitly uses the call stack) is exactly the right tool for parsing it.
IV. A JSON Parser by Hand
The JSON grammar is small - 6 value types, 9 token types - but exercises every important parsing concept: recursion, escaping, number formats, whitespace.
```
value    -> object | array | string | number | "true" | "false" | "null"
object   -> "{" "}" | "{" members "}"
members  -> pair | pair "," members
pair     -> string ":" value
array    -> "[" "]" | "[" elements "]"
elements -> value | value "," elements
```
Six productions. Each one maps to a function.
a. Scanner + Parser
Split into two passes. The scanner walks raw characters and emits tokens:
```
{"name": "Adam", "age": 25}
-> LEFT_BRACE STRING("name") COLON STRING("Adam")
   COMMA STRING("age") COLON NUMBER(25) RIGHT_BRACE
```
The parser then walks the token list and builds the value tree. It never sees whitespace or escape sequences - just clean tokens. Clean separation, easy error messages with positions.
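A minimal scanner along these lines - escape sequences and error positions are omitted, and the token kinds follow the example above (the `(kind, value)` tuple shape is this sketch's choice):

```python
def scan(src):
    # walk raw characters, emit (kind, value) tokens
    tokens, i = [], 0
    punct = {'{': 'LEFT_BRACE', '}': 'RIGHT_BRACE', '[': 'LEFT_BRACKET',
             ']': 'RIGHT_BRACKET', ':': 'COLON', ',': 'COMMA'}
    while i < len(src):
        ch = src[i]
        if ch in ' \t\n\r':
            i += 1                                  # whitespace: skip silently
        elif ch in punct:
            tokens.append((punct[ch], ch)); i += 1
        elif ch == '"':
            j = src.index('"', i + 1)               # no escape handling in this sketch
            tokens.append(('STRING', src[i + 1:j])); i = j + 1
        elif ch.isdigit() or ch == '-':
            j = i + 1
            while j < len(src) and (src[j].isdigit() or src[j] in '.eE+-'):
                j += 1
            tokens.append(('NUMBER', float(src[i:j]))); i = j
        else:                                       # true / false / null keywords
            for word, kind in (('true', 'TRUE'), ('false', 'FALSE'), ('null', 'NULL')):
                if src.startswith(word, i):
                    tokens.append((kind, word)); i += len(word); break
            else:
                raise ValueError(f"unexpected character at {i}: {ch!r}")
    return tokens
```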
b. Recursive Descent
Single pass. Each grammar rule becomes a function. Functions call each other recursively:
```python
def parse_value(s, i):
    i = skip_ws(s, i)
    if s[i] == '"': return parse_string(s, i)
    if s[i] == '{': return parse_object(s, i)
    if s[i] == '[': return parse_array(s, i)
    if s[i] == 't': return parse_true(s, i)
    if s[i] == 'f': return parse_false(s, i)
    if s[i] == 'n': return parse_null(s, i)
    return parse_number(s, i)
```

Every function takes the source string and current position, returns the parsed value and new position. No intermediate token list. Minimal code, fast, direct mapping from grammar to code.
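To make the recursion concrete, here is a sketch of the array rule together with a stub parse_value that covers only numbers and nested arrays (parse_string, parse_object, and the rest are elided; the stub is this sketch's simplification, not the full parser):

```python
def skip_ws(s, i):
    while i < len(s) and s[i] in ' \t\n\r':
        i += 1
    return i

def parse_number(s, i):
    j = i
    while j < len(s) and (s[j].isdigit() or s[j] in '-+.eE'):
        j += 1
    return float(s[i:j]), j

def parse_value(s, i):
    # stub: only numbers and arrays, enough to show the recursion
    i = skip_ws(s, i)
    if s[i] == '[':
        return parse_array(s, i)
    return parse_number(s, i)

def parse_array(s, i):
    # array -> "[" "]" | "[" elements "]"
    i = skip_ws(s, i + 1)                 # consume '['
    items = []
    if s[i] == ']':
        return items, i + 1               # empty array
    while True:
        value, i = parse_value(s, i)      # recursion handles nesting
        items.append(value)
        i = skip_ws(s, i)
        if s[i] == ']':
            return items, i + 1
        if s[i] != ',':
            raise ValueError(f"expected ',' or ']' at position {i}")
        i += 1                            # consume ','
```

The nesting depth lives on the Python call stack - the implicit PDA stack the hierarchy section mentioned.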
V. The Thread to LLM Tool Calling
When an LLM calls a tool, it emits structured JSON - function name, arguments, types. The API parses that JSON back into objects the runtime can dispatch. This is exactly the pipeline we've been building.
An LLM tool call response looks like:
```json
{
  "name": "lookup_order",
  "arguments": "{\"order_id\": \"12345\"}"
}
```

The runtime does json.loads(tool_call.function.arguments) to get a dict, looks up the function by name, unpacks the arguments, and calls it. Then packs the result back as a tool-result message for the next turn.
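The dispatch step, sketched with a hypothetical tool table - lookup_order here is a stand-in implementation, and the message shape is an assumption, not any particular API's format:

```python
import json

def lookup_order(order_id):
    # hypothetical tool implementation, stand-in for a real lookup
    return {"order_id": order_id, "status": "shipped"}

TOOLS = {"lookup_order": lookup_order}

def dispatch(tool_call):
    fn = TOOLS[tool_call["name"]]                 # look up the function by name
    args = json.loads(tool_call["arguments"])     # parse the arguments string
    result = fn(**args)                           # unpack and call
    return {                                      # pack as a tool-result message
        "role": "tool",
        "name": tool_call["name"],
        "content": json.dumps(result),
    }
```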
The connection to recognizers:
- The LLM generates strings in a formal language (JSON conforming to a tool schema)
- The runtime recognizes and parses those strings to extract structured data
- Schema validation is recognition - does this JSON belong to the language defined by the schema?
- Argument extraction is parsing - transform the recognized string into a callable form
When constrained decoding (structured output, grammar-guided generation) forces the LLM to only emit valid JSON, the LLM's token sampler is literally acting as a recognizer - it rejects any token that would leave the string outside the grammar. The sampler is a pushdown automaton.
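The masking idea can be sketched on the toy bracket language from earlier - a real implementation tracks the full JSON grammar and operates on the model's token vocabulary; valid_prefix and allowed_tokens are this sketch's names:

```python
def valid_prefix(s):
    # could s still be extended into a balanced-bracket string?
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in s:
        if ch in '([{':
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False      # already outside the language
    return True                   # leftover openers are fine in a prefix

def allowed_tokens(prefix, vocabulary):
    # the mask step: reject any token that would leave the string outside the grammar
    return [t for t in vocabulary if valid_prefix(prefix + t)]
```

At each step the sampler only ever sees the tokens that keep the output a valid prefix - recognition running one token ahead of generation.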
The agentic loop itself mirrors the automaton pattern:
| Automaton | Agent |
|---|---|
| State | Conversation history |
| Input token | User message / tool result |
| Transition | LLM inference (model call) |
| Stack (PDA) | Task decomposition / sub-agent calls |
| Tape (TM) | Scratchpad / artifact memory |
| Accept | Final answer / task complete |
This is the thesis from terminal2F's design: an agent's computational class is determined by its memory architecture. A simple loop with bounded context is an FSM. Add a call stack (sub-agents, task decomposition) and you get a PDA. Add read/write scratchpad memory and you get a Turing machine.
Recognizers aren't just an academic exercise. They're the exact machinery running underneath every tool call your agent makes.