[recognizers] Automata, a PDA JSON Parser, a Surgical View on LLM Tool Calling
I. The Machines
A recognizer consumes input one token at a time and answers a yes/no question: does this string belong to the language? No output, no transformation - just accept or reject.
Every recognizer has three things:
- Internal state - where the machine is right now
- Transitions - rules for moving between states on input
- Acceptance condition - which states count as "yes"
a. DFA - Deterministic Finite Automaton
The simplest recognizer. A fixed set of states, a fixed set of transitions, no memory beyond which state you're in. The base class dispatches on the current state name - each state is a method:
```python
class DeterministicFiniteAutomaton:
    def consume(self, token):
        return getattr(self, self.internal)(token)
```

A Reginald recognizer accepts "Reg" and "Reggie" - each character advances to the next state, with two acceptance points. A Binary recognizer accepts valid binary numbers: "0", "1", "10", "101". Two states after start - zero only accepts the single "0" (no leading zeros), oneOrMore loops on any digit.
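A runnable sketch of the Binary recognizer described above. The base class is repeated so the sketch stands alone; the `ACCEPTING` set, the `dead` trap state, and the `accepts` driver are this sketch's own framing, not shown in the original:

```python
class DeterministicFiniteAutomaton:
    # base class repeated from above so the sketch runs standalone
    def __init__(self):
        self.internal = 'start'

    def consume(self, token):
        return getattr(self, self.internal)(token)


class Binary(DeterministicFiniteAutomaton):
    ACCEPTING = {'zero', 'oneOrMore'}     # the two acceptance points

    def start(self, token):
        if token == '0':
            self.internal = 'zero'        # lone "0", no leading zeros
        elif token == '1':
            self.internal = 'oneOrMore'
        else:
            self.internal = 'dead'

    def zero(self, token):
        self.internal = 'dead'            # nothing may follow a lone "0"

    def oneOrMore(self, token):
        if token not in '01':
            self.internal = 'dead'        # otherwise loop on any binary digit

    def dead(self, token):
        pass                              # trap state: reject from here on

    def accepts(self, text):
        self.internal = 'start'
        for ch in text:
            self.consume(ch)
        return self.internal in self.ACCEPTING
```

Note there is no counter anywhere - the machine's entire memory is the name of the current state.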
What DFAs cannot do: they have no memory. They can't count. They can't match brackets. They can't recognize a^n b^n because that requires remembering how many a's you saw.
b. DPDA - Deterministic Pushdown Automaton
A DFA plus a stack. The stack gives you memory - push, pop, peek, check if empty.
```python
class DeterministicPushdownAutomaton:
    def __init__(self):
        self.internal = 'start'
        self.external = []  # the stack
```

The classic example: BalancedParentheses. Handles (), [], {} with nesting. See an opener? Push. See a closer? Check the top matches, pop. Reach END? Accept only if the stack is empty. Single state, all logic in the stack.
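A sketch of BalancedParentheses, flattened into a single `accepts` method (the early return on mismatch and the driver loop are this sketch's framing; the original keeps an internal state name as well):

```python
class BalancedParentheses:
    # one state; every decision reads or writes the stack
    PAIRS = {')': '(', ']': '[', '}': '{'}

    def accepts(self, text):
        self.external = []                    # the stack
        for ch in text:
            if ch in '([{':
                self.external.append(ch)      # opener: push
            elif ch in self.PAIRS:
                if not self.external or self.external.pop() != self.PAIRS[ch]:
                    return False              # closer with no matching opener
            else:
                return False                  # not part of the alphabet
        return not self.external              # END: accept iff stack is empty
```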
II. Parser Combinators
A recognizer says yes or no. A parser says yes or no and returns what it matched. The building block is a matcher - a function that takes text and returns the matched portion, or False:
```python
def literal(target, text):
    return target if text.startswith(target) else False
```

Combinators compose matchers. sequence chains them - all must succeed, each picks up where the last left off. longest tries all on the same input and returns the longest match.
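Possible implementations of sequence and longest, consistent with the literal signature above. The convention that a matcher returns the matched prefix string, or False, is assumed throughout:

```python
from functools import partial

def literal(target, text):
    return target if text.startswith(target) else False

def sequence(*matchers):
    # all must succeed; each picks up where the last left off
    def match(text):
        consumed = ""
        for m in matchers:
            result = m(text[len(consumed):])
            if result is False:
                return False
            consumed += result
        return consumed
    return match

def longest(*matchers):
    # try all on the same input, keep the longest successful match
    def match(text):
        results = [r for r in (m(text) for m in matchers) if r is not False]
        return max(results, key=len) if results else False
    return match
```

For example, `sequence(partial(literal, "a"), partial(literal, "b"))("abc")` returns "ab".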
This gets powerful with recursion. A balanced-parentheses grammar:
```python
def balanced(text):
    return longest(
        partial(literal, "()"),
        sequence(partial(literal, "()"), balanced),
        sequence(partial(literal, "("), balanced, partial(literal, ")")),
        sequence(partial(literal, "("), balanced, partial(literal, ")"), balanced),
    )(text)
```

Same job as the DPDA BalancedParentheses class, but through function composition instead of explicit state and stack. Combinators are declarative and mirror the grammar directly. Automata give you explicit control and work better for streaming.
III. The Chomsky Hierarchy
These machines map to formal language classes. Each level strictly includes the one below:
| Machine | Language class | Example |
|---|---|---|
| DFA | Regular | "Reggie", binary numbers |
| DPDA | Context-free | balanced brackets, JSON |
| LBA | Context-sensitive | a^n b^n c^n |
| TM | Recursively enumerable | anything computable |
JSON is context-free - it needs a stack for nesting but nothing more. This is why a PDA (or recursive descent, which implicitly uses the call stack) is exactly the right tool for parsing it.
IV. A JSON Parser by Hand
The JSON grammar is small - 6 value types, 9 token types - but exercises every important parsing concept: recursion, escaping, number formats, whitespace.
```
value    -> object | array | string | number | "true" | "false" | "null"
object   -> "{" "}" | "{" members "}"
members  -> pair | pair "," members
pair     -> string ":" value
array    -> "[" "]" | "[" elements "]"
elements -> value | value "," elements
```
Six productions. Each one maps to a function.
a. Scanner + Parser
Split into two passes. The scanner walks raw characters and emits tokens:
```
{"name": "Adam", "age": 25}
-> LEFT_BRACE STRING("name") COLON STRING("Adam")
   COMMA STRING("age") COLON NUMBER(25) RIGHT_BRACE
```
The parser then walks the token list and builds the value tree. It never sees whitespace or escape sequences - just clean tokens. Clean separation, easy error messages with positions.
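A minimal scanner along these lines - escape sequences and error positions are omitted, and the token kinds follow the example above (the `(kind, value)` tuple shape is this sketch's choice):

```python
def scan(src):
    # walk raw characters, emit (kind, value) tokens
    tokens, i = [], 0
    punct = {'{': 'LEFT_BRACE', '}': 'RIGHT_BRACE', '[': 'LEFT_BRACKET',
             ']': 'RIGHT_BRACKET', ':': 'COLON', ',': 'COMMA'}
    while i < len(src):
        ch = src[i]
        if ch in ' \t\n\r':
            i += 1                                  # whitespace: skip silently
        elif ch in punct:
            tokens.append((punct[ch], ch)); i += 1
        elif ch == '"':
            j = src.index('"', i + 1)               # no escape handling in this sketch
            tokens.append(('STRING', src[i + 1:j])); i = j + 1
        elif ch.isdigit() or ch == '-':
            j = i + 1
            while j < len(src) and (src[j].isdigit() or src[j] in '.eE+-'):
                j += 1
            tokens.append(('NUMBER', float(src[i:j]))); i = j
        else:                                       # true / false / null keywords
            for word, kind in (('true', 'TRUE'), ('false', 'FALSE'), ('null', 'NULL')):
                if src.startswith(word, i):
                    tokens.append((kind, word)); i += len(word); break
            else:
                raise ValueError(f"unexpected character at {i}: {ch!r}")
    return tokens
```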
b. Recursive Descent
Single pass. Each grammar rule becomes a function. Functions call each other recursively:
```python
def parse_value(s, i):
    i = skip_ws(s, i)
    if s[i] == '"': return parse_string(s, i)
    if s[i] == '{': return parse_object(s, i)
    if s[i] == '[': return parse_array(s, i)
    if s[i] == 't': return parse_true(s, i)
    if s[i] == 'f': return parse_false(s, i)
    if s[i] == 'n': return parse_null(s, i)
    return parse_number(s, i)
```

Every function takes the source string and current position, returns the parsed value and new position. No intermediate token list. Minimal code, fast, direct mapping from grammar to code.
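To make the recursion concrete, here is a sketch of the array rule together with a stub parse_value that covers only numbers and nested arrays (parse_string, parse_object, and the rest are elided; the stub is this sketch's simplification, not the full parser):

```python
def skip_ws(s, i):
    while i < len(s) and s[i] in ' \t\n\r':
        i += 1
    return i

def parse_number(s, i):
    j = i
    while j < len(s) and (s[j].isdigit() or s[j] in '-+.eE'):
        j += 1
    return float(s[i:j]), j

def parse_value(s, i):
    # stub: only numbers and arrays, enough to show the recursion
    i = skip_ws(s, i)
    if s[i] == '[':
        return parse_array(s, i)
    return parse_number(s, i)

def parse_array(s, i):
    # array -> "[" "]" | "[" elements "]"
    i = skip_ws(s, i + 1)                 # consume '['
    items = []
    if s[i] == ']':
        return items, i + 1               # empty array
    while True:
        value, i = parse_value(s, i)      # recursion handles nesting
        items.append(value)
        i = skip_ws(s, i)
        if s[i] == ']':
            return items, i + 1
        if s[i] != ',':
            raise ValueError(f"expected ',' or ']' at position {i}")
        i += 1                            # consume ','
```

The nesting depth lives on the Python call stack - the implicit PDA stack the hierarchy section mentioned.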
V. The Thread to LLM Tool Calling
When an LLM calls a tool, it emits structured JSON - function name, arguments, types. The API parses that JSON back into objects the runtime can dispatch. This is exactly the pipeline we've been building.
An LLM tool call response looks like:
```json
{
  "name": "lookup_order",
  "arguments": "{\"order_id\": \"12345\"}"
}
```

The runtime does json.loads(tool_call.function.arguments) to get a dict, looks up the function by name, unpacks the arguments, and calls it. Then packs the result back as a tool-result message for the next turn.
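The dispatch step, sketched with a hypothetical tool table - lookup_order here is a stand-in implementation, and the message shape is an assumption, not any particular API's format:

```python
import json

def lookup_order(order_id):
    # hypothetical tool implementation, stand-in for a real lookup
    return {"order_id": order_id, "status": "shipped"}

TOOLS = {"lookup_order": lookup_order}

def dispatch(tool_call):
    fn = TOOLS[tool_call["name"]]                 # look up the function by name
    args = json.loads(tool_call["arguments"])     # parse the arguments string
    result = fn(**args)                           # unpack and call
    return {                                      # pack as a tool-result message
        "role": "tool",
        "name": tool_call["name"],
        "content": json.dumps(result),
    }
```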
The connection to recognizers:
- The LLM generates strings in a formal language (JSON conforming to a tool schema)
- The runtime recognizes and parses those strings to extract structured data
- Schema validation is recognition - does this JSON belong to the language defined by the schema?
- Argument extraction is parsing - transform the recognized string into a callable form
When constrained decoding (structured output, grammar-guided generation) forces the LLM to only emit valid JSON, the LLM's token sampler is literally acting as a recognizer - it rejects any token that would leave the string outside the grammar. The sampler is a pushdown automaton.
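The masking idea can be sketched on the toy bracket language from earlier - a real implementation tracks the full JSON grammar and operates on the model's token vocabulary; valid_prefix and allowed_tokens are this sketch's names:

```python
def valid_prefix(s):
    # could s still be extended into a balanced-bracket string?
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in s:
        if ch in '([{':
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False      # already outside the language
    return True                   # leftover openers are fine in a prefix

def allowed_tokens(prefix, vocabulary):
    # the mask step: reject any token that would leave the string outside the grammar
    return [t for t in vocabulary if valid_prefix(prefix + t)]
```

At each step the sampler only ever sees the tokens that keep the output a valid prefix - recognition running one token ahead of generation.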
The agentic loop itself mirrors the automaton pattern:
| Automaton | Agent |
|---|---|
| State | Conversation history |
| Input token | User message / tool result |
| Transition | LLM inference (model call) |
| Stack (PDA) | Task decomposition / sub-agent calls |
| Tape (TM) | Scratchpad / artifact memory |
| Accept | Final answer / task complete |
This is the thesis from terminal2F's design: an agent's computational class is determined by its memory architecture. A simple loop with bounded context is an FSM. Add a call stack (sub-agents, task decomposition) and you get a PDA. Add read/write scratchpad memory and you get a Turing machine.
Recognizers aren't just an academic exercise. They're the exact machinery running underneath every tool call your agent makes.