Panda Parse is a general-purpose parser library that helps you convert text into structured meaning — known as an Abstract Syntax Tree (AST). It’s designed for building custom languages, expression evaluators, config parsers, style DSLs, and more.
An AST (Abstract Syntax Tree) is a structured representation of your input — like a nested object that reflects the grammar of the language you're parsing.
For example, parsing this expression:
2 + 3
...might produce this AST:
{
  type: "Add",
  left: { type: "Number", value: 2 },
  right: { type: "Number", value: 3 }
}

Once you have an AST, you can:
- Compile it into another language
- Evaluate it directly
- Transform it into another format
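As a rough illustration of the "evaluate it directly" option, the sketch below walks the example AST from above in plain JavaScript. It does not use Panda Parse; the evaluate function and the node layout (type, left, right, value) are assumptions taken from the sample object, not part of the library.

```javascript
// Hypothetical evaluator for the sample AST shape shown above.
function evaluate(node) {
  switch (node.type) {
    case "Number":
      return node.value;
    case "Add":
      // Evaluate both children first, then combine them.
      return evaluate(node.left) + evaluate(node.right);
    default:
      throw new Error(`Unknown node type: ${node.type}`);
  }
}

const tree = {
  type: "Add",
  left: { type: "Number", value: 2 },
  right: { type: "Number", value: 3 },
};

console.log(evaluate(tree)); // 5
```

Compiling or transforming would follow the same recursive walk, returning strings or new nodes instead of numbers.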
Panda Parse makes it easy to build these kinds of trees, using simple class definitions and grammar rules.
You can install Panda Parse via npm:
npm install panda-parse

To start using Panda Parse in your project, import the core classes:

import { $AST, Lexer, Shape } from "panda-parse";

These are the three essential pieces:
- Lexer: splits the input into a stream of tokens
- $AST: base class for your custom syntax tree nodes
- Shape: defines the grammar pattern for each AST node

Note that Panda Parse uses the $NAME naming convention for all AST classes.
Let’s build a simple parser that recognizes whole numbers.
class $NUMBER extends $AST {
  static SHAPE = new Shape(/^\d+/); // Match one or more digits
}

This creates an AST class that matches numeric strings like "42" or "123". The shape here is a regular expression, but plain strings work too.
const lexer = new Lexer("42");
const ast = $NUMBER.parse(lexer);
console.log(ast.text); // Output: "42"

Here’s what’s happening:
- Lexer("42") creates a stream of tokens starting at the beginning of the input
- $NUMBER.parse(...) tries to match the shape from the current lexer position
- The result is an AST node with the .text value "42"
Now that you've built a basic number parser, let’s expand our grammar to support binary expressions like 2 + 3.
We’ll define an AST node for a + b where both sides are numbers.
class $ADD extends $AST {
  static SHAPE = new Shape($NUMBER, "+", $NUMBER);
}

This shape matches:
- a $NUMBER
- the "+" symbol
- another $NUMBER
You can now parse:
const ast = $ADD.parse(new Lexer("2+3"));
console.log(ast.contentExps.map((e) => e.text)); // ["2", "+", "3"]

The example above maps over contentExps, which holds all the sub-expressions in the AST (see the $AST API documentation below for more info).
Each part of the shape corresponds to a token or sub-expression:
- contentExps[0] → left-hand number
- contentExps[1] → the "+" operator
- contentExps[2] → right-hand number
You can optionally give your AST nodes a method to evaluate or transform the tree:
class $ADD extends $AST {
  static SHAPE = new Shape($NUMBER, "+", $NUMBER);
  toJS() {
    const [left, , right] = this.contentExps;
    return Number(left.text) + Number(right.text);
  }
}
console.log($ADD.parse(new Lexer("10+20")).toJS()); // 30

This is useful for compiling, interpreting, or transforming your language.
Let’s define a similar node for multiplication:
class $MULTIPLY extends $AST {
  static SHAPE = new Shape($NUMBER, "*", $NUMBER);
}

You can now parse:

const ast = $MULTIPLY.parse(new Lexer("4*5"));
console.log(ast.contentExps.map((e) => e.text)); // ["4", "*", "5"]

By default, whitespace is ignored between tokens. So all of these will work:
2+3
2 + 3
2  +  3
No extra setup needed — Panda Parse handles this for you.
If you want to support chained expressions like 1 + 2 + 3, you can make your class recursive by referencing this in its own shape:
class $ADD extends $AST {
  static SHAPE = new Shape($NUMBER, "+", this);
}

This allows inputs like:

const ast = $ADD.parse(new Lexer("1+2+3"));

You’ve now built:
- A number matcher
- An addition AST node
- A multiplication AST node
- A recursive version of addition
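The recursive $ADD shape relies on a plain JavaScript rule rather than anything specific to Panda Parse: inside a static field initializer, this refers to the class being defined. A minimal check, using a hypothetical class name:

```javascript
class Recursive {
  // Inside a static field initializer, `this` is the class itself.
  static SELF = this;
}

console.log(Recursive.SELF === Recursive); // true
```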
In the next section, you’ll learn how to build grouped expressions like (1 + 2) and how to compose a full grammar that supports all operations.
In this final section of the beginner tutorial, you’ll build support for parentheses, then tie everything together into a complete expression parser that can handle numbers, operators, and groups like (1 + 2) * 3.
We want to support input like:
(1 + 2)
To do this, we create a new AST class that expects:
- a "("
- a full expression
- a ")"

class $GROUP extends $AST {
  static SHAPE = new Shape("(", () => $EXPR, ")");
}

This tells the parser: “wrap another expression inside parentheses.”
Notice the arrow function () => $EXPR. Because $EXPR isn't defined yet (we will define it in the next section), the arrow function lets the shape look it up lazily. This helps when expressions depend on each other, as $GROUP and $EXPR do.
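The lazy lookup works because of how JavaScript closures behave, which can be seen without the library. In this sketch, Group, Expr, and PARTS are hypothetical stand-ins for two grammar classes that refer to each other; they are not Panda Parse names.

```javascript
// Hypothetical stand-ins for two mutually dependent grammar classes.
class Group {
  // Storing a function delays the lookup of `Expr` until the function is called.
  // Writing `Expr` directly here would throw a ReferenceError, because
  // the class below has not been evaluated yet.
  static PARTS = ["(", () => Expr, ")"];
}

class Expr {
  static PARTS = [Group];
}

// By the time the arrow function runs, both classes exist.
const resolved = Group.PARTS[1]();
console.log(resolved === Expr); // true
```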
We’ve built multiple AST node types: $NUMBER, $ADD, $MULTIPLY, and $GROUP. Now we create a top-level node that tries them all.
class $EXPR extends $AST {
  static SHAPE = new Shape([$GROUP, $ADD, $MULTIPLY, $NUMBER]);
}

This means $EXPR will try matching:
- A group like (1 + 2)
- An addition like 1 + 2
- A multiplication like 2 * 3
- A plain number like 42
Panda Parse will try each one in order and return the first successful match.
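To make "first successful match" concrete, here is a small ordered-choice sketch in plain JavaScript. It is not Panda Parse's implementation; the matcher functions and firstMatch are hypothetical, and each matcher simply returns a result object or null.

```javascript
// Each hypothetical matcher takes (input, pos) and returns { type, end } or null.
const number = (input, pos) => {
  const m = /^\d+/.exec(input.slice(pos));
  return m ? { type: "Number", end: pos + m[0].length } : null;
};

const addition = (input, pos) => {
  const m = /^\d+\+\d+/.exec(input.slice(pos));
  return m ? { type: "Add", end: pos + m[0].length } : null;
};

// Try the alternatives in order; the first one that matches wins.
function firstMatch(alternatives, input, pos = 0) {
  for (const alt of alternatives) {
    const result = alt(input, pos); // every attempt restarts from `pos`
    if (result) return result;
  }
  return null;
}

console.log(firstMatch([addition, number], "1+2").type); // Add
console.log(firstMatch([number, addition], "1+2").type); // Number (stops after "1")
console.log(firstMatch([addition, number], "x"));        // null
```

The second call shows why order matters: a shorter alternative listed first can win before a longer one is ever tried.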
Now you can parse things like:
$EXPR.parse(new Lexer("3 + 4")); // Addition
$EXPR.parse(new Lexer("2 * 5")); // Multiplication
$EXPR.parse(new Lexer("(1 + 2)")); // Grouped expression
$EXPR.parse(new Lexer("(1 + 2) * 3")); // But wait... what about this?

Panda Parse tries expressions in the order you define them, so if $ADD comes before $MULTIPLY, it is tried first. It doesn’t handle operator precedence unless you design the grammar for it.
To handle real operator precedence (like * before +), you’ll need to:
- Create multiple expression layers (e.g. $TERM, $FACTOR)
- Parse based on priority
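The layering idea can be sketched as a plain recursive-descent parser, written here without Panda Parse so that only the structure is shown. The function names (parseExpression, expr, term, factor) and the node layout are assumptions for illustration; in Panda Parse each layer would presumably become its own $AST class.

```javascript
// Plain recursive descent: one function per priority layer.
function parseExpression(source) {
  // Simplified tokenizer: characters other than digits, +, *, ( and ) are skipped.
  const tokens = source.match(/\d+|[+*()]/g) ?? [];
  let pos = 0;

  // expr := term ("+" term)*   (lowest priority)
  function expr() {
    let node = term();
    while (tokens[pos] === "+") {
      pos++;
      node = { type: "Add", left: node, right: term() };
    }
    return node;
  }

  // term := factor ("*" factor)*   (binds tighter than "+")
  function term() {
    let node = factor();
    while (tokens[pos] === "*") {
      pos++;
      node = { type: "Multiply", left: node, right: factor() };
    }
    return node;
  }

  // factor := number | "(" expr ")"
  function factor() {
    const token = tokens[pos++];
    if (token === "(") {
      const inner = expr();
      if (tokens[pos++] !== ")") throw new Error('Expected ")"');
      return inner;
    }
    if (/^\d+$/.test(token ?? "")) return { type: "Number", value: Number(token) };
    throw new Error(`Unexpected token: ${token}`);
  }

  const tree = expr();
  if (pos < tokens.length) throw new Error(`Unexpected token: ${tokens[pos]}`);
  return tree;
}

function evaluate(node) {
  if (node.type === "Number") return node.value;
  const left = evaluate(node.left);
  const right = evaluate(node.right);
  return node.type === "Add" ? left + right : left * right;
}

console.log(evaluate(parseExpression("1 + 2 * 3")));   // 7
console.log(evaluate(parseExpression("(1 + 2) * 3"))); // 9
```

Because term is called from inside expr, every multiplication is grouped before the surrounding addition sees it, which is what gives * its higher priority.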
You now have a working expression parser that supports:
- Numbers: 42
- Addition: 1 + 2
- Multiplication: 3 * 4
- Grouping: (1 + 2)
- Chaining: 1 + 2 + 3
With this foundation, you can:
- Add new operators (-, /, ^, &&, etc.)
- Add functions: sum(1, 2)
- Add variables or identifiers: x + y * z
Panda Parse allows you to repeat a single shape element multiple times using { min, max } options.
This is useful for matching lists, sequences, or repeated patterns with control over how many times they must appear.
You can pass an options object directly after a shape term in your Shape definition:
new Shape(Term, { min: 1, max: 5 });

This tells the parser:
- Try to match Term repeatedly
- Match at least 1 time
- Match at most 5 times
If fewer than min matches occur, the parse will fail. If more than max matches are found, the parser will stop consuming after max matches.
class $LIST extends $AST {
  static allowIncompleteParse = true;
  static SHAPE = new Shape("[", $NUMBER, { min: 1, max: Infinity }, "]");
}

This shape matches:
- an opening bracket [
- one or more $NUMBER nodes
- a closing bracket ]

These inputs match:
[1]
[1 2 3]
[10 20 30 40 50]

These inputs do not match:
[]
[ ]

They fail because min: 1 requires at least one $NUMBER inside the brackets.
- This syntax works for any shape element, whether it's a regex, string, or AST class.
- You can also use this to enforce exact counts (e.g. { min: 2, max: 2 } requires exactly two).
- Repeated elements are parsed in sequence, back to back, until the limit is reached or a non-matching token appears.
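The min/max rules described above can be modeled in a few lines of plain JavaScript. This is only a sketch of the described behavior, not the library's code: matchRepeated and numberThenSpaces are hypothetical helpers, and a matcher here returns the end offset of its match or -1.

```javascript
// Hypothetical helper: apply `matchOne` repeatedly, honoring { min, max }.
function matchRepeated(matchOne, input, pos, { min = 0, max = Infinity } = {}) {
  let count = 0;
  let cursor = pos;
  while (count < max) {
    const end = matchOne(input, cursor);
    if (end === -1 || end === cursor) break; // no match, or no progress
    count++;
    cursor = end;
  }
  // Fewer than `min` matches means the whole repetition fails.
  return count >= min ? { count, end: cursor } : null;
}

// Matches one number plus any trailing whitespace.
const numberThenSpaces = (input, pos) => {
  const m = /^\d+\s*/.exec(input.slice(pos));
  return m ? pos + m[0].length : -1;
};

console.log(JSON.stringify(matchRepeated(numberThenSpaces, "1 2 3]", 0, { min: 1 })));
// {"count":3,"end":5}
console.log(JSON.stringify(matchRepeated(numberThenSpaces, "1 2 3]", 0, { min: 1, max: 2 })));
// {"count":2,"end":4}
console.log(matchRepeated(numberThenSpaces, "]", 0, { min: 1 }));
// null
```

Note how the max: 2 call stops consuming after two matches and leaves the rest of the input for whatever comes next in the shape.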
Panda Parse allows for flexible matching, especially useful in live coding environments, REPLs, or when building interactive tools like editors and validators.
These two static fields can be set on any $AST subclass to enable partial parsing:
static allowIncompleteParse = true;

If enabled, the parser will accept a partially matched node, even if not all parts of the SHAPE succeed, as long as the threshold (below) is met.
This allows you to parse incomplete or in-progress code like:
1 +
or:
border:
without crashing or failing the parse.
static incompleteParseThreshold = 2;

This defines the minimum number of shape elements that must be matched for the parse to be considered valid.
class $EXAMPLE extends $AST {
  static allowIncompleteParse = true;
  static incompleteParseThreshold = 2;
  static SHAPE = new Shape($A, $B, $C, $D);
}

- If $A, $B, $C, and $D all match: ✅ accepted
- If only $A and $B match: ✅ accepted
- If only $A matches: ❌ rejected (threshold not met)
This is especially useful for deeply nested or long shapes where partial progress is still meaningful.
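The acceptance rule for the $EXAMPLE case can be modeled as follows. This is a simplified sketch that assumes a shape is just a list of literal strings; matchShape and its option names are hypothetical and do not reflect Panda Parse internals.

```javascript
// Hypothetical model: a shape is a list of literal strings matched in order.
function matchShape(parts, input, { allowIncomplete = false, threshold = 1 } = {}) {
  const matched = [];
  let cursor = 0;
  for (const part of parts) {
    if (!input.startsWith(part, cursor)) break; // stop at the first failure
    matched.push(part);
    cursor += part.length;
  }
  const complete = matched.length === parts.length;
  if (complete) return { matched, complete };
  // A partial match is kept only when enabled and enough parts succeeded.
  if (allowIncomplete && matched.length >= threshold) return { matched, complete };
  return null;
}

const shape = ["a", "b", "c", "d"];
const options = { allowIncomplete: true, threshold: 2 };

console.log(JSON.stringify(matchShape(shape, "abcd", options)));
// {"matched":["a","b","c","d"],"complete":true}
console.log(JSON.stringify(matchShape(shape, "ab", options)));
// {"matched":["a","b"],"complete":false}
console.log(matchShape(shape, "a", options)); // null (threshold not met)
console.log(matchShape(shape, "ab"));         // null (partial matches disabled)
```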
This system is great for:
- Live feedback while typing
- Graceful fallback on broken code
- Building resilient parsers for editors
- Supporting incomplete input without special cases
You can combine this with .fallbackToFirstExp for even more intelligent error handling or graceful degradation.
static fallbackToFirstExp = true;

This tells Panda Parse:
“If this node fails to match fully, return the first successfully parsed subcomponent instead.”
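One way to picture this fallback, again with literal strings standing in for shape elements: when the full shape fails, hand back only the first piece that did match. The function name, option name, and result layout below are assumptions for illustration, not the library's API.

```javascript
// Hypothetical model of "fall back to the first sub-match on failure".
function matchWithFallback(parts, input, { fallbackToFirst = false } = {}) {
  const matched = [];
  let cursor = 0;
  for (const part of parts) {
    if (!input.startsWith(part, cursor)) break;
    matched.push(part);
    cursor += part.length;
  }
  if (matched.length === parts.length) return { kind: "full", matched };
  // On failure, optionally return just the first successful sub-match.
  if (fallbackToFirst && matched.length > 0) return { kind: "fallback", first: matched[0] };
  return null;
}

const addShape = ["1", "+", "2"];

console.log(JSON.stringify(matchWithFallback(addShape, "1+2", { fallbackToFirst: true })));
// {"kind":"full","matched":["1","+","2"]}
console.log(JSON.stringify(matchWithFallback(addShape, "1*2", { fallbackToFirst: true })));
// {"kind":"fallback","first":"1"}
console.log(matchWithFallback(addShape, "1*2")); // null
```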
$AST is the base class for all syntax tree nodes in Panda Parse. You extend it to define new language constructs and parsing rules using declarative SHAPE definitions.
class $NUMBER extends $AST {
  static SHAPE = new Shape(/^\d+/);
}

Then you can parse using:

const ast = $NUMBER.parse(new Lexer("42"));

Identifies this class as a valid AST node.
static SHAPE: Defines the grammar rule for this node using a Shape object.
static allowIncompleteParse: Allows the node to match partially parsed inputs (see the partial parsing section above).
static incompleteParseThreshold: Minimum number of successful components required when allowIncompleteParse is enabled.
static fallbackToFirstExp: If the node fails to fully parse, fall back to the first successfully parsed expression.
new MyAST({ exps, ...rest });

Called internally by .parse() to construct a node with child expressions.
- exps: array of parsed sub-expressions (ASTs or Tokens)
- Any other fields passed via ...rest are stored on the instance
exps: All expressions (both ASTs and Tokens) parsed by the shape.
contentExps: Filtered version of exps that includes only:
- AST nodes
- Tokens that are not whitespace
All tokens (flat array), including whitespace and those nested in child ASTs.
Only non-whitespace tokens.
Only whitespace tokens.
text: The full matched text string from all tokens.
The absolute start and end character offsets of the AST on the original input line.
The zero-based line index of the first token.
The column position (in the line) of the first token.
Returns all visible tokens within a given line range, including metadata for highlighting.
Parses a node from a given Lexer instance.
Returns:
- An instance of the AST subclass
- null if parsing fails
Internally, it iterates over the class's SHAPE, collecting tokens or nested ASTs.
Handles:
- fallback to first expression (if enabled)
- incomplete parse tokens (when allowIncompleteParse is set)
- token-level caching and cursor restoration
The Lexer is responsible for turning a raw string into a stream of tokens. It provides the foundational input mechanism for parsing in Panda Parse. Each AST node uses the lexer to inspect, match, and consume parts of the input string.
const lexer = new Lexer(str);

str (string): the input string to tokenize and parse.

const lexer = new Lexer("42 + 7");

str: The full original input string.
cursor: The current position (index) in the input string.
Returns true if there’s more text to parse (i.e. cursor < str.length).
Returns everything that has been parsed so far:
lexer.parsedStr; // str.slice(0, cursor)

Returns the remaining unparsed string:

lexer.unparsedStr; // str.slice(cursor)

Saves the current cursor position to a stack.
Restores the last saved cursor position from the stack.
Use this to backtrack safely during complex parsing logic.
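The save-and-restore pattern can be sketched with a tiny stand-in class. Everything here is hypothetical: CursorStack, save, restore, and this simplified eat are illustrative names, since the actual method names are not shown in this section.

```javascript
// Hypothetical mini-lexer: only the cursor stack is modeled here.
class CursorStack {
  constructor(str) {
    this.str = str;
    this.cursor = 0;
    this.saved = [];
  }
  // Push the current position so it can be returned to later.
  save() {
    this.saved.push(this.cursor);
  }
  // Pop the most recently saved position and rewind to it.
  restore() {
    if (this.saved.length > 0) this.cursor = this.saved.pop();
  }
  // Consume `text` if it appears at the cursor; report success.
  eat(text) {
    if (!this.str.startsWith(text, this.cursor)) return false;
    this.cursor += text.length;
    return true;
  }
}

const lx = new CursorStack("1+2");
lx.save();                             // remember position 0
const ok = lx.eat("1") && lx.eat("*"); // the second step fails at position 1
if (!ok) lx.restore();                 // rewind to position 0
console.log(lx.cursor); // 0
```

After the rewind, a different alternative can be attempted from the same starting point, which is the essence of backtracking.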
Returns the current line index (zero-based) based on cursor position.
Returns the column number (character offset in the current line).
Returns the absolute start index of the given line.
Returns the absolute end index of the given line.
Returns the number of leading spaces in the given line.
Returns indentation level of the current line.
Convenient versions of the above, but for the current line.
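For readers curious how line, column, and indentation can be derived from a cursor offset, here is a minimal sketch. The helper names are hypothetical, and it assumes zero-based lines and columns with "\n" as the only line separator; the library's own conventions may differ.

```javascript
// Hypothetical helper: derive zero-based line and column from a cursor offset.
function lineAndColumn(str, cursor) {
  const before = str.slice(0, cursor);
  const line = before.split("\n").length - 1;     // newlines seen so far
  const lineStart = before.lastIndexOf("\n") + 1; // 0 when on the first line
  return { line, col: cursor - lineStart, lineStart };
}

// Number of leading spaces on the line that starts at `lineStart`.
function indentOf(str, lineStart) {
  return /^ */.exec(str.slice(lineStart))[0].length;
}

const src = "a = 1\n  b = 2";
const pos = lineAndColumn(src, 10); // offset of "=" on the second line

console.log(JSON.stringify(pos));          // {"line":1,"col":4,"lineStart":6}
console.log(indentOf(src, pos.lineStart)); // 2
```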
Retrieves a previously stored cached result by key.
Stores a result at a given position with a custom name.
Useful for memoizing results in recursive or repeated patterns.
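Memoizing by name and position is the idea behind packrat-style parsing. The sketch below shows that idea in isolation; ParseCache, its get and set methods, and parseNumber are hypothetical and are not the Lexer's actual caching API.

```javascript
// Hypothetical cache keyed by "name@position".
class ParseCache {
  constructor() {
    this.entries = new Map();
  }
  key(name, position) {
    return `${name}@${position}`;
  }
  get(name, position) {
    return this.entries.get(this.key(name, position));
  }
  set(name, position, result) {
    this.entries.set(this.key(name, position), result);
    return result;
  }
}

let attempts = 0;
const cache = new ParseCache();

// Match digits at `position`, reusing an earlier answer when one exists.
// Failed attempts are cached too (as null), so they are not retried.
function parseNumber(input, position) {
  const hit = cache.get("number", position);
  if (hit !== undefined) return hit;
  attempts++;
  const m = /^\d+/.exec(input.slice(position));
  return cache.set("number", position, m ? m[0] : null);
}

parseNumber("42+7", 0);
parseNumber("42+7", 0); // served from the cache
console.log(attempts);  // 1
```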
Simulates matching the given pattern without consuming it.
- pattern can be a string or RegExp.
- Advances an internal tasteCursor if matched.
- Returns: { value } if successful, null if not.
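The difference between a lookahead and a consuming match comes down to whether the main cursor moves. The following stand-in class illustrates that contrast only; MiniLexer and matchAt are hypothetical, the internal tasteCursor is not modeled, and RegExp patterns are assumed to be non-global.

```javascript
// Hypothetical mini-lexer contrasting lookahead with consumption.
class MiniLexer {
  constructor(str) {
    this.str = str;
    this.cursor = 0;
  }
  // Shared matching logic: returns the text matched at `from`, or null.
  matchAt(pattern, from) {
    const rest = this.str.slice(from);
    if (typeof pattern === "string") return rest.startsWith(pattern) ? pattern : null;
    const m = pattern.exec(rest);
    return m && m.index === 0 ? m[0] : null;
  }
  // Lookahead: reports a match but leaves `cursor` untouched.
  taste(pattern) {
    const value = this.matchAt(pattern, this.cursor);
    return value === null ? null : { value };
  }
  // Consumption: reports a match and advances `cursor` past it.
  eat(pattern) {
    const value = this.matchAt(pattern, this.cursor);
    if (value === null) return null;
    const start = this.cursor;
    this.cursor += value.length;
    return { value, start, end: this.cursor };
  }
}

const mini = new MiniLexer("hello world");
console.log(JSON.stringify(mini.taste("hello"))); // {"value":"hello"}
console.log(mini.cursor);                         // 0
console.log(JSON.stringify(mini.eat("hello")));   // {"value":"hello","start":0,"end":5}
console.log(mini.cursor);                         // 5
console.log(mini.eat("world"));                   // null
```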
Attempts to match and consume the given pattern from the input.
- Returns a Token if successful.
- Advances the main cursor.
- Returns null if the pattern doesn't match.
const lexer = new Lexer("hello world");
lexer.eat("hello"); // ✅ matches
lexer.eat("world"); // ❌ fails — cursor is now after "hello"
lexer.eat(/\s+/); // ✅ matches the space
lexer.eat("world"); // ✅ now matches

Returns true if x is a valid lexing target (a string or RegExp).
Returns the line numbers that intersect with a character range.
When eat() successfully matches, it returns a Token object with:
{
  type, // the pattern used to match (string or RegExp)
  value, // the matched string
  start,
  end, // character positions
  line,
  col, // line/column position info
  indent, // indentation level of line
  paddingLeft,
  paddingRight, // reserved for future styling
}

The Lexer provides:
- Cursor-based string scanning
- Line and column tracking
- RegExp and literal matching
- Optional lookahead (taste) and consumption (eat)
- Memoization through caching
- Precise token-level control for building ASTs
It’s the foundation for the Panda Parse parsing pipeline.