refactor: systematize and expose the parsing configuration #3423

gwhitney · 2025-03-18T21:19:16Z

To help with evaluating #3420, I thought it wisest to provide a taste of the refactor, so I am opening the corresponding pull request "early" and marking it as a work in progress. So far, this is primarily a refactor to systematize and clarify the tokenizing code. It does, however, also expose the list of token scanners as a configuration property on the parse function, so it already enables additional functionality: for example, adding a token type like #FF0080 for color constants (our use case at the moment; I've put a whole demo of this facility in parse.test.js). The remaining idea is to do the same for the parse table. In particular, the list of delimiters would come from the big table in operators.js rather than being redundantly repeated, and the table would be decorated with the parsing functions for the corresponding level of precedence. That would make it easy to add new operators at either existing or new levels of precedence (by adding to one of the entries in the operators table, or by splicing in a new entry between existing ones). If you added a // operator, to take an example I am working on, you would only need to add it to the operators table, without having to remember to add it to some list of delimiters as well.
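To make the operators-table idea concrete, here is a minimal, self-contained sketch. It does not use the actual structure of operators.js: the table shape, the level names, the helpers `delimitersFrom`, `addOperator`, and `insertLevelBefore`, and the function names attached to each operator are all hypothetical stand-ins. It only illustrates deriving the delimiter list from a single precedence table, so that adding an operator (or splicing in a new precedence level) requires a change in one place.

```javascript
// Hypothetical, simplified precedence table (lowest precedence first).
// This is NOT the real layout of mathjs's operators.js.
const operatorTable = [
  { name: 'addSubtract', operators: { '+': 'add', '-': 'subtract' } },
  { name: 'multiplyDivide', operators: { '*': 'multiply', '/': 'divide' } },
  { name: 'power', operators: { '^': 'pow' } }
]

// Derive the set of delimiters from the table instead of keeping a
// separate, redundant list that has to be updated by hand.
function delimitersFrom (table) {
  return new Set(table.flatMap(level => Object.keys(level.operators)))
}

// Add an operator to an existing precedence level.
function addOperator (table, levelName, symbol, fnName) {
  const level = table.find(entry => entry.name === levelName)
  if (!level) throw new Error(`No precedence level named ${levelName}`)
  level.operators[symbol] = fnName
}

// Splice a new precedence level in just before an existing one.
function insertLevelBefore (table, existingName, newLevel) {
  const index = table.findIndex(entry => entry.name === existingName)
  if (index < 0) throw new Error(`No precedence level named ${existingName}`)
  table.splice(index, 0, newLevel)
}

// A `//` operator at an existing level; 'placeholderSlashSlash' is a
// made-up name, since the meaning of `//` is not specified here.
addOperator(operatorTable, 'multiplyDivide', '//', 'placeholderSlashSlash')

// A made-up operator at a brand-new level between multiplyDivide and power.
insertLevelBefore(operatorTable, 'power', {
  name: 'customLevel',
  operators: { '%%': 'placeholderCustomOp' }
})

// The delimiter set picks up both additions automatically.
const delimiters = delimitersFrom(operatorTable)
console.log([...delimiters])
```

In this sketch the tokenizer would consult `delimiters` and the parser would walk `operatorTable` level by level; attaching a parsing function to each level, as proposed above, would be a further field on each entry.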

So that's the idea. I will pause here so you have a chance to look at the PR so far and give feedback, and to encourage or discourage continuing in this vein. I think you will find, for example, that the tokenization code is now much more transparent, since it has been factored into several much smaller scanner functions that are simply run in sequence, each looking for one particular type of token.
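For readers who have not opened the diff, the scanner approach can be sketched roughly as follows. This is an illustration only, not the code in this PR: the names `tokenize`, `defaultScanners`, and `scanColor`, the token shapes, and the way the scanner list is passed in are all assumptions made for the example. Each scanner looks for one kind of token at the current position and returns `null` if it does not match; the tokenizer tries them in order.

```javascript
// Hypothetical sketch; none of these names are taken from mathjs.
// Each scanner checks the input at `pos` for one token type and
// returns a token object, or null if that type does not start there.
function scanWhitespace (text, pos) {
  const m = /^\s+/.exec(text.slice(pos))
  return m ? { type: 'SPACE', value: m[0] } : null
}

function scanNumber (text, pos) {
  const m = /^\d+(\.\d+)?/.exec(text.slice(pos))
  return m ? { type: 'NUMBER', value: m[0] } : null
}

function scanSymbol (text, pos) {
  const m = /^[A-Za-z_]\w*/.exec(text.slice(pos))
  return m ? { type: 'SYMBOL', value: m[0] } : null
}

function scanDelimiter (text, pos) {
  const m = /^[-+*\/()]/.exec(text.slice(pos))
  return m ? { type: 'DELIMITER', value: m[0] } : null
}

const defaultScanners = [scanWhitespace, scanNumber, scanSymbol, scanDelimiter]

// Run the scanners in sequence at each position; the first match wins.
function tokenize (text, scanners = defaultScanners) {
  const tokens = []
  let pos = 0
  while (pos < text.length) {
    let token = null
    for (const scan of scanners) {
      token = scan(text, pos)
      if (token) break
    }
    if (!token) {
      throw new SyntaxError(`Unexpected character at position ${pos}`)
    }
    if (token.type !== 'SPACE') tokens.push(token) // whitespace is skipped
    pos += token.value.length
  }
  return tokens
}

// A custom scanner for six-digit hex color constants such as #FF0080.
function scanColor (text, pos) {
  const m = /^#[0-9A-Fa-f]{6}/.exec(text.slice(pos))
  return m ? { type: 'COLOR', value: m[0] } : null
}

// Extending the tokenizer is just a matter of extending the scanner list.
const colorTokens = tokenize('#FF0080 + 2', [scanColor, ...defaultScanners])
console.log(colorTokens.map(t => `${t.type}:${t.value}`).join(' '))
```

With the default scanner list the same input is rejected at the `#`, which is the point of exposing the list as configuration: a new token type needs one small function and no changes to the main loop.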

(I guess what I would really like to do is rewrite the parser in some popular parser grammar package, but that would be beyond the level of attention I can devote, for now anyway.)

Resolves #3422.

Including a test that implements a custom token type `#HEXHEX` that could be used for color constants.

gwhitney · 2025-03-19T16:41:55Z

OK, rebased after #3425, which was a localized fix for #3422.

gwhitney · 2025-03-19T19:54:12Z

Amusingly, I just noticed that GitHub automatically put a color dot after the example notation I used, which supports the idea that it's good to enable such a syntax extension. I do think there might be some value in our project publishing our Chroma and number-theory extensions to mathjs as add-on packages, if you have any thoughts or ideas about where and how to publish such things most effectively. (And in fact, if we were to set up such a collection of add-ons, we might well want to break out some of the existing groups, like statistics or sets, as separate add-ons, to reduce the "monolithicness" of mathjs.)

gwhitney marked this pull request as draft March 18, 2025 21:23

This was referenced Mar 19, 2025

variable named E blocks .-operators #3422

Closed

Surprising parse results from binary and octal constants #3421

Open

fix: #3422 parse dot operators after an implicit multiplication with symbol E #3425

Merged

gwhitney added 7 commits March 19, 2025 09:36

refactor: make ParseState its own class in its own file

dd2963e

refactor: remove NAMED_DELIMITERS to prep for using properties

5f2426c

refactor: fold reading comments and skipping characters into ParseState

692ebf4

doc/test: Document options and add tests

4da330e

Including a test that implements a custom token type `#HEXHEX` that could be used for color constants.

doc: comment the token scanner object where it appears

22e83d5

test: add test for josdejong#3422

35c9518

test: consolidate after rebasing on josdejong#3425

6dbd02f

gwhitney force-pushed the refactor/parse_config branch from 7ee5387 to 6dbd02f Compare March 19, 2025 16:41

gwhitney mentioned this pull request Mar 19, 2025

Tests for examples in embedded docs #3413

Open

gwhitney mentioned this pull request Mar 19, 2025

Evaluating 'help(not)' throws SyntaxError #1905

Open
