Proposed parsing refactor #3420
Comments
OK, after diving into an actual refactor, I realized/remembered that there are already quite a few parsing configuration settings (many of them to do with tokenization, actually) exposed as properties of the [...] With a number of new proposed configuration parameters in the works here and in #3374, these considerations lead to the following possible organization alternatives for all of the configuration parameters, which do seem to fall into two families: those for parsing (e.g., existing parse.isAlpha, config.number, and the proposed operators table) and those for computation (e.g., existing config.relTol, config.predictable, and the proposed config.numberApproximate).
As it is the most similar to the current state of the world, I will pursue (3) for now, even though it's a bit inconsistent as far as the mechanisms for configuring parsing and computation go, and I don't at the moment see how to do it without a breaking change. But if you prefer any of the other options or some other configuration scheme, please just let me know and I will be happy to switch gears.
Describe the concerns
There are a number of outstanding parsing/evaluation issues, such as #2631, which it would be convenient to work on in conjunction with #3374 (in particular the implementation of the `quotient` operator to be represented also by infix `//` à la Python -- that needs some specialized TeX representation, and right now it is tricky to make that TeX handling uniform between `quotient(a,b)` and `a // b`). These items generally point to unifying FunctionNode and OperatorNode -- so I am planning to go ahead with that, presuming you are still willing, and I was thinking of calling the unified node a CallNode. (It's easiest to use a third name to make sure all instances of each have been handled in the unification.)

Anyhow, working on the unification of FunctionNode and OperatorNode led me to realize that there is currently redundancy between the big table in src/expression/operators.js and a number of small tables in src/expression/parse.js, such as the one in the (somewhat awkwardly named) function parseMultiplyDivideModulusPercentage. In addition, although there is a mechanism for adding custom nodes to the parser, it is not particularly well documented, and it is limited in scope: basically the custom nodes can only be created by syntax that looks like a function call. Thus, for example, in our color computations, we would like to make expressions like `#FF80C0` be color constants; this doesn't really interfere with using the `#` for comments, because you just need to avoid having 6 or 8 hex digits immediately after the `#` for a comment, and it reads very naturally. But right now there is no way to extend the parser to handle such things.

Proposal
[...] `operators`, directly in the `config` object? Or should we have a separate exposed `parseConfig` object, parallel to `config` that is just concerned with parsing? If so, should `number` and `numberFallback` properties move to `parseConfig` since they really only concern parsing expressions, so that `config` can deal with computation and `parseConfig` with parsing? Note that having the table of operators in the configuration would resolve #2722 (Facilitate defining operator synonyms such as && for 'and', || for 'or'), for example -- someone wanting to parse `&&` as `and` would just need to insert the appropriate entry in the table for logical operators before calling `parse`.

[...] `config` or `parseConfiguration` as well, so that syntax extensions like `#FF80C0` for a color constant can be supported.

[...] `parseAddSubtract`, into the operators.js-style table, with an automatic facility to call the function at one higher level of precedence to get the subexpressions of the current precedence level; this additional refactor would facilitate run-time tweaking of the parser, easing development of parsing improvements and enabling other unforeseen syntax extensions on the part of clients of mathjs, beyond just the current custom nodes facility.

[...] `EXPR if COND else ALTERNATE` to `COND ? EXPR : ALTERNATE`, you could add an entry to the table just before the one for the ternary that recognizes this alternate syntax, but just returns a ConditionalNode.

I am going to embark on these refactors, and let you know how it goes; I welcome feedback in the meantime.

Design question: do you consider the unification of OperatorNode and FunctionNode to CallNode a breaking change in and of itself, because mathjs does document and expose its list of valid node types? If so, would you like this refactor, presuming it is OK with you, to go in two steps, one that doesn't unify the nodes but does make the parser more DRY and configurable, followed by one that does unify the nodes that would have to go into mathjs 15?
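For concreteness, here is a rough sketch of how a table-driven parser along these lines might look. Everything in it is hypothetical and is not the mathjs API: the `levels` and `atoms` tables, the `parse` signature, and the `ColorNode` type are invented for illustration, and tokens are simply whitespace-separated (with no comment handling) to keep it short. It shows one generic routine that handles every binary precedence level by recursing to the next-higher level for its operands, plus a list of custom atom recognizers.

```javascript
// Hypothetical sketch only: none of these names come from mathjs itself.

// Binary operator levels, lowest precedence first.
// Each level maps an operator token to the function name it stands for.
const levels = [
  { name: 'logicalOr', ops: { or: 'or' } },
  { name: 'logicalAnd', ops: { and: 'and' } },
  { name: 'addSubtract', ops: { '+': 'add', '-': 'subtract' } },
  { name: 'multiplyDivide', ops: { '*': 'multiply', '/': 'divide', '//': 'quotient' } }
]

// Custom atom recognizers, tried in order before the built-in atoms.
// This one accepts `#` followed by exactly 6 or 8 hex digits.
const atoms = [
  tok => /^#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{8})$/.test(tok)
    ? { type: 'ColorNode', value: tok }
    : null
]

function parse (text, config = { levels, atoms }) {
  // Simplification: tokens must be separated by whitespace.
  const tokens = text.trim().split(/\s+/)
  let pos = 0

  // Parse left-associative binary operators at level i; operands are
  // obtained by calling the next-higher precedence level.
  function parseLevel (i) {
    if (i === config.levels.length) return parseAtom()
    const { ops } = config.levels[i]
    let node = parseLevel(i + 1)
    while (pos < tokens.length &&
           Object.prototype.hasOwnProperty.call(ops, tokens[pos])) {
      const fn = ops[tokens[pos++]]
      node = { type: 'CallNode', fn, args: [node, parseLevel(i + 1)] }
    }
    return node
  }

  function parseAtom () {
    const tok = tokens[pos++]
    if (tok === undefined) throw new SyntaxError('Unexpected end of expression')
    if (tok === '(') {
      const inner = parseLevel(0)
      if (tokens[pos++] !== ')') throw new SyntaxError('Expected )')
      return inner
    }
    for (const recognize of config.atoms) {
      const node = recognize(tok)
      if (node) return node
    }
    if (/^\d+(\.\d+)?$/.test(tok)) return { type: 'ConstantNode', value: Number(tok) }
    if (/^[A-Za-z_]\w*$/.test(tok)) return { type: 'SymbolNode', name: tok }
    throw new SyntaxError(`Unexpected token ${tok}`)
  }

  const tree = parseLevel(0)
  if (pos < tokens.length) throw new SyntaxError(`Unexpected token ${tokens[pos]}`)
  return tree
}

// Run-time tweak: add `&&` and `||` as synonyms by editing the table.
levels[1].ops['&&'] = 'and'
levels[0].ops['||'] = 'or'

const tree = parse('a && b or #FF80C0')
console.log(JSON.stringify(tree))
```

In this sketch, supporting a new synonym is a one-line table edit made before calling `parse`, and the infix `//` produces the same kind of node as a call to `quotient` would, which is the sort of uniformity the unified CallNode is meant to give.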
Thanks for your thoughts on this.