Can ANTLR meet these requirements? #4006
Unanswered
The-Futurist
asked this question in
Q&A
Replies: 1 comment
This is related to an earlier discussion last year, here: #3321
Since then I've done more research and started to define the grammar informally. It is certainly based on the PL/I grammar (Subset G, ANSI X3.74-1987; the standard is a very high-quality document but sadly isn't freely available, though I have a printed copy). That last discussion left me a bit confused, because others raised some issues as they explored Rexx, so I never quite got clarity on whether ANTLR can parse this or not.
Here's an example of some simple legal PL/I code that PL/I compilers can parse:
And so on; this gives the idea. If that sample can be parsed, that would likely show that ANTLR can do this.
Question 1 - Lexical Analysis
At the lexical level, I'm interested in supporting numeric literals that can contain separator spaces (not simply underscores, as is common in many languages).
For example, these tokens would look like the following, starting with trivial cases that are simply regular:
I've scraped together code (a hand-crafted, FSM-based lexer) that can recognize these, but it's a little messy and I'd rather not end up maintaining messy code.
Cases like "DEF ABC:h" present a challenge because that scans into <Identifier> <Identifier> <Colon> <Identifier>, and that must eventually be converted to a single <NumericLiteral> with, say, a lexeme of "DEFABC:h".
I have added a "hacked" layer to the tokenizer that accumulates tokens while looking for the pattern of the literal. If it sees that pattern, it discards the accumulated tokens and creates and returns the desired <NumericLiteral> token; otherwise it just returns the original token sequence.
If ANTLR has the power to represent this kind of thing, I will likely want to adopt it for the lexical analysis at least.
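To make the token-merging layer described above concrete, here is a minimal sketch in Python. It is not the actual lexer from this post. The `Token` type, the token kind names, and the rule that a literal is "one or more groups of hex digits, then a colon, then a base suffix" are all assumptions chosen to fit the single "DEF ABC:h" example; only the `h` suffix is taken from the post.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str  # hypothetical kinds: "Identifier", "Integer", "Colon", "NumericLiteral", ...
    text: str


HEX_DIGITS = set("0123456789abcdefABCDEF")
BASE_SUFFIXES = {"h"}  # assumption: only the ":h" suffix appears in the post


def is_digit_group(tok: Token) -> bool:
    """True if the token could be one space-separated group of hex digits."""
    return tok.kind in ("Identifier", "Integer") and all(
        c in HEX_DIGITS for c in tok.text
    )


def merge_numeric_literals(tokens: List[Token]) -> List[Token]:
    """Collapse <group>... <Colon> <suffix> runs into one NumericLiteral token."""
    out: List[Token] = []
    i = 0
    while i < len(tokens):
        # Accumulate a run of candidate digit groups starting at i.
        j = i
        while j < len(tokens) and is_digit_group(tokens[j]):
            j += 1
        matched = (
            j > i
            and j + 1 < len(tokens)
            and tokens[j].kind == "Colon"
            and tokens[j + 1].kind == "Identifier"
            and tokens[j + 1].text.lower() in BASE_SUFFIXES
        )
        if matched:
            # Discard the accumulated tokens and emit a single literal.
            digits = "".join(t.text for t in tokens[i:j])
            out.append(Token("NumericLiteral", digits + ":" + tokens[j + 1].text))
            i = j + 2
        else:
            # No literal here: pass the original token through unchanged.
            out.append(tokens[i])
            i += 1
    return out


scanned = [
    Token("Identifier", "DEF"),
    Token("Identifier", "ABC"),
    Token("Colon", ":"),
    Token("Identifier", "h"),
]
merged = merge_numeric_literals(scanned)
```

One caveat with this simplified pattern: a labelled statement such as "ABC: h = 1;" produces the same token shape, so a real implementation would need extra context to tell a label from a literal.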
Question 2 - Grammar
A separate question: can ANTLR handle a language grammar that has no reserved words? The way PL/I parsers have done this is by having a parser that:
<reference>=<expression>).
I've written such a parser in the past, a hand-crafted recursive descent one, but I have no idea whether such a grammar can be represented in systems like ANTLR.
The parser doesn't see (nor does the lexer generate) "keyword tokens". All keywords are returned as <Identifier> tokens with a bool property stating whether the identifier is also a keyword. As it parses, the parser then treats each identifier either as a true identifier (used in expressions and so on) or as an actual keyword; that decision is contextual.
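The contextual decision can be sketched roughly as follows, in Python. This is an illustration only, not the parser I wrote. The `Tok` type, the kind names, and the helper functions are hypothetical, and the rule shown (try the <reference>=<expression> shape first, and only then treat a leading identifier as a keyword) is a simplification.

```python
from typing import List, NamedTuple


class Tok(NamedTuple):
    kind: str                 # hypothetical kinds: "Identifier", "Equals", "LParen", "RParen", ...
    text: str
    is_keyword: bool = False  # set by the lexer; the token kind stays "Identifier"


def skip_reference(toks: List[Tok], i: int) -> int:
    """Return the index just past `Identifier [ '(' ... ')' ]`, or -1 if absent."""
    if i >= len(toks) or toks[i].kind != "Identifier":
        return -1
    i += 1
    if i < len(toks) and toks[i].kind == "LParen":
        depth = 0
        while i < len(toks):
            if toks[i].kind == "LParen":
                depth += 1
            elif toks[i].kind == "RParen":
                depth -= 1
                if depth == 0:
                    return i + 1  # just past the matching ')'
            i += 1
        return -1  # unbalanced parentheses
    return i


def classify_statement(toks: List[Tok]) -> str:
    """Decide whether a statement is an assignment or a keyword statement."""
    end = skip_reference(toks, 0)
    # Assignment shape wins, even when the reference is spelled like a keyword.
    if end != -1 and end < len(toks) and toks[end].kind == "Equals":
        return "assignment"
    if toks and toks[0].kind == "Identifier" and toks[0].is_keyword:
        return "keyword:" + toks[0].text.upper()
    return "unknown"


# IF = THEN;   (an assignment to a variable named IF)
stmt1 = [Tok("Identifier", "IF", True), Tok("Equals", "="),
         Tok("Identifier", "THEN", True), Tok("Semicolon", ";")]
# IF A = B THEN ...   (a real IF statement)
stmt2 = [Tok("Identifier", "IF", True), Tok("Identifier", "A"), Tok("Equals", "="),
         Tok("Identifier", "B"), Tok("Identifier", "THEN", True)]
kinds = [classify_statement(stmt1), classify_statement(stmt2)]
```

This sketch would misclassify something like "IF (A) = B THEN ..." as an assignment, so a fuller version needs more lookahead (for example, scanning for THEN outside parentheses). Whether ANTLR can express this kind of decision directly in a grammar is exactly what I'm asking.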