Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve error for parsing from a nonexistent lhs #128

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

steinarvk
Copy link

I ran into this when I renamed a production rule. There was previously no specific error message, it just kept going until it hit a not-very-helpful exception in the guts of the parsing code.

@Engelberg
Copy link
Owner

This is a good idea. Right now, if you create a parser where a non-terminal on the right-hand side isn't on the left, this is flagged as an error (at parser creation time) by the following helper function, which is called on all the different ways to create a parser:
https://github.com/Engelberg/instaparse/blob/master/src/instaparse/cfg.clj#L246

So that means this is only an issue when the user explicitly selects a new :start production and it is invalid. Instaparse allows you to declare a custom :start production either when creating the parser, or when doing a parse. Your version picks up both scenarios, but at parse time. The downside of this is that when you create the faulty :start production when creating the parser, you're not getting immediate feedback, also you're paying the (small) penalty of checking the start production in the most common case, which is that the production is the default lhs of the first production and is therefore known to be valid.

For this reason, I'm thinking the check should occur wherever we process the :start production.

So that would mean, calling the check on the built parser in parser (if a :start production was specified), and calling the check on the grammar map and the :start production in parse and parses (if a :start production was specified).

Also, when you check to determine that the start is among the left-hand side, it might be faster to do (some #{start} (keys grammar-map)) because this short-circuits as soon as you find the start (and we're expecting it to most likely be in there), versus building a set which necessarily means traversing all the keys, just to test against the set once.

If you're not in a hurry to have this changed, I'm willing to write it this way next time I have a chance to work on instaparse.

@steinarvk
Copy link
Author

I'd be happy to try to adjust the patch to your comments, they sound sensible. You obviously know the design better.

Would you be willing to commit a file called COPYING or LICENSE file in the root directory of the repo containing the text of the EPL? That'd make it easier for me to contribute. E.g. like leiningen: https://github.com/technomancy/leiningen/blob/master/COPYING

@steinarvk
Copy link
Author

Hey, I've got an adjusted patch for you to review that essentially does what you described.

Unfortunately, I'm being disallowed from contributing unless a LICENSE file is added to the project with the text of the EPL as described above.

I dislike putting myself in the rather ridiculous position of coming in as a random outsider and holding a trivial ten-line error message patch hostage over your project conforming to a third party's ideas about naming of licensing files, but these are games some legal departments apparently like to play. Sorry!

@aengelberg
Copy link
Collaborator

@steinarvk There is now a LICENSE file in this repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants