Some things that crash or give the wrong result #4
Comments
@NicholasSterling, please create a pull request against the main branch.
Done. The PR does not attempt to fix any of the errors; it just suggests simplifications. By the way, the lexer converts the UTF-8 String to a Vec<char> and then processes the characters one at a time from the beginning, removing each character as it goes. Removing a character from the front of a Vec requires shifting all the remaining characters, so processing an N-character input takes O(N²) time.
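To see why, here is a sketch of the pattern (not the actual lisp-rs code; `input` is a stand-in name):

```rust
// Vec::remove(0) shifts every remaining element left, so each call
// costs O(N), and removing all N characters costs O(N^2) overall.
let mut chars: Vec<char> = input.chars().collect();
while !chars.is_empty() {
    let c = chars.remove(0); // O(N) per call
    // ... lex using c ...
}
```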
@vishpat I am pretty sure you can fix the O(N²) lexing by getting an Iterator over the input string and calling peekable() on it, so that you can peek at the next element. I could do this for you, but since my other PR is just sitting there and you are making further changes to the lexer, I'll wait for you to resolve all of that. If you decide to do this yourself, I'm suggesting getting rid of the Vec in tokenize() and doing
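something along these lines (a sketch; `input` stands for whatever &str tokenize() receives):

```rust
let mut chars = input.chars().peekable();
```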
Then when you need characters, you can do
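something like the following; next() consumes a character, while peek() only looks ahead:

```rust
let c = chars.next();       // Option<char>: consume the next character
let ahead = chars.peek();   // Option<&char>: look ahead without consuming
```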
If you want me to do it, just give me the all-clear when the lexer dust has settled.
@NicholasSterling consider the lexer dust settled. Please open a PR.
@vishpat OK, thanks -- will do so by the end of the weekend. |
OK, that was too optimistic. I'll give an update in a few days. |
I'm overdue for an update. I decided to do something more sophisticated, with a few additional goals in mind.
I looked on crates.io for something that would help. Tokenator seemed promising but a bit clumsy to use, so I decided to take a different approach: I have been working on a new crate that helps one write tokenizers. It appears to be working, but I need to do more testing.
@vishpat After much thrashing over the API, the tokenizer crate I have been working on (token-iter) is ready. If you are OK with it, I would like to make lisp-rs use it, let you check it out, and then publish the crate. If you would rather not use a separate crate for tokenization, I understand, but please let me know either way. https://github.com/NicholasSterling/token-iter/blob/main/src/lib.rs
@vishpat Have you had a chance to look at the token-iter crate? Are you OK with depending on an external crate for lexical analysis? |
I love this project, and very much appreciate the effort that went into it. Kudos!
I'm not sure whether this is intended for actual use or just a simple teaching exercise, but I thought I'd mention a few things that failed (for some definition of "fail", e.g. crash or do something unexpected).
- `12` -- expects an outer set of parens
- `(4)` -- evals to `(4)`, not `4`; `(+ 3 1)` evals to `4`
- `(print "foo)` -- prints `foo)`
- `(")` -- evals to `())`
Are you open to pull requests? There are also a few things that could be simplified.