An alternative symbol tokenizer and lexer using an Aho-Corasick automaton #62
Comments
@pombredanne is there a straightforward way to use your tokenizer with boolean.py? I checked the code but didn't understand how to use it. My most important requirement is to support hyphens ("-") in symbol names, e.g. …
I don't know if you looked at the … I don't expect you can use it directly with the built-in classes in …
(Yes, it's one of those packages with a ton of code hidden in its __init__.py. 10 demerits to @pombredanne for that. 😉 )
I realized later on that I could specify the list of extra characters
accepted in tokens, and that was enough for my needs, luckily.
On Wed 22. May 2024 at 1.04, Frank Dana wrote:
(Yes, it's one of those packages with a ton of code hidden in its
__init__.py. 10 demerits to @pombredanne for that. 😉 )
This is not a bug but a suggestion. I implemented an alternative expression tokenizer using an Aho-Corasick automaton. It lexes against a list of known symbols, which makes it possible to resolve expressions that would otherwise be ambiguous. It may help other boolean.py users.
You can find it here: https://github.com/nexB/license-expression/tree/56709dc901c97a3722247d4e4158e95594fd69a1/src/license_expression
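For readers who want the gist without digging through the repo, here is a minimal, self-contained sketch of the idea. It is not the code from the link above and does not use the boolean.py API: the function names (`build_automaton`, `find_all`, `tokenize`), the token kinds, and the sample symbols are all made up for illustration. It builds an Aho-Corasick automaton over the known symbols plus the operators, scans the expression once, and keeps the leftmost-longest non-overlapping matches, so a symbol containing spaces or hyphens wins over its shorter prefixes. It does no word-boundary checking, so a real tokenizer would need more care.

```python
from collections import deque


def build_automaton(keywords):
    """Build Aho-Corasick tables: goto transitions, failure links, outputs."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state to fall back to on a mismatch
    out = [[]]    # out[state] -> keywords that end at this state
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)
    # Breadth-first pass: compute failure links level by level.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keywords that end at the fallback state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, automaton):
    """Return every (start, end, keyword) occurrence in a single pass."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            matches.append((i - len(word) + 1, i + 1, word))
    return matches


def tokenize(expression, symbols, operators=("and", "or", "not", "(", ")")):
    """Split an expression into (kind, text) tokens using known symbols."""
    automaton = build_automaton(list(symbols) + list(operators))
    text = expression.lower()
    # Leftmost match first; among matches at the same start, longest first.
    matches = sorted(find_all(text, automaton),
                     key=lambda m: (m[0], -(m[1] - m[0])))
    tokens = []
    pos = 0
    for start, end, word in matches:
        if start < pos:
            continue  # overlaps a match that was already accepted
        gap = text[pos:start].strip()
        if gap:
            tokens.append(("UNKNOWN", gap))
        kind = "OPERATOR" if word in operators else "SYMBOL"
        tokens.append((kind, word))
        pos = end
    tail = text[pos:].strip()
    if tail:
        tokens.append(("UNKNOWN", tail))
    return tokens


# Hypothetical symbol list, for illustration only.
KNOWN = ["gpl-2.0", "gpl-2.0 with classpath exception", "mit"]
print(tokenize("GPL-2.0 with Classpath exception or MIT", KNOWN))
```

The point of matching against known symbols is visible in the example: "gpl-2.0 with classpath exception" comes out as one symbol rather than being split at the spaces or cut short at "gpl-2.0".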