Improve document about the absence operator #87
First, I'd like to give a very rough translation of the important points of Tanaka Akira's paper.
Complement sets are useful to express C-style comments, CR LF terminated lines,
Regular expressions consist of:
A regular expression engine can be implemented with a DFA or with backtracking.
3.1. Abstract Syntax Tree of regular expressions
3.2. A basic regular expressions engine
A basic implementation in Ruby.
# re: AST of a regex
# str: An array of characters
# pos: Start position
# block: A block executed when it is matched
# (Callback)
def try(re, str, pos, &block)
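The body of `try` is not reproduced above. The following is a minimal sketch of a callback-style backtracking matcher, not the paper's actual code. The name `try_match`, the helper `full_match?`, and the AST node shapes (`:empty`, `:lit`, `:cat`, `:alt`, `:rep`) are assumptions made for illustration.

```ruby
# Minimal backtracking matcher over a tiny regex AST (illustrative sketch).
# The node shapes are assumptions, not taken from the paper:
#   [:empty]         matches the empty string
#   [:lit, "c"]      matches the single character "c"
#   [:cat, r1, r2]   r1 followed by r2
#   [:alt, r1, r2]   r1 or r2
#   [:rep, r]        zero or more repetitions of r (greedy)
#
# str is an array of characters, pos is the start position, and the block
# is called with every end position at which re can match.
def try_match(re, str, pos, &block)
  case re[0]
  when :empty
    yield pos
  when :lit
    yield pos + 1 if pos < str.length && str[pos] == re[1]
  when :cat
    # For each way r1 can match, continue with r2 from where r1 ended.
    try_match(re[1], str, pos) { |mid| try_match(re[2], str, mid, &block) }
  when :alt
    try_match(re[1], str, pos, &block)
    try_match(re[2], str, pos, &block)
  when :rep
    # Try one more iteration first (greedy). Require progress so that a
    # body matching the empty string cannot loop forever.
    try_match(re[1], str, pos) do |mid|
      try_match(re, str, mid, &block) if mid > pos
    end
    yield pos
  else
    raise ArgumentError, "unknown node: #{re[0].inspect}"
  end
end

# True if the whole string matches the AST (hypothetical helper).
def full_match?(re, string)
  chars = string.chars
  try_match(re, chars, 0) { |pos| return true if pos == chars.length }
  false
end

# AST for (a|b)*c
ast = [:cat, [:rep, [:alt, [:lit, "a"], [:lit, "b"]]], [:lit, "c"]]
p full_match?(ast, "abac")  # => true
p full_match?(ast, "abca")  # => false
```

Backtracking here is simply the callback returning without a successful match: control falls back to the caller, which then tries its next alternative.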
If a regex engine uses DFA, complement sets can be handled easily.
(Note: Onigmo actually uses (?~r) instead of !r, because !r is not compatible
This is also described in pp.26-27 of his slide:
7.1. Easy to write
C-style comments can be expressed with the following:
("a" and "b" are used here instead of "/" and "*" to reduce the complexity of
If the absent operator is used, it can be:
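As an illustration only (this is not the expression from the paper), here is how a C-style comment could be matched in Ruby 2.4.1 or later using the real `/` and `*` delimiters rather than the "a"/"b" stand-ins. The constant name `C_COMMENT` and the sample string are assumptions.

```ruby
# Requires Ruby 2.4.1 or later, whose bundled Onigmo supports (?~...).
# (?~\*/) matches any string that does not contain "*/" as a substring.
C_COMMENT = %r{/\*(?~\*/)\*/}

src = "int x; /* first */ int y; /* second */"
p src.scan(C_COMMENT)
# => ["/* first */", "/* second */"]
```

The pattern reads almost like its description: an opening `/*`, then anything that does not contain `*/`, then the closing `*/`.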
7.2. Fit with regular language theory
7.2.1. Repetition of lazy match
This doesn't work well when concatenated with another regex. E.g.
This wrongly matches
this correctly matches
7.2.2. No backtracking
This works well when concatenated with
However, this depends heavily on the backtracking strategy, and it doesn't
7.2.3. Negative lookahead
This works well. Note: I'm not sure whether negative lookahead is regular or not.
7.3. Inefficiency of complement sets
It is possible to implement a negation operator which matches the complement
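To make the comparison in 7.2 concrete, the sketch below concatenates three comment patterns with a trailing `x` and anchors them to the whole string. The patterns, constant names, and sample strings are assumptions for illustration (real `/` and `*` are used instead of "a" and "b"); they are not the examples from the paper.

```ruby
# Requires Ruby 2.4.1 or later for the absence operator (?~...).
LAZY      = %r{\A/\*.*?\*/x\z}m             # lazy repetition
LOOKAHEAD = %r{\A/\*(?:(?!\*/).)*\*/x\z}m   # negative lookahead per character
ABSENCE   = %r{\A/\*(?~\*/)\*/x\z}          # absence operator

one_comment = "/* a */x"        # a single comment followed by "x"
not_comment = "/* a */ b */x"   # "*/" occurs before the final "*/x"

# All three accept a single comment followed by "x".
p [LAZY, LOOKAHEAD, ABSENCE].map { |re| re.match?(one_comment) }
# => [true, true, true]

# The lazy version backtracks past the first "*/" to satisfy the trailing
# "x", so it accepts a string whose "comment" body contains "*/".
p [LAZY, LOOKAHEAD, ABSENCE].map { |re| re.match?(not_comment) }
# => [true, false, false]
```

The lazy quantifier only prefers the shortest match; it does not forbid `*/` inside the body, which is why concatenation changes what it accepts.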
8.1. Ragel
Ragel has the following operators:
8.2. Perl
8.3. Grail
And a partial translation of his slides.
p.26
p.27
BTW, should the name of the operator be "absence operator" instead of "absent operator"?
Good work! I'd help translate them if needed.
I think so. "absent operator" is a confusing name, because it could just as well refer to an operator that is not there. "absence operator" does not have this problem. On the other hand, would you say that it is also a group? In the docs you have put it under "7. Extended groups". If so, it might be more consistent to use the adjective form, as with "passive" or "atomic". I'd still choose clarity over consistency, though.
Thank you for the explanation.
In Akira's paper,
Rename absent operator to absence operator. Add more description.
I have slightly updated the document: 7911409
(Upvoting all comments is meaningless, I think...)
After the release of Ruby 2.4.1, some blog posts about the absent operator were written, and they prompted some discussion.
Blog posts:
Discussions:
Unfortunately, some people might not understand the advantage of the absent operator. One reason may be that the original paper by Tanaka Akira is written in Japanese. Another may be that the documentation of the operator in Onigmo is insufficient.