-
Notifications
You must be signed in to change notification settings - Fork 10
Add spans #119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add spans #119
Conversation
|
neat, I'll try to review this on the weekend. do you have a usecase for this or just did it out of curiosity? |
|
I want to extract complete subtrees of the given html as-is. Additionally, being able to give better error messages with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one comment, i think i found the perf issue.
| } | ||
| } | ||
|
|
||
| fn try_read_string(&mut self, s: &[u8], case_sensitive: bool) -> Result<bool, Self::Error> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since read_until is not implemented, the SpanReader will not perform well for large contiguous blocks of text such as <p>aaaaa.... aaaaa</p>
I wonder if instead of having a span reader that tracks its position, changes to the position could be emitted as "events" similar to start tags, etc:
trait Emitter {
fn advance_position(&mut self, offset: usize) {}
}
if the method is empty, then the compiler should be able to optimize away all overhead
i think it might be simpler, but i am also fine with this one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not find a way to implement read_until for SpanReader since I made it also return &'b Self (to fix some borrowing issues in machine).
is that overhead added even to use without spans? I think this is a real problem (i can't/won't want to pay for spans when i don't need them myself), but not sure why this could happen. after all you mostly just added generics and reader args |
|
I tried a different implementation, based on your suggestion. I'm not sure which approach I like more, what is your opinion on that? Something, which maybe should be changed is to merge The last commit adds a
This are the perf results without spans. |
This comment was marked as outdated.
This comment was marked as outdated.
|
I re-ran the benchmarks and I am not sure how i concluded there is a massive perf regression. forget about my previous comment. i like add-spans-2 more than this PR because the breaking changes to emitter trait are gone with the exception of
I see a 100-instructions regression on data_state_10: from 3109 instructions on main to this on toggleI'd be ok with a PR like add-spans-2. maybe you want to make a double-take on what the breaking changes are there, but i don't see any. it would also be cool to try and understand where this really minor perf regression comes from. for this i believe one would have to diff the ASM to really understand what's going on. but it's optional for merging IMO, because i do not see a perf regression in my own usage (https://github.com/untitaker/hyperlink) |
|
We can go forward with add-spans-2. Should I force push this PR or create a new one?
--- failure constructible_struct_adds_field: externally-constructible struct adds field ---
Since I introduced `ForwardingEmitter` I'm fine with giving `start_open_tag` a default body (because else using the default implementation breaks the spans).
(I pushed a commit to add-spans-2 and another commit from the change I wanted to do as well)
Description: Failed in: --- failure trait_method_added: pub trait method added --- Description: Failed in: --- failure trait_method_parameter_count_changed: pub trait method parameter count changed --- Description: Failed in: --- failure trait_requires_more_generic_type_params: trait now requires more generic type parameters --- Description: Failed in: --- failure type_requires_more_generic_type_params: type now requires more generic type parameters --- Description: Failed in: On my machine, |
|
Hey, I don't want to annoy you, I just want to ask briefly how/ if we can move this PR (or a new one) forward :) |
|
@Akida31 i think we can go forward with add-spans-2, don't care if you make a new PR or reuse this one. I don't mind the regressions |
|
superseded by #120 |
This adds, among other things, a reader parameter to all emitter methods which can be used by
CallbackEmitterto get spans from the reader.So this is definitely a breaking change :/
You can find some more unsquashed history at https://github.com/Akida31/html5gum/tree/add-spans-wip.
Fixes #10
I don't know whats up with the perf, but here are the results on my machine:
Perf before
Perf after