PT trace only the executable segment of the main object. #1714
Merged
ltratt merged 1 commit into ykjit:master on May 31, 2025
Contributor (Author):
Because this has been a long-lived branch, I've been doing many merge commits. Do you mind if I force push freshly rebased changes to make it easier to review?
Contributor:
Please force push.
deltablue (benchmark that spends a lot of time mapping):
- time spent mapping reduced on the order of ~10x
- overall performance improved by about 15%

This change means that we can no longer spot longjmp in code outside of the traced range. Whereas before we'd spot calls to longjmp when the PT decoder was disassembling foreign code, we now do some basic "control flow integrity" checking in the trace builder, checking that each mappable block is either: the entry block of a function, a return block from a function, or a static successor of the last mappable block we processed.

(Annoyingly, this means we have to keep track of the previous block ID (which may be unmappable) *and* the previous mappable block ID separately. Lukas and I think we could probably do away with unmappable blocks, and thus also the former, but that's another story.)

Although to detect longjmp this way we would only need to check this property when we return from a function, it turns out to be simpler to implement for every mappable block (return or not), and arguably it's a really good sanity check that we should probably have had in place from the start.

An added bonus is that this approach is tracer-agnostic, so the longjmp tests work for software tracing too.

I had hoped to detect signal handlers this way too, but unfortunately I can't see a universally unambiguous way to distinguish them from regular calls. Since signal handlers weren't really the aim of this PR, I won't let this hold us up. For now I've added an ignored test showing the issue: currently signal handlers that have IR get inlined.
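As a rough illustration of the "control flow integrity" property described above, here is a minimal Rust sketch. It is not yk's trace builder: `Block`, `block`, `check_trace`, and the field names are hypothetical, and it assumes that "return block" means a block that control re-enters when a callee returns. Unmappable blocks are modelled as `None` and skipped, so only the previous *mappable* block is tracked.

```rust
/// Hypothetical static description of one mappable block (not yk's real type).
#[derive(Debug, Clone)]
pub struct Block {
    /// True if this is the entry block of its function.
    pub is_entry: bool,
    /// Assumption: true if control can arrive here because a callee returned.
    pub is_return_target: bool,
    /// Indices of the blocks that statically succeed this one.
    pub succs: Vec<usize>,
}

/// Convenience constructor for building small examples.
pub fn block(is_entry: bool, is_return_target: bool, succs: &[usize]) -> Block {
    Block {
        is_entry,
        is_return_target,
        succs: succs.to_vec(),
    }
}

/// Walk a trace of block indices (`None` stands for an unmappable block) and
/// check that every mappable block after the first is a function entry, a
/// return target, or a static successor of the previous mappable block.
/// Returns `Err(pos)` with the trace position of the first violation.
pub fn check_trace(blocks: &[Block], trace: &[Option<usize>]) -> Result<(), usize> {
    let mut prev_mappable: Option<usize> = None;
    for (pos, entry) in trace.iter().enumerate() {
        // Unmappable blocks carry no static information, so skip them.
        let Some(cur) = *entry else { continue };
        let blk = &blocks[cur];
        let ok = match prev_mappable {
            // The first mappable block has nothing to be compared against.
            None => true,
            Some(prev) => {
                blk.is_entry || blk.is_return_target || blocks[prev].succs.contains(&cur)
            }
        };
        if !ok {
            return Err(pos);
        }
        prev_mappable = Some(cur);
    }
    Ok(())
}
```

In this sketch, a trace that jumps from one block to an unrelated block that is neither an entry nor a return target is rejected at the offending position, which is the kind of discontinuity a longjmp would produce. How the real trace builder reacts to such a violation is not shown here.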
Contributor (Author):
Force pushed and updated the PR description.
Requires: ykjit/ykllvm#263
Contributor (Author):
Assigning @ptersilie too, as he may have opinions on the trace builder parts.