Chapter 6 edits #73
chapters/6-CPU-Features-For-Performance-Analysis/6-7 Precise Event Based Sampling (PEBS).md
@@ -1,14 +1,14 @@
### TMA On ARM Platforms

ARM CPU architects also have developed a TMA performance analysis methodology for their processors, which we will discuss next. ARM calls it "Topdown" in their documentation [@ARMNeoverseV1TopDown], so we will use their naming. At the time of writing this chapter (late 2023), Topdown is only supported on cores designed by ARM, e.g. Neoverse N1 and Neoverse V1, and their derivatives, e.g. Ampere Altra and AWS Graviton3. Refer to the list of major CPU microarchitectures at the end of this book if you need to refresh your memory on ARM chip families. Processors designed by Apple don't support the ARM Topdown performance analysis methodology yet.
what you have here works, but it violates a very common idiom and sounds weird
Please elaborate on what is wrong and suggest how I can change it.
chapters/6-CPU-Features-For-Performance-Analysis/6-4 TMA-ARM.md
Thanks. Will use
Good to know.
Ok, I think I meant section 4.9
It's been a long time since I looked at it, but
Some notes:

- Listing 6-2: please don't cast the return value of `malloc()`. this is a comp.lang.c FAQ: https://c-faq.com/malloc/mallocnocast.html. If the code is C++, it shouldn't be using `malloc()`.
- 6.1.2: the reason there are no branch mispredictions in the SHA code is because cryptography code must carefully guard against dynamic branching/cache behavior to defend against timing attacks. fascinating stuff. check out DJB's papers.
- 6.1.4 (pg 108): is section 4.11 really the one you want to reference here? i'm not sure...?
- recommendations to check things with `dmesg` are pretty bad imho. `dmesg` dumps a ring buffer into which the kernel prints over its lifetime. different output settings can filter messages from hitting there. what you almost certainly want is `cpuid`, `cat /proc/cpuinfo`, or `rdmsr`. furthermore, `dmesg` output is in no way a stable api, and it's not available to regular users depending on sysctls.
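
To illustrate the `malloc()` note, here is a minimal C sketch of the uncast form. `make_sequence` is a made-up helper for illustration only; it is not the code from Listing 6-2, which isn't reproduced in this thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from the book): allocate n ints holding 0..n-1.
   The caller owns the buffer and must free() it. Returns NULL on failure. */
int *make_sequence(size_t n) {
    assert(n > 0); /* precondition: a zero-sized request is a caller bug here */

    /* No cast: in C, void * converts implicitly to any object pointer type.
       Writing sizeof *seq keeps the size correct if the element type changes. */
    int *seq = malloc(n * sizeof *seq);
    if (seq == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++)
        seq[i] = (int)i;
    return seq;
}
```

The FAQ's argument is that the cast adds nothing in C and can hide a missing `#include <stdlib.h>` on older compilers. In C++ the implicit conversion doesn't exist, which is one more reason to reach for `new` or a container there instead.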
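
For the `dmesg` note, a rough sketch of what a programmatic check against `/proc/cpuinfo` could look like. `cpuinfo_has_flag` is a hypothetical helper, and it assumes Linux, where x86 kernels list features on a `flags` line and ARM kernels on a `Features` line. Which flag name to look for (e.g. `pebs`) is also an assumption to verify on your own machine.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `flag` appears as a whole token on a
   "flags" (x86) or "Features" (ARM) line of a cpuinfo-style file,
   0 if it does not, and -1 if the file cannot be opened.
   Limitation: lines longer than the buffer are read in pieces. */
int cpuinfo_has_flag(const char *path, const char *flag) {
    assert(path != NULL && flag != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    char line[8192];
    int found = 0;
    while (!found && fgets(line, sizeof line, f) != NULL) {
        /* Only the feature-list lines are of interest. */
        if (strncmp(line, "flags", 5) != 0 &&
            strncmp(line, "Features", 8) != 0)
            continue;

        char *list = strchr(line, ':');
        if (list == NULL)
            continue;

        /* Compare whole whitespace-separated tokens, not substrings. */
        for (char *tok = strtok(list + 1, " \t\n"); tok != NULL;
             tok = strtok(NULL, " \t\n")) {
            if (strcmp(tok, flag) == 0) {
                found = 1;
                break;
            }
        }
    }
    fclose(f);
    return found;
}
```

A call would look like `cpuinfo_has_flag("/proc/cpuinfo", "pebs")`. Unlike scraping `dmesg`, this reads a file that regular users can normally access and that doesn't depend on what is still in the kernel ring buffer.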