Chapter 6 edits #73

Merged: 11 commits, Sep 24, 2024


dankamongmen (Contributor)

Some notes:

  • Listing 6-2: please don't cast the return value of malloc(). this is a comp.lang.c FAQ: https://c-faq.com/malloc/mallocnocast.html .
    If the code is C++, it shouldn't be using malloc().

  • 6.1.2: the reason there are no branch mispredictions in the SHA code is that cryptography code must carefully guard against data-dependent branching/cache behavior to defend against timing attacks. fascinating stuff. check out DJB's papers.

  • 6.1.4 (pg 108) is section 4.11 really the one you want to reference here? i'm not sure...?

  • recommendations to check things with dmesg are pretty bad imho. dmesg dumps a ring buffer into which the kernel prints over its lifetime. different output settings can filter messages from hitting there. what you almost certainly want is cpuid, cat /proc/cpuinfo, or rdmsr. furthermore, dmesg output is in no way a stable api, and it's not available to regular users depending on sysctls.
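
A minimal sketch of the first note above, assuming Listing 6-2 is meant to be compiled as C: the `void *` returned by `malloc()` converts implicitly to the target pointer type, so no cast is needed. The function name `make_sequence` is hypothetical and is not taken from the book's listing.

```c
#include <stdlib.h>

/* Hypothetical helper (not from the book): allocates an array of n ints
 * and fills it with 0..n-1. Returns NULL on allocation failure.
 * The caller owns the result and must free() it. */
int *make_sequence(size_t n) {
    /* No cast: in C, void * converts implicitly to int *.
     * sizeof *seq follows the pointee type if the declaration changes. */
    int *seq = malloc(n * sizeof *seq);
    if (seq == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        seq[i] = (int)i;
    }
    return seq;
}
```

This does not compile as C++, where the conversion from `void *` is not implicit. That matches the second half of the note: C++ code would normally allocate some other way rather than add the cast.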

@@ -1,14 +1,14 @@
### TMA On ARM Platforms

ARM CPU architects also have developed a TMA performance analysis methodology for their processors, which we will discuss next. ARM calls it "Topdown" in their documentation [@ARMNeoverseV1TopDown], so we will use their naming. At the time of writing this chapter (late 2023), Topdown is only supported on cores designed by ARM, e.g. Neoverse N1 and Neoverse V1, and their derivatives, e.g. Ampere Altra and AWS Graviton3. Refer to the list of major CPU microarchitectures at the end of this book if you need to refresh your memory on ARM chip families. Processors designed by Apple don't support the ARM Topdown performance analysis methodology yet.
Contributor Author


what you have here works, but it violates a very common idiom and sounds weird

Owner


Please elaborate on what is wrong and suggest how I can change it.

dendibakh (Owner)

> Some notes:
>
>   • Listing 6-2: please don't cast the return value of malloc(). this is a comp.lang.c FAQ: https://c-faq.com/malloc/mallocnocast.html .
>     If the code is C++, it shouldn't be using malloc().

Thanks. Will use new instead of malloc.

>   • 6.1.2: the reason there are no branch mispredictions in the SHA code is that cryptography code must carefully guard against data-dependent branching/cache behavior to defend against timing attacks. fascinating stuff. check out DJB's papers.

Good to know.

>   • 6.1.4 (pg 108) is section 4.11 really the one you want to reference here? i'm not sure...?

OK, I think I meant section 4.9.

>   • recommendations to check things with dmesg are pretty bad imho. dmesg dumps a ring buffer into which the kernel prints over its lifetime. different output settings can filter messages from hitting there. what you almost certainly want is cpuid, cat /proc/cpuinfo, or rdmsr. furthermore, dmesg output is in no way a stable api, and it's not available to regular users depending on sysctls.

It's been a long time since I looked at it, but cpuid may not be enough. It shows static information (decoded from CPUID), whereas dmesg also shows whether the feature is supported by the kernel. I didn't change this from the first edition. I think I borrowed it from some docs that recommended checking with dmesg. But yes, on long-running systems the required messages may no longer be in the buffer. That's a problem.
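
As an illustration of the `/proc/cpuinfo` alternative mentioned in the note, here is a hedged sketch in C that looks for a feature name on the `flags` line (x86) or `Features` line (ARM). The helper `cpuinfo_has_flag` and the example flag `arch_perfmon` are assumptions chosen for illustration; the feature the book actually checks may be reported differently, or not in `/proc/cpuinfo` at all.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `flag` appears as a whole word on a
 * "flags" (x86) or "Features" (ARM) line of the file at `path`,
 * 0 if it does not, and -1 if the file cannot be opened.
 * Assumes each line fits in the 8 KiB buffer; a longer line would be
 * read in pieces and a flag split across pieces would be missed. */
int cpuinfo_has_flag(const char *path, const char *flag) {
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        return -1;
    }

    char line[8192];
    int found = 0;
    while (!found && fgets(line, sizeof line, f) != NULL) {
        /* Only the feature-list lines are of interest. */
        if (strncmp(line, "flags", 5) != 0 &&
            strncmp(line, "Features", 8) != 0) {
            continue;
        }
        char *list = strchr(line, ':');
        if (list == NULL) {
            continue;
        }
        /* Compare whole tokens so that "perf" does not match "arch_perfmon". */
        for (char *tok = strtok(list + 1, " \t\n"); tok != NULL;
             tok = strtok(NULL, " \t\n")) {
            if (strcmp(tok, flag) == 0) {
                found = 1;
                break;
            }
        }
    }

    fclose(f);
    return found;
}
```

A call such as `cpuinfo_has_flag("/proc/cpuinfo", "arch_perfmon")` returns 1 when the flag is listed. Unlike `dmesg`, this file is normally readable by regular users and does not depend on the kernel ring buffer still holding boot-time messages.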


dendibakh merged commit 9feb18e into dendibakh:main on Sep 24, 2024
2 participants