Chapter 6 edits #73

dankamongmen · 2024-09-11T19:25:25Z

Some notes:

- Listing 6-2: please don't cast the return value of malloc(). this is a comp.lang.c FAQ: https://c-faq.com/malloc/mallocnocast.html . If the code is C++, it shouldn't be using malloc().
- 6.1.2: the reason there are no branch mispredictions in the SHA code is that cryptography code must carefully guard against dynamic branching/cache behavior to defend against timing attacks. fascinating stuff. check out DJB's papers.
- 6.1.4 (pg 108): is section 4.11 really the one you want to reference here? i'm not sure.
- recommendations to check things with dmesg are pretty bad imho. dmesg dumps a ring buffer into which the kernel prints over its lifetime. different output settings can filter messages from hitting there. what you almost certainly want is cpuid, cat /proc/cpuinfo, or rdmsr. furthermore, dmesg output is in no way a stable API, and depending on sysctls it's not available to regular users.

dankamongmen · 2024-09-11T19:27:25Z

chapters/6-CPU-Features-For-Performance-Analysis/6-4 TMA-ARM.md

@@ -1,14 +1,14 @@
 ### TMA On ARM Platforms

-ARM CPU architects also have developed a TMA performance analysis methodology for their processors, which we will discuss next. ARM calls it "Topdown" in their documentation [@ARMNeoverseV1TopDown], so we will use their naming. At the time of writing this chapter (late 2023), Topdown is only supported on cores designed by ARM, e.g. Neoverse N1 and Neoverse V1, and their derivatives, e.g. Ampere Altra and AWS Graviton3. Refer to the list of major CPU microarchitectures at the end of this book if you need to refresh your memory on ARM chip families. Processors designed by Apple don't support the ARM Topdown performance analysis methodology yet.


what you have here works, but it violates a very common idiom and sounds weird

dendibakh · 2024-09-24T17:12:45Z

> Some notes:
>
> Listing 6-2: please don't cast the return value of malloc(). this is a comp.lang.c FAQ: https://c-faq.com/malloc/mallocnocast.html .
> If the code is C++, it shouldn't be using malloc().

Thanks. Will use `new` instead of `malloc`.
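
A minimal sketch of what this change could look like. Listing 6-2 is not reproduced in this thread, so the element type and the helper names `make_buffer_malloc` and `make_buffer_new` are hypothetical stand-ins, not the book's code. In C, the FAQ's advice amounts to writing `int *p = malloc(n * sizeof *p);` with no cast; in C++ the cast cannot be dropped, which is one argument for avoiding `malloc()` there.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for the allocation in Listing 6-2 (the real listing
// is not shown in this thread).

// C-style allocation compiled as C++: there is no implicit conversion from
// void*, so a cast is required. The caller must release it with std::free().
int* make_buffer_malloc(std::size_t n) {
  return static_cast<int*>(std::malloc(n * sizeof(int)));
}

// C++ replacement: new[] is typed, so no cast is needed. Wrapping the result
// in std::unique_ptr<int[]> releases the memory with delete[] automatically.
std::unique_ptr<int[]> make_buffer_new(std::size_t n) {
  return std::unique_ptr<int[]>(new int[n]());  // () zero-initializes
}
```

Whether a raw `new[]`, `std::unique_ptr`, or `std::vector` fits best depends on how the listing uses the buffer; the sketch only shows that the cast disappears.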

> 6.1.2: the reason there are no branch mispredictions in the SHA code is that cryptography code must carefully guard against dynamic branching/cache behavior to defend against timing attacks. fascinating stuff. check out DJB's papers.

Good to know.
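
To illustrate the point about data-dependent branches, here is a small sketch that is not taken from the chapter or from any SHA implementation. Both function names are hypothetical. The first comparison exits as soon as a byte differs, so its running time and branch behavior depend on the data; the second always processes every byte and has no data-dependent branch in the loop body.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Early-exit comparison: the `if` inside the loop depends on the data, so the
// running time reveals how many leading bytes matched.
bool equal_early_exit(const std::uint8_t* a, const std::uint8_t* b,
                      std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    if (a[i] != b[i]) return false;
  }
  return true;
}

// Constant-time style comparison: every byte is visited, and differences are
// accumulated with XOR/OR instead of a branch. The only loop branch is the
// counter check, which depends on n and not on the contents.
bool equal_constant_time(const std::uint8_t* a, const std::uint8_t* b,
                         std::size_t n) {
  std::uint8_t diff = 0;
  for (std::size_t i = 0; i < n; ++i) {
    diff = static_cast<std::uint8_t>(diff | (a[i] ^ b[i]));
  }
  return diff == 0;
}
```

This is only a sketch of the idea: a compiler is free to transform such code, so production cryptography relies on vetted library primitives rather than hand-written loops like this one.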

> 6.1.4 (pg 108): is section 4.11 really the one you want to reference here? i'm not sure.

OK, I think I meant section 4.9.

> recommendations to check things with dmesg are pretty bad imho. dmesg dumps a ring buffer into which the kernel prints over its lifetime. different output settings can filter messages from hitting there. what you almost certainly want is cpuid, cat /proc/cpuinfo, or rdmsr. furthermore, dmesg output is in no way a stable API, and depending on sysctls it's not available to regular users.

It's been a long time since I looked at this, but cpuid may not be enough. It shows static information (decoded from the CPUID instruction), whereas dmesg also shows whether the kernel supports the feature. I didn't change this text from the first edition; I think I borrowed it from some docs that recommended checking with dmesg. But yes, on long-running systems the required messages may no longer be in the buffer. That's a problem.
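
As one possible alternative to scanning dmesg, the `/proc/cpuinfo` suggestion could be sketched as below. The helper names `has_cpu_flag` and `host_has_cpu_flag` are hypothetical, the parser assumes the x86 Linux layout (a line starting with `flags`, then a colon, then space-separated tokens), and which flag name corresponds to a given feature is left to the caller. It does not answer the separate question raised above of whether the kernel's perf support for a feature is actually enabled.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Returns true if `flag` appears as a whole token on the first "flags" line
// of a /proc/cpuinfo-style stream (x86 Linux format assumed). Only the first
// logical CPU's entry is inspected.
bool has_cpu_flag(std::istream& cpuinfo, const std::string& flag) {
  std::string line;
  while (std::getline(cpuinfo, line)) {
    if (line.compare(0, 5, "flags") != 0) continue;  // not the flags line
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos) continue;
    std::istringstream tokens(line.substr(colon + 1));
    std::string token;
    while (tokens >> token) {
      if (token == flag) return true;  // whole-token match, not substring
    }
    return false;
  }
  return false;
}

// Convenience wrapper for the real file; returns false if it cannot be opened.
bool host_has_cpu_flag(const std::string& flag) {
  std::ifstream cpuinfo("/proc/cpuinfo");
  return cpuinfo && has_cpu_flag(cpuinfo, flag);
}
```

Matching whole tokens matters because many flag names are prefixes of others, so a plain substring search would report false positives.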

dendibakh · 2024-09-24T17:22:08Z

chapters/6-CPU-Features-For-Performance-Analysis/6-4 TMA-ARM.md

Please elaborate on what is wrong and suggest how I can change it.


dankamongmen added 10 commits September 11, 2024 14:37

- 6-0: can be leveraged (0f4857f)
- 5-1: spaces in step titles (f0ac94b)
- 6.1.2: small changes (91503a2)
- 6.1.3: number disagreements (6e00fe7)
- blah blah (effe496)
- LBR is pretty clean (d9e05d6)
- PEBS (c0966f1)
- kill wayward comma (8760d81)
- apostrophe for grouping, no (f0e2350)
- 6-summary: make last item an actual item (e5b5c8e)

dendibakh approved these changes Sep 24, 2024

Update chapters/6-CPU-Features-For-Performance-Analysis/6-4 TMA-ARM.md

ba55c99

dendibakh merged commit 9feb18e into dendibakh:main Sep 24, 2024
