Skip to content

[lldb][AArch64] Correctly invalidate svg when vg is written #140875

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

DavidSpickett
Copy link
Collaborator

@DavidSpickett DavidSpickett commented May 21, 2025

Recently the Linux Kernel has fixed a bunch of issues in SME support and while testing that, I found two tests failing:
FAIL: test_za_register_dynamic_config_main_disabled (TestZAThreadedDynamic.AArch64ZAThreadedTestCase)
FAIL: test_za_register_dynamic_config_main_enabled (TestZAThreadedDynamic.AArch64ZAThreadedTestCase)

These tests write to vg during streaming mode from lldb and expect to see that za has been resized to match it. Instead, it was unavailable. lldb-server was sending the correct amount of data but lldb client was expecting the old size.

Turns out that instead of a write to vg invalidating svg, it was invalidating... something else. I'm still not sure how these tests ever worked but with this one line fix, they pass again.

I did not see this issue with SVE or streaming SVE Z registers because those always resize using the value of vg, and vg always has the value we just wrote.

(remember that vg is the vector length of the current mode, not of non-streaming mode, whereas svg is the vector length of streaming mode, even if you are currently in non-streaming mode)

Recently the Linux Kernel has fixed a bunch of issues in SME support
and while testing that, I found two tests failing:
FAIL: test_za_register_dynamic_config_main_disabled (TestZAThreadedDynamic.AArch64ZAThreadedTestCase)
FAIL: test_za_register_dynamic_config_main_enabled (TestZAThreadedDynamic.AArch64ZAThreadedTestCase)

These tests write to vg during streaming mode from lldb and expect to see that za has been
resized to match it. Instead, it was unavilable. lldb-server was sending
the correct amount of data but lldb client was expect the old size.

Turns out that instead of a write to vg invalidating svg, it was invalidating...
something else. I'm still not sure how these tests ever worked but
with this one line fix, they pass again.

I did not see this issue with SVE or streaming SVE Z registers because
those always resize using the value of vg, and vg always has the value we
just wrote.

(remember that vg is the vector length of the **current** mode,
not of non-streaming mode, whereas svg is the vector length of streaming
mode, even if you are currently in non-streaming mode)
@llvmbot
Copy link
Member

llvmbot commented May 21, 2025

@llvm/pr-subscribers-lldb

Author: David Spickett (DavidSpickett)

Changes

Recently the Linux Kernel has fixed a bunch of issues in SME support and while testing that, I found two tests failing: FAIL: test_za_register_dynamic_config_main_disabled (TestZAThreadedDynamic.AArch64ZAThreadedTestCase) FAIL: test_za_register_dynamic_config_main_enabled (TestZAThreadedDynamic.AArch64ZAThreadedTestCase)

These tests write to vg during streaming mode from lldb and expect to see that za has been resized to match it. Instead, it was unavilable. lldb-server was sending the correct amount of data but lldb client was expect the old size.

Turns out that instead of a write to vg invalidating svg, it was invalidating... something else. I'm still not sure how these tests ever worked but with this one line fix, they pass again.

I did not see this issue with SVE or streaming SVE Z registers because those always resize using the value of vg, and vg always has the value we just wrote.

(remember that vg is the vector length of the current mode, not of non-streaming mode, whereas svg is the vector length of streaming mode, even if you are currently in non-streaming mode)


Full diff: https://github.com/llvm/llvm-project/pull/140875.diff

1 Files Affected:

  • (modified) lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp (+1-1)
diff --git a/lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp b/lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
index c004c0f3c3cf5..f2095c150609f 100644
--- a/lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
+++ b/lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
@@ -425,7 +425,7 @@ void RegisterInfoPOSIX_arm64::AddRegSetSME(bool has_zt) {
   //
   // This must be added now, rather than when vg is defined because SME is a
   // dynamic set that may or may not be present.
-  static uint32_t vg_invalidates[] = {sme_regnum + 1 /*svg*/,
+  static uint32_t vg_invalidates[] = {first_sme_regnum + 1 /*svg*/,
                                       LLDB_INVALID_REGNUM};
   m_dynamic_reg_infos[GetRegNumSVEVG()].invalidate_regs = vg_invalidates;
 }

@DavidSpickett
Copy link
Collaborator Author

DavidSpickett commented May 21, 2025

Candidates for how it ever worked:

  • Out of bounds write managing to break something else in just the right way. Then adding new extension registers meant it was invalidating registers again, probably guarded control stack registers.
  • LLDB got less aggressive about re-reading registers. As you can "fix" this by re-reading svg a few times manually.
  • These tests were actually flaky because of this bug and I put that down to difficulties running in simulation instead of investigating properly.

In these new kernel patches there are some ptrace changes but nothing in lldb-server needed to change, and this mistake is so clear in hindsight, that I don't think it's a ptrace ABI difference.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants