Skip to content

[lldb][AArch64] Correctly invalidate svg when vg is written #140875

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 19, 2025

Conversation

DavidSpickett
Copy link
Collaborator

@DavidSpickett DavidSpickett commented May 21, 2025

Recently the Linux Kernel has fixed a bunch of issues in SME support and while testing that, I found two tests failing:
FAIL: test_za_register_dynamic_config_main_disabled (TestZAThreadedDynamic.AArch64ZAThreadedTestCase)
FAIL: test_za_register_dynamic_config_main_enabled (TestZAThreadedDynamic.AArch64ZAThreadedTestCase)

These tests write to vg during streaming mode from lldb and expect to see that za has been resized to match it. Instead, it was unavailable. lldb-server was sending the correct amount of data but lldb client was expecting the old size.

Turns out that instead of a write to vg invalidating svg, it was invalidating... something else. I'm still not sure how these tests ever worked but with this one line fix, they pass again.

I did not see this issue with SVE or streaming SVE Z registers because those always resize using the value of vg, and vg always has the value we just wrote.

(remember that vg is the vector length of the current mode, not of non-streaming mode, whereas svg is the vector length of streaming mode, even if you are currently in non-streaming mode)

Recently the Linux Kernel has fixed a bunch of issues in SME support
and while testing that, I found two tests failing:
FAIL: test_za_register_dynamic_config_main_disabled (TestZAThreadedDynamic.AArch64ZAThreadedTestCase)
FAIL: test_za_register_dynamic_config_main_enabled (TestZAThreadedDynamic.AArch64ZAThreadedTestCase)

These tests write to vg during streaming mode from lldb and expect to see that za has been
resized to match it. Instead, it was unavilable. lldb-server was sending
the correct amount of data but lldb client was expect the old size.

Turns out that instead of a write to vg invalidating svg, it was invalidating...
something else. I'm still not sure how these tests ever worked but
with this one line fix, they pass again.

I did not see this issue with SVE or streaming SVE Z registers because
those always resize using the value of vg, and vg always has the value we
just wrote.

(remember that vg is the vector length of the **current** mode,
not of non-streaming mode, whereas svg is the vector length of streaming
mode, even if you are currently in non-streaming mode)
@llvmbot
Copy link
Member

llvmbot commented May 21, 2025

@llvm/pr-subscribers-lldb

Author: David Spickett (DavidSpickett)

Changes

Recently the Linux Kernel has fixed a bunch of issues in SME support and while testing that, I found two tests failing: FAIL: test_za_register_dynamic_config_main_disabled (TestZAThreadedDynamic.AArch64ZAThreadedTestCase) FAIL: test_za_register_dynamic_config_main_enabled (TestZAThreadedDynamic.AArch64ZAThreadedTestCase)

These tests write to vg during streaming mode from lldb and expect to see that za has been resized to match it. Instead, it was unavilable. lldb-server was sending the correct amount of data but lldb client was expect the old size.

Turns out that instead of a write to vg invalidating svg, it was invalidating... something else. I'm still not sure how these tests ever worked but with this one line fix, they pass again.

I did not see this issue with SVE or streaming SVE Z registers because those always resize using the value of vg, and vg always has the value we just wrote.

(remember that vg is the vector length of the current mode, not of non-streaming mode, whereas svg is the vector length of streaming mode, even if you are currently in non-streaming mode)


Full diff: https://github.com/llvm/llvm-project/pull/140875.diff

1 Files Affected:

  • (modified) lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp (+1-1)
diff --git a/lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp b/lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
index c004c0f3c3cf5..f2095c150609f 100644
--- a/lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
+++ b/lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
@@ -425,7 +425,7 @@ void RegisterInfoPOSIX_arm64::AddRegSetSME(bool has_zt) {
   //
   // This must be added now, rather than when vg is defined because SME is a
   // dynamic set that may or may not be present.
-  static uint32_t vg_invalidates[] = {sme_regnum + 1 /*svg*/,
+  static uint32_t vg_invalidates[] = {first_sme_regnum + 1 /*svg*/,
                                       LLDB_INVALID_REGNUM};
   m_dynamic_reg_infos[GetRegNumSVEVG()].invalidate_regs = vg_invalidates;
 }

@DavidSpickett
Copy link
Collaborator Author

DavidSpickett commented May 21, 2025

Candidates for how it ever worked:

  • Out of bounds write managing to break something else in just the right way. Then adding new extension registers meant it was invalidating registers again, probably guarded control stack registers.
  • LLDB got less aggressive about re-reading registers. As you can "fix" this by re-reading svg a few times manually.
  • These tests were actually flaky because of this bug and I put that down to difficulties running in simulation instead of investigating properly.

In these new kernel patches there are some ptrace changes but nothing in lldb-server needed to change, and this mistake is so clear in hindsight, that I don't think it's a ptrace ABI difference.

@DavidSpickett
Copy link
Collaborator Author

ping!

1 similar comment
@DavidSpickett
Copy link
Collaborator Author

ping!

@jasonmolenda
Copy link
Collaborator

Sorry for missing this one earlier. I'm a little confused about earlier in this method,

for (uint32_t i = 0; i < k_num_sme_register; ++i, ++sme_regnum) {
    m_sme_regnum_collection.push_back(sme_regnum);
    m_dynamic_reg_infos.push_back(g_register_infos_sme[i]);
    m_dynamic_reg_infos[sme_regnum].byte_offset =
        m_dynamic_reg_infos[sme_regnum - 1].byte_offset +
        m_dynamic_reg_infos[sme_regnum - 1].byte_size;
    m_dynamic_reg_infos[sme_regnum].kinds[lldb::eRegisterKindLLDB] = sme_regnum;
  }
  lldb_private::RegisterSet sme_regset = g_reg_set_sme_arm64;
  if (has_zt) {
    m_sme_regnum_collection.push_back(sme_regnum);
    m_dynamic_reg_infos.push_back(g_register_infos_sme2[0]);
    m_dynamic_reg_infos[sme_regnum].byte_offset =
        m_dynamic_reg_infos[sme_regnum - 1].byte_offset +
        m_dynamic_reg_infos[sme_regnum - 1].byte_size;
    m_dynamic_reg_infos[sme_regnum].kinds[lldb::eRegisterKindLLDB] = sme_regnum;
    sme_regset.num_registers += 1;
  }

At the end of the loops, sme_regnum is the index of the final SME register we've added to m_dynamic_reg_infos. Then if we have the ZT0 register, we reuse that index a second time when adding ZT0 don't we. I don't know if it has anything to do with the issue at hand, but it seems like sme_regnum needs to be incremented if has_zt and we're adding one more to the vector.

@jasonmolenda
Copy link
Collaborator

Sorry for missing this one earlier. I'm a little confused about earlier in this method,

At the end of the loops, sme_regnum is the index of the final SME register we've added to m_dynamic_reg_infos. Then if we have the ZT0 register, we reuse that index a second time when adding ZT0 don't we. I don't know if it has anything to do with the issue at hand, but it seems like sme_regnum needs to be incremented if has_zt and we're adding one more to the vector.

lol of course I misunderstood the loop. After the last entry is added, the index and sme_regnum are incremented, then the comparison shows that we're beyond the number of sme registers, and it stops. With sme_regnum already positioned after the last sme reg.

@DavidSpickett
Copy link
Collaborator Author

lol of course I misunderstood the loop.

We didn't write it to be misleading but it sure does look like we did doesn't it.

One of these days I'll make this all data driven 🦄

Copy link
Collaborator

@jasonmolenda jasonmolenda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@DavidSpickett DavidSpickett merged commit 6273c5d into llvm:main Jun 19, 2025
7 checks passed
@DavidSpickett DavidSpickett deleted the lldb-svg branch June 19, 2025 08:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants