Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libc++] Fix basic_string not allowing max_size() elements to be stored #125423

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

philnik777
Copy link
Contributor

Without this patch basic_string cannot be properly resized to be max_size() elements in size, even if an allocation is successful. __grow_by allocates one less element than required, resulting in an out-of-bounds access. At the same time, max_size() has an off-by-one error, since there has to be space to store the null terminator, which is currently ignored.

@philnik777 philnik777 requested a review from a team as a code owner February 2, 2025 17:07
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Feb 2, 2025
@llvmbot
Copy link
Member

llvmbot commented Feb 2, 2025

@llvm/pr-subscribers-libcxx

Author: Nikolas Klauser (philnik777)

Changes

Without this patch basic_string cannot be properly resized to be max_size() elements in size, even if an allocation is successful. __grow_by allocates one less element than required, resulting in an out-of-bounds access. At the same time, max_size() has an off-by-one error, since there has to be space to store the null terminator, which is currently ignored.


Full diff: https://github.com/llvm/llvm-project/pull/125423.diff

5 Files Affected:

  • (modified) libcxx/include/string (+3-3)
  • (modified) libcxx/test/libcxx/strings/basic.string/string.capacity/max_size.pass.cpp (+10-10)
  • (modified) libcxx/test/std/strings/basic.string/string.capacity/max_size.pass.cpp (+9-1)
  • (modified) libcxx/test/std/strings/basic.string/string.capacity/resize_size.pass.cpp (+27-17)
  • (modified) libcxx/test/support/min_allocator.h (+28)
diff --git a/libcxx/include/string b/libcxx/include/string
index fdd8085106dcc6..892ae97518f719 100644
--- a/libcxx/include/string
+++ b/libcxx/include/string
@@ -1305,10 +1305,10 @@ public:
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 size_type max_size() const _NOEXCEPT {
     size_type __m = __alloc_traits::max_size(__alloc_);
     if (__m <= std::numeric_limits<size_type>::max() / 2) {
-      return __m - __alignment;
+      return __m - __alignment - 1;
     } else {
       bool __uses_lsb = __endian_factor == 2;
-      return __uses_lsb ? __m - __alignment : (__m / 2) - __alignment;
+      return __uses_lsb ? __m - __alignment - 1 : (__m / 2) - __alignment - 1;
     }
   }
 
@@ -2558,7 +2558,7 @@ _LIBCPP_DEPRECATED_("use __grow_by_without_replace") basic_string<_CharT, _Trait
     __throw_length_error();
   pointer __old_p = __get_pointer();
   size_type __cap =
-      __old_cap < __ms / 2 - __alignment ? __recommend(std::max(__old_cap + __delta_cap, 2 * __old_cap)) : __ms - 1;
+      __old_cap < __ms / 2 - __alignment ? __recommend(std::max(__old_cap + __delta_cap, 2 * __old_cap)) : __ms;
   auto __allocation = std::__allocate_at_least(__alloc_, __cap + 1);
   pointer __p       = __allocation.ptr;
   __begin_lifetime(__p, __allocation.count);
diff --git a/libcxx/test/libcxx/strings/basic.string/string.capacity/max_size.pass.cpp b/libcxx/test/libcxx/strings/basic.string/string.capacity/max_size.pass.cpp
index 726570beb6d1ae..e097c70ec79a7c 100644
--- a/libcxx/test/libcxx/strings/basic.string/string.capacity/max_size.pass.cpp
+++ b/libcxx/test/libcxx/strings/basic.string/string.capacity/max_size.pass.cpp
@@ -23,44 +23,44 @@ static const std::size_t alignment = 8;
 template <class = int>
 TEST_CONSTEXPR_CXX20 void full_size() {
   std::string str;
-  assert(str.max_size() == std::numeric_limits<std::size_t>::max() - alignment);
+  assert(str.max_size() == std::numeric_limits<std::size_t>::max() - alignment - 1);
 
 #ifndef TEST_HAS_NO_CHAR8_T
   std::u8string u8str;
-  assert(u8str.max_size() == std::numeric_limits<std::size_t>::max() - alignment);
+  assert(u8str.max_size() == std::numeric_limits<std::size_t>::max() - alignment - 1);
 #endif
 
 #ifndef TEST_HAS_NO_WIDE_CHARACTERS
   std::wstring wstr;
-  assert(wstr.max_size() == std::numeric_limits<std::size_t>::max() / sizeof(wchar_t) - alignment);
+  assert(wstr.max_size() == std::numeric_limits<std::size_t>::max() / sizeof(wchar_t) - alignment - 1);
 #endif
 
   std::u16string u16str;
   std::u32string u32str;
-  assert(u16str.max_size() == std::numeric_limits<std::size_t>::max() / 2 - alignment);
-  assert(u32str.max_size() == std::numeric_limits<std::size_t>::max() / 4 - alignment);
+  assert(u16str.max_size() == std::numeric_limits<std::size_t>::max() / 2 - alignment - 1);
+  assert(u32str.max_size() == std::numeric_limits<std::size_t>::max() / 4 - alignment - 1);
 }
 
 template <class = int>
 TEST_CONSTEXPR_CXX20 void half_size() {
   std::string str;
-  assert(str.max_size() == std::numeric_limits<std::size_t>::max() / 2 - alignment);
+  assert(str.max_size() == std::numeric_limits<std::size_t>::max() / 2 - alignment - 1);
 
 #ifndef TEST_HAS_NO_CHAR8_T
   std::u8string u8str;
-  assert(u8str.max_size() == std::numeric_limits<std::size_t>::max() / 2 - alignment);
+  assert(u8str.max_size() == std::numeric_limits<std::size_t>::max() / 2 - alignment - 1);
 #endif
 
 #ifndef TEST_HAS_NO_WIDE_CHARACTERS
   std::wstring wstr;
   assert(wstr.max_size() ==
-         std::numeric_limits<std::size_t>::max() / std::max<size_t>(2ul, sizeof(wchar_t)) - alignment);
+         std::numeric_limits<std::size_t>::max() / std::max<size_t>(2ul, sizeof(wchar_t)) - alignment - 1);
 #endif
 
   std::u16string u16str;
   std::u32string u32str;
-  assert(u16str.max_size() == std::numeric_limits<std::size_t>::max() / 2 - alignment);
-  assert(u32str.max_size() == std::numeric_limits<std::size_t>::max() / 4 - alignment);
+  assert(u16str.max_size() == std::numeric_limits<std::size_t>::max() / 2 - alignment - 1);
+  assert(u32str.max_size() == std::numeric_limits<std::size_t>::max() / 4 - alignment - 1);
 }
 
 TEST_CONSTEXPR_CXX20 bool test() {
diff --git a/libcxx/test/std/strings/basic.string/string.capacity/max_size.pass.cpp b/libcxx/test/std/strings/basic.string/string.capacity/max_size.pass.cpp
index b9ffffc0993af8..ff38a59cb00e58 100644
--- a/libcxx/test/std/strings/basic.string/string.capacity/max_size.pass.cpp
+++ b/libcxx/test/std/strings/basic.string/string.capacity/max_size.pass.cpp
@@ -53,7 +53,7 @@ TEST_CONSTEXPR_CXX20 void test_resize_max_size(const S& s) {
   } catch (const std::bad_alloc&) {
     return;
   }
-  assert(s.size() == sz);
+  assert(s2.size() == sz);
 }
 
 template <class S>
@@ -91,8 +91,16 @@ TEST_CONSTEXPR_CXX20 bool test() {
   test_string<std::string>();
 #if TEST_STD_VER >= 11
   test_string<std::basic_string<char, std::char_traits<char>, min_allocator<char> > >();
+  test_string<std::basic_string<char, std::char_traits<char>, tiny_size_allocator<64, char> > >();
 #endif
 
+  { // Test resizing where we can assume that the allocation succeeds
+    std::basic_string<char, std::char_traits<char>, tiny_size_allocator<32, char> > str;
+    auto max_size = str.max_size();
+    str.resize(max_size);
+    assert(str.size() == max_size);
+  }
+
   return true;
 }
 
diff --git a/libcxx/test/std/strings/basic.string/string.capacity/resize_size.pass.cpp b/libcxx/test/std/strings/basic.string/string.capacity/resize_size.pass.cpp
index 7cf4b7ca3b6efd..99b654cf6b6a51 100644
--- a/libcxx/test/std/strings/basic.string/string.capacity/resize_size.pass.cpp
+++ b/libcxx/test/std/strings/basic.string/string.capacity/resize_size.pass.cpp
@@ -20,22 +20,10 @@
 
 template <class S>
 TEST_CONSTEXPR_CXX20 void test(S s, typename S::size_type n, S expected) {
-  if (n <= s.max_size()) {
-    s.resize(n);
-    LIBCPP_ASSERT(s.__invariants());
-    assert(s == expected);
-    LIBCPP_ASSERT(is_string_asan_correct(s));
-  }
-#ifndef TEST_HAS_NO_EXCEPTIONS
-  else if (!TEST_IS_CONSTANT_EVALUATED) {
-    try {
-      s.resize(n);
-      assert(false);
-    } catch (std::length_error&) {
-      assert(n > s.max_size());
-    }
-  }
-#endif
+  s.resize(n);
+  LIBCPP_ASSERT(s.__invariants());
+  assert(s == expected);
+  LIBCPP_ASSERT(is_string_asan_correct(s));
 }
 
 template <class S>
@@ -56,7 +44,26 @@ TEST_CONSTEXPR_CXX20 void test_string() {
   test(S("12345678901234567890123456789012345678901234567890"),
        60,
        S("12345678901234567890123456789012345678901234567890\0\0\0\0\0\0\0\0\0\0", 60));
-  test(S(), S::npos, S("not going to happen"));
+}
+
+template <class CharT>
+TEST_CONSTEXPR_CXX20 void test_max_size() {
+#ifndef TEST_HAS_NO_EXCEPTIONS
+  if (!TEST_IS_CONSTANT_EVALUATED) {
+    std::basic_string<CharT> str;
+    try {
+      str.resize(std::string::npos);
+      assert(false);
+    } catch (const std::length_error&) {
+    }
+  }
+#endif
+
+  {
+    std::basic_string<CharT, std::char_traits<CharT>, tiny_size_allocator<32, CharT>> str;
+    str.resize(str.max_size());
+    assert(str.size() == str.max_size());
+  }
 }
 
 TEST_CONSTEXPR_CXX20 bool test() {
@@ -66,6 +73,9 @@ TEST_CONSTEXPR_CXX20 bool test() {
   test_string<std::basic_string<char, std::char_traits<char>, safe_allocator<char>>>();
 #endif
 
+  test_max_size<char>();
+  test_max_size<wchar_t>();
+
   return true;
 }
 
diff --git a/libcxx/test/support/min_allocator.h b/libcxx/test/support/min_allocator.h
index 18f51f8072640d..f050267dc4ef75 100644
--- a/libcxx/test/support/min_allocator.h
+++ b/libcxx/test/support/min_allocator.h
@@ -480,4 +480,32 @@ class safe_allocator {
   TEST_CONSTEXPR_CXX20 friend bool operator!=(safe_allocator x, safe_allocator y) { return !(x == y); }
 };
 
+template <std::size_t MaxSize, class T>
+struct tiny_size_allocator {
+  using value_type = T;
+  using size_type = unsigned;
+
+  template <class U>
+  struct rebind {
+    using other = tiny_size_allocator<MaxSize, T>;
+  };
+
+  tiny_size_allocator() = default;
+
+  template <class U>
+  TEST_CONSTEXPR_CXX20 tiny_size_allocator(tiny_size_allocator<MaxSize, U>) {}
+
+  TEST_CONSTEXPR_CXX20 T* allocate(std::size_t n) {
+    assert(n <= MaxSize);
+    return std::allocator<T>{}.allocate(n);
+  }
+
+  TEST_CONSTEXPR_CXX20 void deallocate(T* ptr, std::size_t n) { std::allocator<T>{}.deallocate(ptr, n); }
+
+  TEST_CONSTEXPR_CXX20 size_type max_size() const { return MaxSize; }
+
+  friend bool operator==(tiny_size_allocator, tiny_size_allocator) { return true; }
+  friend bool operator!=(tiny_size_allocator, tiny_size_allocator) { return false; }
+};
+
 #endif // MIN_ALLOCATOR_H

Copy link

github-actions bot commented Feb 2, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Member

@mordante mordante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general happy with the fix, one comment.

} else {
bool __uses_lsb = __endian_factor == 2;
return __uses_lsb ? __m - __alignment : (__m / 2) - __alignment;
return __uses_lsb ? __m - __alignment - 1 : (__m / 2) - __alignment - 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is an observable change I'd like a note in the release notes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bug fix as every other one is, and we usually don't release note them. What's different here?

@philnik777 philnik777 force-pushed the fix_max_size branch 2 times, most recently from 182797e to ffe6b6d Compare February 19, 2025 13:34
@ldionne ldionne added the pending-ci Merging the PR is only pending completion of CI label Feb 19, 2025
@@ -1302,10 +1302,10 @@ public:
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 size_type max_size() const _NOEXCEPT {
size_type __m = __alloc_traits::max_size(__alloc_);
if (__m <= std::numeric_limits<size_type>::max() / 2) {
return __m - __alignment;
return ((__m - __alignment) & ~size_type(__endian_factor - 1)) - 1;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has changed in the latest revision. Previously the capacity would be one too low when the max_size() is even with __endian_factor == 2. In that mode we require the capacity (including the null terminator) to always be even, which is violated with max_size() (which returns the size without the null terminator) returning an even number.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the little table I did for my understanding:

endian factor = 1
-----------------
factor-1 = 0000000000000000000000
~        = 1111111111111111111111
m-align  = xxxxxxxxxxxxxxxxxxxxxx
&        = xxxxxxxxxxxxxxxxxxxxxx
So the result is unchanged

endian factor = 2
-----------------
factor-1 = 0000000000000000000001
~        = 1111111111111111111110
m-align  = xxxxxxxxxxxxxxxxxxxxxx
&        = xxxxxxxxxxxxxxxxxxxxx0
So the bitand is always even, and the end result is always odd because we subtract 1

Since __endian_factor is a compile-time constant we should be able to rewrite like this without impacting codegen (IMO this is easier to follow):

if (__m <= std::numeric_limits<size_type>::max() / 2) {
  size_type __res = __m - __alignment;
  if (__endian_factor == 2)
    __res &= ~size_type(1); // make even, explain why in a short comment
  __res -= 1; // max_size() doesn't include the null terminator
}

WDYT?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. pending-ci Merging the PR is only pending completion of CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants