Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Out-of-bounds Read in BehaviorTree.CPP Script Validation #923

Open
cktii opened this issue Feb 3, 2025 · 1 comment
Open

Out-of-bounds Read in BehaviorTree.CPP Script Validation #923

cktii opened this issue Feb 3, 2025 · 1 comment
Assignees
Labels
bug Something isn't working

Comments

@cktii
Copy link

cktii commented Feb 3, 2025

When parsing invalid scripts that generate large error messages, the ValidateScript function may attempt to create a string from a non-NULL-terminated buffer, causing strlen to read beyond the buffer's bounds. This occurs because a fixed-size stack buffer is used to store error messages without ensuring proper NULL termination.

Result ValidateScript(const std::string& script)

Found in commit: 48f6c5b

Bug Class

Out-of-bounds Read / Buffer Over-read (CWE-125)

Root Cause

In src/script_parser.cpp, the ValidateScript function allocates a fixed-size buffer of 2048 bytes for error messages:

Result ValidateScript(const std::string& script)
{
  char error_msgs_buffer[2048];

  auto input = lexy::string_input<lexy::utf8_encoding>(script);
  auto result =
      lexy::parse<BT::Grammar::stmt>(input, ErrorReport().to(error_msgs_buffer));
  if(result.has_value() && result.error_count() == 0)
  {
    try
    {
      std::vector<BT::Ast::ExprBase::Ptr> exprs = LEXY_MOV(result).value();
      if(exprs.empty())
      {
        return nonstd::make_unexpected("Empty Script");
      }
      // valid script
      return {};
    }
    catch(std::runtime_error& err)
    {
      return nonstd::make_unexpected(err.what());
    }
  }
  return nonstd::make_unexpected(error_msgs_buffer);
}

}  // namespace BT

This buffer is passed to lexy's error reporter, which fills it with error messages. When parsing fails, the buffer content is used to construct a string via nonstd::make_unexpected.
The string construction internally calls strlen to determine the buffer length. However, there's no guarantee that:

  • The buffer is NULL-terminated
  • The error messages fit within the buffer
  • The last byte is reserved for a NULL terminator

This leads to strlen reading past the buffer bounds looking for a NULL terminator that isn't there.

Stack trace

==2543337==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffff5a09820 at pc 0x5555555ae3af bp 0x7fffffffcd10 sp 0x7fffffffc4d8
READ of size 2049 at 0x7ffff5a09820 thread T0
    #0 0x5555555ae3ae in strlen (/home/user/BehaviorTree.CPP/build3/script_fuzzer+0x5a3ae) (BuildId: 390cd92534c04116bddfa049abd4832d7b583113)
    #1 0x555555692d1c in std::char_traits<char>::length(char const*) /usr/lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/char_traits.h:399:9
    #2 0x555555679d3c in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::basic_string<std::allocator<char>>(char const*, std::allocator<char> const&) /usr/lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/basic_string.h:648:30
    #3 0x5555558ff01b in nonstd::expected_lite::expected<std::monostate, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>::expected<char*, 0>(nonstd::expected_lite::unexpected_type<char*>&&) /home/user/BehaviorTree.CPP/include/behaviortree_cpp/contrib/expected.hpp:1979:36
    #4 0x5555558fe109 in BT::ValidateScript(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) /home/user/BehaviorTree.CPP/src/script_parser.cpp:94:10
[...]

Reproducer

#include "behaviortree_cpp/scripting/script_parser.hpp"
#include <iostream>

int main()
{
  std::string script = "+6e66>6666.6+66\r6>6;6e62=6+6e66>66666'; en';o';o'; en'; "
                       "\177n\177r;6.6+66.H>6+6\2006\036;@e66";

  auto result = BT::ValidateScript(script);

  if(!result)
  {
    std::cout << "Script validation failed as expected" << std::endl;
  }

  return 0;
}

NOTE: This requires that libbheaviortree.so is compiled with -fsanitize=address.

Proposed fix

I'm by no means an expert for lexy, but given that lexy's parse function accepts a template callback for error handling:

/// Parses the production into a value, invoking the callback on error.
template <typename Production, typename Input, typename ErrorCallback>
constexpr auto parse(const Input& input, const ErrorCallback& callback)
{
    return parse_action<void, Input, ErrorCallback>(callback)(Production{}, input);
}

A recommended fix seems to implement a safe error callback that manages its own memory instead of a fix-sized stack-based buffer, something similar to:

class SafeErrorReport {
    std::string error_buffer;
    std::size_t count = 0;

    struct sink {
        std::string& buffer;
        std::size_t& count;

        template <typename Input, typename Reader, typename Tag>
        void operator()(const lexy::error_context<Input>& context,
                       const lexy::error<Reader, Tag>& error) {
            buffer += "error: while parsing ";
            buffer += context.production();
            buffer += "\n";
            count++;
        }

        std::size_t finish() && {
            return count;
        }
    };

public:
    auto sink() { return sink{error_buffer, count}; }
    const std::string& get_errors() const { return error_buffer; }
};

This is based on my understanding of lexy's error_report.hpp. That would turn ValidateScript into something like this:

Result ValidateScript(const std::string& script)
{
    auto input = lexy::string_input<lexy::utf8_encoding>(script);
    SafeErrorReport error_report;  // Replace char buffer with our safe handler
    
    auto result = lexy::parse<BT::Grammar::stmt>(input, error_report);
    
    if(!result.has_value() || result.error_count() != 0)
    {
        return nonstd::make_unexpected(error_report.get_errors());
    }

    try
    {
        std::vector<BT::Ast::ExprBase::Ptr> exprs = LEXY_MOV(result).value();
        if(exprs.empty())
        {
            return nonstd::make_unexpected("Empty Script");
        }
        // valid script
        return {};
    }
    catch(std::runtime_error& err)
    {
        return nonstd::make_unexpected(err.what());
    }
}

NOTE: Not 100% sure on this fix, just loosely based on my quick understand of lexy's code..

Impact

The vulnerability could allow an attacker to cause out-of-bounds memory reads by providing specially crafted invalid scripts. While this doesn't lead to direct code execution, it could potentially:

  • Lead to information disclosure if sensitive data is stored adjacent to the buffer
  • Cause program crashes (DoS)
@cktii
Copy link
Author

cktii commented Feb 3, 2025

Reproducer that doesn't require ASAN

#include "behaviortree_cpp/scripting/script_parser.hpp"
#include <iostream>
#include <string>

int main()
{
  std::string script;
  for(int i = 0; i < 10; i++)
  {
    script += "+6e66>6666.6+66\r6>6;6e62=6+6e66>66666'; en';o';o'; en'; "
              "\177n\177r;6.6+66.H>6+6\2006\036;@e66";
  }

  volatile auto result = BT::ValidateScript(script);
  return 0;
}

That first reproducer was a single-byte OOB-R and required the library to be compiled with AdressSanitizer. This second one crashes the normal Release type library and causes a segmentation fault.

@facontidavide facontidavide self-assigned this Feb 9, 2025
@facontidavide facontidavide added the bug Something isn't working label Feb 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants