-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
parser for function calling #143
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx for the contribution, please finish the Python part and tests
cpp/function_calling.cc
Outdated
|
||
while (i < length) { | ||
// <function> tags | ||
if (i + 10 < length && message_body.substr(i, 10) == "<function=") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use std::string_view(s).substr(i, 10) to avoid allocating new memory
cpp/function_calling.cc
Outdated
/******************* Parser *******************/ | ||
|
||
std::vector<std::pair<std::string, std::unordered_map<std::string, std::string>>> parse_message( | ||
const std::string& message_body |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a param bool ignore_error, default false
if true, skip all error messages (just like now).
Otherwise, raise ValueError (ie. XGRAMMAR_CHECK or XGRAMMAR_LOG(FATAL)) at all error places (ie the "continue" places in the current code)
cpp/function_calling.cc
Outdated
|
||
picojson::value v; | ||
std::string err = picojson::parse(v, params_str); | ||
if (!err.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if error is not empty or v is not an object
cpp/function_calling.cc
Outdated
continue; | ||
} | ||
|
||
if (v.is<picojson::object>()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we shall check error first, instead of putting long piece of code in a nested scope. So
if (!...) { raise error; }
is better than
if (...) {
...
} else {
raise error;
}
cpp/function_calling.cc
Outdated
|
||
i = params_end + 11; | ||
} | ||
// json format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add more detailed comments; provide an example
cpp/function_calling.cc
Outdated
auto name_it = obj.find("name"); | ||
auto params_it = obj.find("parameters"); | ||
|
||
if (name_it != obj.end() && params_it != obj.end() && name_it->second.is<std::string>() && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if not, report error or continue
cpp/function_calling.cc
Outdated
int bracket_count = 1; | ||
size_t j = i + 1; | ||
|
||
while (j < length && bracket_count > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer for loop than while
cpp/function_calling.cc
Outdated
|
||
std::unordered_map<std::string, std::string> params; | ||
for (const auto& pair : params_obj) { | ||
if (pair.second.is<std::string>()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, check and report error
cpp/function_calling.cc
Outdated
const std::string& message_body | ||
) { | ||
std::vector<std::pair<std::string, std::unordered_map<std::string, std::string>>> tool_calls; | ||
size_t i = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use more meaningful var names, eg input_it, instead of i and j
cpp/function_calling.cc
Outdated
params[pair.first] = pair.second.get<std::string>(); | ||
} | ||
} | ||
tool_calls.emplace_back(name, params); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tool_calls.emplace_back(name, params); | |
tool_calls.emplace_back(name, std::move(params)); |
cpp/pybind/pybind.cc
Outdated
@@ -86,7 +86,8 @@ PYBIND11_MODULE(xgrammar_bindings, m) { | |||
bool>(&JSONSchemaToEBNF) | |||
) | |||
.def("_regex_to_ebnf", &RegexToEBNF) | |||
.def("_get_masked_tokens_from_bitmask", &Matcher_DebugGetMaskedTokensFromBitmask); | |||
.def("_get_masked_tokens_from_bitmask", &Matcher_DebugGetMaskedTokensFromBitmask) | |||
.def("_parse_message", &parse_message, py::arg("input"), py::arg("ignore_error") = false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dont need to specify args. Just set default args in python bindings
No description provided.