Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parser for function calling #143

Closed
wants to merge 0 commits into from
Closed

Conversation

zanderjiang
Copy link
Contributor

No description provided.

Copy link
Collaborator

@Ubospica Ubospica left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx for the contribution, please finish the Python part and tests


while (i < length) {
// <function> tags
if (i + 10 < length && message_body.substr(i, 10) == "<function=") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use std::string_view(s).substr(i, 10) to avoid allocating new memory

/******************* Parser *******************/

std::vector<std::pair<std::string, std::unordered_map<std::string, std::string>>> parse_message(
const std::string& message_body
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a param bool ignore_error, default false
if true, skip all error messages (just like now).

Otherwise, raise ValueError (ie. XGRAMMAR_CHECK or XGRAMMAR_LOG(FATAL)) at all error places (ie the "continue" places in the current code)


picojson::value v;
std::string err = picojson::parse(v, params_str);
if (!err.empty()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if error is not empty or v is not an object

continue;
}

if (v.is<picojson::object>()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we shall check error first, instead of putting long piece of code in a nested scope. So

if (!...) { raise error; }

is better than

if (...) {
    ...
} else {
    raise error;
}


i = params_end + 11;
}
// json format
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add more detailed comments; provide an example

auto name_it = obj.find("name");
auto params_it = obj.find("parameters");

if (name_it != obj.end() && params_it != obj.end() && name_it->second.is<std::string>() &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if not, report error or continue

int bracket_count = 1;
size_t j = i + 1;

while (j < length && bracket_count > 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefer for loop than while


std::unordered_map<std::string, std::string> params;
for (const auto& pair : params_obj) {
if (pair.second.is<std::string>()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, check and report error

const std::string& message_body
) {
std::vector<std::pair<std::string, std::unordered_map<std::string, std::string>>> tool_calls;
size_t i = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use more meaningful var names, eg input_it, instead of i and j

params[pair.first] = pair.second.get<std::string>();
}
}
tool_calls.emplace_back(name, params);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
tool_calls.emplace_back(name, params);
tool_calls.emplace_back(name, std::move(params));

@@ -86,7 +86,8 @@ PYBIND11_MODULE(xgrammar_bindings, m) {
bool>(&JSONSchemaToEBNF)
)
.def("_regex_to_ebnf", &RegexToEBNF)
.def("_get_masked_tokens_from_bitmask", &Matcher_DebugGetMaskedTokensFromBitmask);
.def("_get_masked_tokens_from_bitmask", &Matcher_DebugGetMaskedTokensFromBitmask)
.def("_parse_message", &parse_message, py::arg("input"), py::arg("ignore_error") = false);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dont need to specify args. Just set default args in python bindings

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants