Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEAT added support for scorers powered by thinking models #719

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

joaodunas
Copy link
Contributor

Description

Changed the function _score_value_with_llm (pyrit/score/scorer.py) to support thinking models and any cases where the scorer model does not output only the requested json.

Tests and Documentation

Haven't made any tests as of yet, mainly due to lack of knowledge regarding how to do it and lack of time

@romanlutz
Copy link
Contributor

Thanks for this submission.

Can you elaborate on the problem? The code doesn't really tell me what you are running into.

What do you mean by "thinking"? Perhaps the reasoning steps models like o1 take?

In general, I think it's best to start with an issue and decide there whether or not a PR is needed, and what the change should be. It usually saves you time because maintainers can advise on what the change should look like. For example, making this change would break a lot of functionality in PyRIT as existing model endpoints depend on that logic.

Let me know what you think so we can figure out a solution for the problem.

@romanlutz romanlutz requested a review from rlundeen2 February 24, 2025 23:19
Copy link
Contributor

@romanlutz romanlutz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed over on Discord, let's move this to remove_markdown_json, and rename that function to something like extract_json. Very cool! A few lines describing all the cases this can handle would also be nice. And we probably want to catch the index out of bound case specifically rather than doing a bare except.

Copy link
Contributor

@romanlutz romanlutz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Can we add a few basic unit tests? Should include

Empty string
Valid json only
Valid json with other characters before (at least one case with markdown)
Valid json with other characters after
Valid Jason between other characters around it
JSON parsing failure

@@ -273,7 +273,13 @@ async def _score_value_with_llm(
try:
response_json = response.request_pieces[0].converted_value

response_json = remove_markdown_json(response_json)
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you have another version of it here? It's not obvious to me that it has different functionality. Can you elaborate?

response = response.split("}")[0] + "}"
return json.loads(response)
except (json.JSONDecodeError, IndexError):
return "Invalid JSON response: {}".format(response_msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to raise InvalidJsonException to trigger retries

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants