FEAT added support for scorers powered by thinking models #719

joaodunas · 2025-02-17T13:35:25Z

Description

Changed the function _score_value_with_llm (pyrit/score/scorer.py) to support thinking models and any other case where the scorer model outputs more than just the requested JSON.

Tests and Documentation

Haven't made any tests as of yet, mainly due to lack of knowledge regarding how to do it and lack of time

romanlutz · 2025-02-17T15:31:40Z

Thanks for this submission.

Can you elaborate on the problem? The code doesn't really tell me what you are running into.

What do you mean by "thinking"? Perhaps the reasoning steps models like o1 take?

In general, I think it's best to start with an issue and decide there whether or not a PR is needed, and what the change should be. It usually saves you time because maintainers can advise on what the change should look like. For example, making this change would break a lot of functionality in PyRIT as existing model endpoints depend on that logic.

Let me know what you think so we can figure out a solution for the problem.

romanlutz

As discussed over on Discord, let's move this to remove_markdown_json and rename that function to something like extract_json. Very cool! A few lines describing all the cases this can handle would also be nice. And we probably want to catch the index-out-of-bounds case (IndexError) specifically rather than using a bare except.
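To make the cases concrete, here is a minimal sketch of what such an extraction helper could look like. It is not the code from this PR: the name extract_json_sketch and the choice to raise ValueError are assumptions for illustration only. It scans for each "{" and lets the standard library's json.JSONDecoder.raw_decode try to parse an object from that position, so reasoning text, markdown fences, or trailing characters around the JSON are ignored.

```python
import json


def extract_json_sketch(response_msg: str) -> dict:
    """Return the first JSON object embedded in response_msg.

    Hypothetical helper, not PyRIT's implementation. Handles:
      - JSON only
      - JSON preceded by other text (reasoning output, markdown fence)
      - JSON followed by other text
      - brace-delimited text that is not JSON appearing before the real object
    Raises ValueError if no parseable object is found (assumed behavior).
    """
    decoder = json.JSONDecoder()
    start = response_msg.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at index `start`
            # and ignores anything that follows it.
            parsed, _end = decoder.raw_decode(response_msg, start)
            return parsed
        except json.JSONDecodeError:
            # This "{" did not open valid JSON; try the next one.
            start = response_msg.find("{", start + 1)
    raise ValueError(f"No JSON object found in response: {response_msg!r}")
```

Catching json.JSONDecodeError explicitly, instead of a bare except, keeps unrelated failures visible.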


romanlutz

Looks great! Can we add a few basic unit tests? They should include:

Empty string
Valid json only
Valid json with other characters before (at least one case with markdown)
Valid json with other characters after
Valid JSON with other characters around it
JSON parsing failure
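The cases listed above could be covered roughly as follows. This is only a sketch: extract_json_stand_in is a hypothetical placeholder so the example runs on its own, and the real tests would import the renamed PyRIT function instead and assert whatever exception it actually raises.

```python
import json
import unittest

FENCE = "`" * 3  # markdown code fence, built this way to keep the example readable
EXPECTED = {"score_value": "True"}
PAYLOAD = json.dumps(EXPECTED)


def extract_json_stand_in(response_msg: str) -> dict:
    """Hypothetical stand-in: parse the span from the first '{' to the last '}'."""
    start = response_msg.find("{")
    end = response_msg.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found")
    # json.JSONDecodeError is a subclass of ValueError.
    return json.loads(response_msg[start : end + 1])


class ExtractJsonCases(unittest.TestCase):
    def test_valid_inputs(self):
        cases = {
            "json only": PAYLOAD,
            "markdown before": FENCE + "json\n" + PAYLOAD,
            "text after": PAYLOAD + "\nDone.",
            "text around": "<think>reasoning</think>\n" + PAYLOAD + "\n" + FENCE,
        }
        for name, text in cases.items():
            with self.subTest(name):
                self.assertEqual(extract_json_stand_in(text), EXPECTED)

    def test_invalid_inputs(self):
        # Empty string, text with no object, and braces around non-JSON content.
        for text in ["", "no braces here", "{not valid json}"]:
            with self.subTest(text):
                with self.assertRaises(ValueError):
                    extract_json_stand_in(text)
```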

romanlutz · 2025-03-14T01:52:49Z

pyrit/score/scorer.py

@@ -273,7 +273,13 @@ async def _score_value_with_llm(
        try:
            response_json = response.request_pieces[0].converted_value

-            response_json = remove_markdown_json(response_json)
+            try:


Why do you have another version of it here? It's not obvious to me that it has different functionality. Can you elaborate?

romanlutz · 2025-03-19T04:37:27Z

pyrit/exceptions/exceptions_helpers.py

+        response = response.split("}")[0] + "}"
+        return json.loads(response)
+    except (json.JSONDecodeError, IndexError):
+        return "Invalid JSON response: {}".format(response_msg)


This needs to raise InvalidJsonException to trigger retries; returning an error string does not.
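A small illustration of why the return value matters, under stated assumptions: retry logic that is keyed on an exception type only fires when that exception is raised. Everything below (InvalidJsonError, retry_on_invalid_json, parse_or_raise, make_scorer) is a hypothetical stand-in written for this example, not PyRIT's actual exception class or retry decorator.

```python
import functools
import json


class InvalidJsonError(Exception):
    """Hypothetical stand-in for the InvalidJsonException named in the review."""


def retry_on_invalid_json(max_attempts: int):
    """Hypothetical retry decorator: re-run the call while it raises InvalidJsonError."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except InvalidJsonError:
                    if attempt == max_attempts:
                        raise  # out of attempts, surface the error

        return wrapper

    return decorator


def parse_or_raise(response_msg: str) -> dict:
    """Raise instead of returning an error string, so the retry wrapper can react."""
    try:
        return json.loads(response_msg)
    except json.JSONDecodeError as exc:
        raise InvalidJsonError(f"Invalid JSON response: {response_msg}") from exc


def make_scorer(replies):
    """Build a fake scorer call that consumes canned model replies in order."""
    remaining = iter(replies)

    @retry_on_invalid_json(max_attempts=3)
    def score() -> dict:
        return parse_or_raise(next(remaining))

    return score
```

With this setup, make_scorer(["not json", '{"score_value": "True"}'])() succeeds on the second attempt, whereas a helper that returned "Invalid JSON response: ..." would hand that string to the caller on the first attempt.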

FEAT added support for scorers powered by thinking models

8f06d77

added try and except

44fe0ae

romanlutz requested a review from rlundeen2 February 24, 2025 23:19

Merge branch 'main' into think_scorer

6229650

joaodunas mentioned this pull request Mar 10, 2025

FEAT add support for reasoning models as scorers #774

Open

romanlutz reviewed Mar 10, 2025


joaodunas and others added 3 commits March 13, 2025 22:09

Merge branch 'Azure:main' into think_scorer

cd9a357

Changed remove_markdown_json to extract_json_from_response

5729199

Merge branch 'think_scorer' of github.com:joaodunas/PyRIT into think_…

5e84a77

…scorer

romanlutz reviewed Mar 14, 2025


Merge branch 'main' into think_scorer

5585b27

romanlutz reviewed Mar 19, 2025


romanlutz and others added 2 commits March 19, 2025 00:56

Merge branch 'main' into think_scorer

1b6a884

Merge branch 'Azure:main' into think_scorer

78961f5
