CareQA env #33 #48

Merged
warner-benjamin merged 14 commits into MedARC-AI:main from Arya-Hari:main
Dec 12, 2025

@Arya-Hari
Contributor

Added environment for the CareQA dataset (#33).

@CLAassistant

CLAassistant commented Oct 11, 2025

CLA assistant check
All committers have signed the CLA.

@Arya-Hari Arya-Hari marked this pull request as ready for review October 11, 2025 10:55
Collaborator

@warner-benjamin warner-benjamin left a comment

Thanks for the PR. A few changes needed before it can be merged.

Assuming the authors don't state what their prompts are (I did a quick search and didn't find anything), we want to default to verifiers' BOXED_SYSTEM_PROMPT, and use THINK_BOXED_SYSTEM_PROMPT for reasoning models. verifiers has a boxed-format parser, extract_boxed_answer in verifiers.utils.data_utils, and verifiers.ThinkParser to extract the answers from a reasoning model. Make sure to add a use_think: bool = False flag so the user can opt into the thinking prompt and parser. You can see an example of this in #19.
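As a rough illustration of the use_think switch requested here, the sketch below picks a system prompt and an answer parser from a single boolean. It deliberately does not import verifiers so that it runs standalone: the prompt strings, extract_boxed, strip_think, EnvConfig, and build_config are all hypothetical stand-ins, not the library's actual values or API. A real environment would presumably pass BOXED_SYSTEM_PROMPT or THINK_BOXED_SYSTEM_PROMPT and the corresponding verifiers parser instead.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Placeholder prompt text (assumption): the real strings are verifiers'
# BOXED_SYSTEM_PROMPT and THINK_BOXED_SYSTEM_PROMPT.
PLACEHOLDER_BOXED_PROMPT = "Give your final answer inside \\boxed{}."
PLACEHOLDER_THINK_BOXED_PROMPT = (
    "Think step by step inside <think>...</think> tags, "
    "then give your final answer inside \\boxed{}."
)


def extract_boxed(text: str) -> str:
    # Simplified stand-in for a boxed-answer extractor: returns the contents
    # of the last boxed span, or the stripped text if none is found.
    # Nested braces are not handled.
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else text.strip()


def strip_think(text: str) -> str:
    # Simplified stand-in for a think parser: keep only what follows the
    # last closing </think> tag.
    return text.split("</think>")[-1]


@dataclass
class EnvConfig:
    system_prompt: str
    parse_answer: Callable[[str], str]


def build_config(use_think: bool = False) -> EnvConfig:
    # Reasoning models opt in to the thinking prompt and parser;
    # everything else gets the plain boxed prompt and parser.
    if use_think:
        return EnvConfig(
            PLACEHOLDER_THINK_BOXED_PROMPT,
            lambda text: extract_boxed(strip_think(text)),
        )
    return EnvConfig(PLACEHOLDER_BOXED_PROMPT, extract_boxed)
```

With use_think left at its default, a completion is parsed by the boxed extractor alone; with use_think=True, anything inside the reasoning block is discarded first, so a tentative boxed answer in the model's scratch work is not mistaken for the final one.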

The LLM as a Judge implementation is incomplete and needs to be finished.

@warner-benjamin warner-benjamin merged commit 70f7acd into MedARC-AI:main Dec 12, 2025
1 check passed