Ability to run other GPT Models

To compare GPT model versions published before and after the DecodingTrust benchmark, I would like to evaluate the following models:

GPT-3.5-Turbo / GPT-3.5-Turbo-1106 / GPT-3.5-Turbo-0125 / 
GPT-4 / GPT-4-0613 / GPT-4-1106-preview / gpt-4-turbo-2024-04-09

However, as far as I can tell, the benchmark evaluation (in this case for toxicity) uses the crfm-helm repository at version 0.2.3, which only ships with three GPT-3.5-turbo versions, all of which are deprecated and no longer usable.
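
For anyone who wants to confirm which crfm-helm version their environment actually resolves to, here is a minimal sketch. It assumes the package is installed under the distribution name `crfm-helm`. The helper `installed_version` and the constant `PINNED` are names made up for this illustration and are not part of DecodingTrust or HELM.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Version named in this issue; adjust if your checkout pins something else.
PINNED = "0.2.3"


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "crfm-helm" is assumed to be the distribution name in this environment.
    found = installed_version("crfm-helm")
    if found is None:
        print("crfm-helm is not installed in this environment")
    elif found == PINNED:
        print(f"crfm-helm {found} matches the version discussed here")
    else:
        print(f"crfm-helm {found} is installed, not {PINNED}")
```

This only reports the installed version. It does not say which model names that version accepts.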

Is there any way to use other GPT models like the ones mentioned above? I have tried upgrading the crfm-helm package, but that introduces so many changes and follow-on problems that it does not seem feasible, if it is possible at all.

I would very much appreciate any help on this matter.

Thanks a lot,

Leon

Ability to run other GPT Models #58
