From 12621cb33841f15e2d28acdb71154b6ff3e7f5f4 Mon Sep 17 00:00:00 2001
From: Tejal Patwardhan
Date: Wed, 26 Feb 2025 10:30:54 -0800
Subject: [PATCH] installation (#47)

---
 project/nanoeval/README.md | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/project/nanoeval/README.md b/project/nanoeval/README.md
index 835f1c7..9a9bd8d 100644
--- a/project/nanoeval/README.md
+++ b/project/nanoeval/README.md
@@ -2,17 +2,6 @@
 
 Simple, ergonomic, and high performance evals. We use it at OpenAI as part of our infrastructure to run Preparedness evaluations.
 
-# Installation
-
-```bash
-# Using https://github.com/astral-sh/uv (recommended)
-uv add "git+https://github.com/openai/SWELancer-Benchmark#egg=nanoeval&subdirectory=project/nanoeval"
-# Using pip
-pip install "git+https://github.com/openai/SWELancer-Benchmark#egg=nanoeval&subdirectory=project/nanoeval"
-```
-
-nanoeval is pre-release software and may have breaking changes, so it's recommended that you pin your installation to a specific commit. The uv command above will do this for you.
-
 # Principles
 
 1. **Minimal indirection.** You should be able to implement and understand an eval in 100 lines.
@@ -27,6 +16,17 @@ nanoeval is pre-release software and may have breaking changes, so it's recommen
 - `Task` - A single scoreable unit of work.
 - `Solver` - A strategy (usually involving sampling a model) to go from a task to a result that can be scored. For example, there may be different ways to prompt a model to answer a multiple-choice question (i.e. looking at logits, few-shot prompting, etc)
 
+# Installation
+
+```bash
+# Using https://github.com/astral-sh/uv (recommended)
+uv add "git+https://github.com/openai/SWELancer-Benchmark#egg=nanoeval&subdirectory=project/nanoeval"
+# Using pip
+pip install "git+https://github.com/openai/SWELancer-Benchmark#egg=nanoeval&subdirectory=project/nanoeval"
+```
+
+nanoeval is pre-release software and may have breaking changes, so it's recommended that you pin your installation to a specific commit. The uv command above will do this for you.
+
 # Running your first eval
 
 See [gpqa_api.py](nanoeval/examples/gpqa_api.py) for an implementation of GPQA using the OpenAI API in <70 lines of code.