Merge Dockerfiles (#316)
* remove olmo dockerfile, update deps, update docs

* remove old image from docs and scripts

---------

Co-authored-by: Hamish Ivison <[email protected]>
hamishivi and Hamish Ivison authored Aug 30, 2024
1 parent ee5ecd6 commit 7e49962
Showing 9 changed files with 13 additions and 257 deletions.
80 changes: 0 additions & 80 deletions .github/workflows/push-image-olmo.yml

This file was deleted.

4 changes: 2 additions & 2 deletions Dockerfile
@@ -87,9 +87,9 @@ ENV HF_HUB_ENABLE_HF_TRANSFER=1
COPY requirements.txt .
RUN pip install --upgrade pip "setuptools<70.0.0" wheel
# TODO, unpin setuptools when this issue in flash attention is resolved
-RUN pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu121
+RUN pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu121
RUN pip install packaging
-RUN pip install flash-attn==2.5.8 --no-build-isolation
+RUN pip install flash-attn==2.6.3 --no-build-isolation
RUN pip install -r requirements.txt

# NLTK download
113 changes: 0 additions & 113 deletions Dockerfile.olmo

This file was deleted.

10 changes: 2 additions & 8 deletions README.md
@@ -63,11 +63,6 @@ beaker image create open_instruct -n open_instruct -w ai2/$(whoami)
If you are internally at AI2, you can use this pre-built beaker image `hamishivi/open-instruct-eval` (most recent version [here](https://beaker.org/im/01J2CKY81A6N1WG5QS08Y3WNM5/details)). For finetuning, you can use `hamishivi/open-instruct-public` (most recent version [here](https://beaker.org/im/01J2CQFX7076PDHZJR2GB0C3A9/details)). I will try to update these periodically.


-**Important for OLMo users:** Note that due to version conflicts between deepspeed and vLLM, we cannot support OLMo inference and deepspeed within the same image (this will be fixed once deepspeed allows pydantic >= 2). To build a docker image suitable for inference/evaluation for OLMo, use:
-```bash
-docker build --build-arg CUDA=12.1.0 --build-arg TARGET=cudnn8-devel --build-arg DIST=ubuntu20.04 --build-arg REQUIRE=requirements-olmo.txt -f Dockerfile.olmo . -t <your tag here>
-```

For training, you can use the previous image.

### Developing
@@ -89,8 +84,7 @@ make quality
├── open_instruct/ <- Source code (flat)
├── quantize/ <- Scripts for quantization
├── scripts/ <- Core training and evaluation scripts
-├── Dockerfile <- Main Dockerfile
-└── Dockerfile.olmo <- Dockerfile for OLMo users (version conflict currently.)
+└── Dockerfile <- Dockerfile
```

## Training
@@ -203,7 +197,7 @@ python scripts/submit_eval_jobs.py \
--is_tuned --workspace tulu-3-results \
--preemptible \
--use_hf_tokenizer_template \
-    --beaker_image nathanl/open_instruct_olmo_auto \
+    --beaker_image nathanl/open_instruct_auto \
--upload_to_hf allenai/tulu-3-evals \
--run_oe_eval_experiments
```
2 changes: 1 addition & 1 deletion docs/safety-eval/safety.md
@@ -14,7 +14,7 @@ python scripts/submit_eval_jobs.py \
--is_tuned --workspace tulu-3-results \
--preemptible \
--use_hf_tokenizer_template \
-    --beaker_image nathanl/open_instruct_olmo_auto \
+    --beaker_image nathanl/open_instruct_auto \
--upload_to_hf allenai/tulu-3-evals \
--run_oe_eval_experiments \
--run_safety_evaluations
2 changes: 1 addition & 1 deletion open_instruct/utils.py
@@ -668,7 +668,7 @@ def submit_beaker_eval_jobs(
location: str,
hf_repo_revision: str = "",
workspace: str = "tulu-3-results",
-    beaker_image: str = "nathanl/open_instruct_olmo_auto",
+    beaker_image: str = "nathanl/open_instruct_auto",
upload_to_hf: str = "allenai/tulu-3-evals",
) -> None:
command = f"""
41 changes: 0 additions & 41 deletions requirements-olmo.txt

This file was deleted.

12 changes: 4 additions & 8 deletions requirements.txt
@@ -1,10 +1,10 @@
# TODO When updating flash-attn or torch in the future, make sure to update the version in the Dockerfile
-torch<=2.3.0
+torch==2.4.0
scipy
packaging
sentencepiece
datasets
-deepspeed==0.14.4
+deepspeed==0.15.0
accelerate==0.31.0
peft>=0.11.1
bitsandbytes>=0.41.1
@@ -22,16 +22,12 @@ termcolor
jsonlines
unidic-lite
einops
-flash-attn==2.5.8 # should really only be in dockerfile. Local env often doesn't have GPUs
+flash-attn==2.6.3 # should really only be in dockerfile. Local env often doesn't have GPUs
fire
alpaca-eval==0.6.2
# for human eval web app
flask
-# Newer vLLM requires pydantic >= 2, but deepspeed requires pydantic < 2
-# if we are not using olmo models, this is fine.
-# once https://github.com/microsoft/DeepSpeed/pull/5167 is merged, we can
-# update vLLM and remove the need for a separate olmo inference file
-vllm>=0.4.1
+vllm>=0.5.4 # for llama 3 + olmo compat.
openpyxl
# for ifeval
nltk==3.8.1
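The TODO at the top of requirements.txt asks that the torch and flash-attn versions stay in sync with the Dockerfile. A minimal sketch (not part of this commit) of how one might extract the exact `==` pins from requirements-style lines to compare against the Dockerfile:

```python
import re

def parse_pins(lines):
    """Collect exact '==' version pins, ignoring comments and range specifiers like '>='."""
    pins = {}
    for line in lines:
        line = line.split("#")[0].strip()  # drop trailing comments
        match = re.match(r"^([A-Za-z0-9][A-Za-z0-9._-]*)==([0-9][\w.]*)$", line)
        if match:
            pins[match.group(1)] = match.group(2)
    return pins

# Pins as they appear in this commit's requirements.txt
reqs = [
    "torch==2.4.0",
    "deepspeed==0.15.0",
    "flash-attn==2.6.3  # should really only be in dockerfile",
    "vllm>=0.5.4  # for llama 3 + olmo compat.",
]
print(parse_pins(reqs))
# {'torch': '2.4.0', 'deepspeed': '0.15.0', 'flash-attn': '2.6.3'}
```

Range specifiers such as `vllm>=0.5.4` are deliberately skipped, since only exact pins need to match the Dockerfile.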
6 changes: 3 additions & 3 deletions scripts/README.md
@@ -61,11 +61,11 @@ beaker secret write -w ai2/tulu-2-improvements "${beaker_whoami}_HF_TOKEN" xxxx
1. `submit_eval_jobs.py`: Submit eval jobs for tasks in `scripts/evals/`. For example, llama 3 tulu 2 and upload to the tulu-3 eval database.
```bash
# submit evals on a model in beaker dataset
-python scripts/submit_eval_jobs.py --model_name llama_31_tulu_2_8b --location 01J4MGRSS3FM1J4E6XSH3459DK --is_tuned --workspace tulu-3-results --preemptible --use_hf_tokenizer_template --beaker_image nathanl/open_instruct_olmo_auto --upload_to_hf allenai/tulu-3-evals
+python scripts/submit_eval_jobs.py --model_name llama_31_tulu_2_8b --location 01J4MGRSS3FM1J4E6XSH3459DK --is_tuned --workspace tulu-3-results --preemptible --use_hf_tokenizer_template --beaker_image nathanl/open_instruct_auto --upload_to_hf allenai/tulu-3-evals

# submit evals on a model in huggingface; note you need to 1) prepend the model name with `hf-` and 2) replace `--location` with the hf repo id
-python scripts/submit_eval_jobs.py --model_name hf-llama_31_tulu_2_8b --location allenai/llama-3-tulu-2-8b --is_tuned --workspace tulu-3-results --preemptible --use_hf_tokenizer_template --beaker_image nathanl/open_instruct_olmo_auto --upload_to_hf allenai/tulu-3-evals
-python scripts/submit_eval_jobs.py --model_name hf-llama_31_tulu_2_8b --location vwxyzjn/online_dpo_tulu_2 --is_tuned --workspace tulu-3-results --preemptible --use_hf_tokenizer_template --beaker_image nathanl/open_instruct_olmo_auto --upload_to_hf allenai/tulu-3-evals
+python scripts/submit_eval_jobs.py --model_name hf-llama_31_tulu_2_8b --location allenai/llama-3-tulu-2-8b --is_tuned --workspace tulu-3-results --preemptible --use_hf_tokenizer_template --beaker_image nathanl/open_instruct_auto --upload_to_hf allenai/tulu-3-evals
+python scripts/submit_eval_jobs.py --model_name hf-llama_31_tulu_2_8b --location vwxyzjn/online_dpo_tulu_2 --is_tuned --workspace tulu-3-results --preemptible --use_hf_tokenizer_template --beaker_image nathanl/open_instruct_auto --upload_to_hf allenai/tulu-3-evals


python scripts/submit_eval_jobs.py --model_name hf-online-dpo-llama-tulu2-longer --beaker_image costah/open_instruct_test --location vwxyzjn/online_dpo_vllm__allenai_llama-3-tulu-2-8b --hf_revision online_dpo_vllm__1__1724038538 --is_tuned --workspace tulu-3-results --preemptible --use_hf_tokenizer_template --upload_to_hf allenai/tulu-3-evals
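The scripts/README note above says that Hugging Face-hosted models need `hf-` prepended to the model name and the repo id passed as `--location`. A hypothetical helper (the function name and flag defaults are assumptions for illustration, not code from the repo) sketching that convention:

```python
def eval_job_flags(model_name, location, from_hf=False,
                   beaker_image="nathanl/open_instruct_auto"):
    """Build submit_eval_jobs.py flags; HF models get an 'hf-' prefix
    and their repo id as --location, per the README convention."""
    name = f"hf-{model_name}" if from_hf and not model_name.startswith("hf-") else model_name
    return [
        "--model_name", name,
        "--location", location,
        "--beaker_image", beaker_image,
        "--upload_to_hf", "allenai/tulu-3-evals",
    ]

print(eval_job_flags("llama_31_tulu_2_8b", "allenai/llama-3-tulu-2-8b", from_hf=True))
```

For a Beaker-dataset model, `from_hf` stays false and `location` is the dataset id, matching the first example above.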
