README.md: 15 additions & 6 deletions
@@ -142,13 +142,22 @@ bash ./setup_env.sh
# NOTE: Some models are gated on Hugging Face, so you will need to visit their model pages, accept the EULA, and enter your Hugging Face access token when prompted. See section "Requirements" for more details.
```

-Before running the experiments, you need to download the models and datasets.
-To download the models and datasets, run the following command:
+> Important note: Before running the experiments, you need to download the models and datasets used for the experiments.
+
+We provide a script to download the required dataset and models for our experiments. This script must be run before starting the experiments.
+You may specify models to download by passing the `models` parameter.
To warm up, we start by reproducing the result for the synthesis task with the smallest model (Gemma 2 2B) on the MBPP dataset. To avoid using busy GPUs in a shared setting, run `nvidia-smi` to check which GPUs are free. Then specify the IDs of the GPUs you want to use by setting the `CUDA_VISIBLE_DEVICES` environment variable. If you want to use GPUs 0 and 1, run the following command:
This reproduces the results for Gemma 2 2B on the synthesis task on MBPP.
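
Checking `nvidia-smi` by hand can also be scripted. The following is a minimal sketch, not part of this repository: the helper names and the 100 MiB "free" threshold are assumptions made for illustration. It treats a GPU as free when its used memory is below the threshold and builds a value for `CUDA_VISIBLE_DEVICES`.

```python
import subprocess


def parse_free_gpus(csv_text, max_used_mib=100):
    """Return IDs of GPUs whose used memory (MiB) is at most max_used_mib.

    Expects lines of the form "<index>, <memory.used>" as printed by
    nvidia-smi with --format=csv,noheader,nounits.
    The threshold is an illustrative assumption, not a repository setting.
    """
    free = []
    for line in csv_text.strip().splitlines():
        index, used = (field.strip() for field in line.split(","))
        if int(used) <= max_used_mib:
            free.append(int(index))
    return free


def query_free_gpus(max_used_mib=100):
    """Ask nvidia-smi for per-GPU memory usage and return the free GPU IDs."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,memory.used",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_free_gpus(result.stdout, max_used_mib)


def visible_devices_value(gpu_ids):
    """Format GPU IDs as a comma-separated CUDA_VISIBLE_DEVICES value."""
    return ",".join(str(gpu_id) for gpu_id in gpu_ids)
```

On a machine with NVIDIA drivers installed, `visible_devices_value(query_free_gpus()[:2])` yields a string such as `0,1` that can be exported as `CUDA_VISIBLE_DEVICES` before launching the experiment. Low memory usage is only a heuristic; a GPU may still be reserved by another user.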
The experiment should finish within approximately 4 hours on a single GPU.
-The results of the experiment (and all other results) will be stored in `experiments/main/results` in an appropriately named `jsonl` file, in this concrete example `experiments/main/results/mbpp_google_gemma-2-2b-it_s=0_t=1_synth_nc.jsonl` and `..._c.jsonl` for the unconstrained and type-constrained variants respectively.
+The results of the experiment (and all other results) will be stored in `experiments/main/results` in an appropriately named `jsonl` file. The general schema is `experiments/main/results/<subset>_<model>_s=<seed>_t=<temperature>_<task>_<constrained>.jsonl`. In this concrete example, the files are `experiments/main/results/mbpp_google_gemma-2-2b-it_s=0_t=1_synth_nc.jsonl` and `..._c.jsonl` for the unconstrained and type-constrained variants respectively.
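
To illustrate the naming schema, here is a small sketch that splits a result file name into its fields. It is not code from this repository; the regular expression assumes that `<subset>` and `<task>` contain no underscores and that `<constrained>` is either `nc` or `c`, as in the example above.

```python
import re

# Hypothetical pattern for the schema
# <subset>_<model>_s=<seed>_t=<temperature>_<task>_<constrained>.jsonl
# Assumption: subset and task contain no underscores; the model name may.
RESULT_NAME = re.compile(
    r"^(?P<subset>[^_]+)_(?P<model>.+)_s=(?P<seed>\d+)_t=(?P<temperature>[0-9.]+)"
    r"_(?P<task>[^_]+)_(?P<constrained>nc|c)\.jsonl$"
)


def parse_result_name(filename):
    """Split a result file name into a dict of its schema fields."""
    match = RESULT_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a recognized result file name: {filename!r}")
    return match.groupdict()


fields = parse_result_name("mbpp_google_gemma-2-2b-it_s=0_t=1_synth_nc.jsonl")
print(fields["model"], fields["task"], fields["constrained"])
```

All fields are returned as strings, so the seed and temperature need to be converted before numeric use.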

-> The experiment runs can be cancelled at any time, intermediate results are stored in the `results` folder. Upon restarting, the script will automatically pick up the last completed instance and continue from there. It may happen that running tasks daemonize and continue running (check `nvidia-smi`). Make sure to kill them manually before restarting.
+> The experiment runs can be cancelled at any time; intermediate results are stored in the `jsonl` files. Upon restarting, the script will automatically pick up after the last completed instance and continue from there. Running tasks may daemonize and continue running (check `nvidia-smi`). Make sure to kill them manually before restarting.
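
To check how far a cancelled run got, one can count the records already written to a result file. The sketch below is an illustration only and does not reproduce the repository's resume logic; it assumes one JSON object per line and treats an unparsable line as a partially written record from an interrupted run.

```python
import json


def count_completed(path):
    """Count the complete JSON records in a jsonl file.

    Stops at the first line that is not valid JSON, on the assumption that
    it is a partially written record left behind by a cancelled run.
    Returns 0 if the file does not exist yet.
    """
    completed = 0
    try:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue
                try:
                    json.loads(line)
                except json.JSONDecodeError:
                    break
                completed += 1
    except FileNotFoundError:
        return 0
    return completed
```

Comparing this count with the number of instances in the dataset subset gives a rough progress estimate before restarting.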

Our experiment script automatically distributes jobs over the indicated GPUs.
The script then repeatedly checks whether running jobs have completed and new GPUs are available. You will therefore see something like the following output:
@@ -290,4 +299,4 @@ The type reachability algorithm is implemented in `typesafe_llm/parser/types_ts.

The automaton for statements is defined in `typesafe_llm/automata/parser_ts.py` in the class `StatementParserState`.
It handles the constraining for valid return types.
-The automaton for the entire program is defined in `typesafe_llm/automata/parser_ts.py` in the class `ProgramParserState`.
+The automaton for the entire program is defined in `typesafe_llm/automata/parser_ts.py` in the class `ProgramParserState`.