Commit 39ed399
Feat/train tool agent (#17)
* train tool agent
* Feat/add search scrape tool (#14)
* migrate search, scrape tool into verl and add code to spin up retrieval server
* add debug code to train tool agent
* Update async_sglang_server.py
---------
Co-authored-by: nguyenhoangthuan99 <35255081+nguyenhoangthuan99@users.noreply.github.com>
* add data code (#15)
* add data code
* push reward code
* add system prompt and user prompt
* fix prompt
* fix prompt
* update to train 30b model
* fix training bug
* add reward to remove repeated tool calls and bulk tool calls
---------
Co-authored-by: bachvudinh <bachvudinh02@gmail.com>
Co-authored-by: bachvudinh <89349141+bachvudinh@users.noreply.github.com>
1 parent 63f2ffb commit 39ed399
File tree
5 files changed, +78 -9 lines changed
- examples/vllm_multiturn/config
- verl
  - trainer/config/rollout
  - utils
    - dataset
    - reward_score/jan_v2_reward