Commit 892624f

CodGen Examples using-RAG-and-Agents (#1757)

Signed-off-by: Mustafa <[email protected]>
1 parent 8b7cb35 commit 892624f

18 files changed: +1518 −233 lines

CodeGen/README.md: 73 additions & 20 deletions
````diff
@@ -1,6 +1,6 @@
 # Code Generation Application

-Code Generation (CodeGen) Large Language Models (LLMs) are specialized AI models designed for the task of generating computer code. Such models undergo training with datasets that encompass repositories, specialized documentation, programming code, relevant web content, and other related data. They possess a deep understanding of various programming languages, coding patterns, and software development concepts. CodeGen LLMs are engineered to assist developers and programmers. When these LLMs are seamlessly integrated into the developer's Integrated Development Environment (IDE), they possess a comprehensive understanding of the coding context, which includes elements such as comments, function names, and variable names. This contextual awareness empowers them to provide more refined and contextually relevant coding suggestions.
+Code Generation (CodeGen) Large Language Models (LLMs) are specialized AI models designed for the task of generating computer code. Such models undergo training with datasets that encompass repositories, specialized documentation, programming code, relevant web content, and other related data. They possess a deep understanding of various programming languages, coding patterns, and software development concepts. CodeGen LLMs are engineered to assist developers and programmers. When these LLMs are seamlessly integrated into the developer's Integrated Development Environment (IDE), they possess a comprehensive understanding of the coding context, which includes elements such as comments, function names, and variable names. This contextual awareness empowers them to provide more refined and contextually relevant coding suggestions. Additionally, Retrieval-Augmented Generation (RAG) and Agents are part of this CodeGen example; they provide an additional layer of intelligence and adaptability, ensuring that the generated code is not only relevant but also accurate, efficient, and tailored to the specific needs of developers and programmers.

 The capabilities of CodeGen LLMs include:

````
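The RAG step introduced in the new paragraph can be illustrated with a minimal sketch: retrieve the documentation most relevant to a query and prepend it to the generation prompt. This is a toy illustration (bag-of-words similarity instead of a real embedding service; the documents and names are made up), not the OPEA implementation.

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words term-frequency vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str]) -> str:
    # Return the document most similar to the query.
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))

def build_prompt(query: str, docs: list[str]) -> str:
    # RAG: prepend the retrieved context to the generation request.
    context = retrieve(query, docs)
    return f"Context:\n{context}\n\nTask:\n{query}"

docs = [
    "todo_api.md: add_task(title) appends a task to the list",
    "auth.md: login(user, password) returns a session token",
]
prompt = build_prompt("Implement add_task for the TODO API", docs)
```

A real deployment replaces `embed` with the embedding microservice and `retrieve` with a vector-DB lookup, but the shape of the prompt assembly is the same.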
````diff
@@ -28,7 +28,7 @@ config:
     rankSpacing: 100
     curve: linear
   themeVariables:
-    fontSize: 50px
+    fontSize: 25px
 ---
 flowchart LR
     %% Colors %%
````
````diff
@@ -37,34 +37,56 @@ flowchart LR
     classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
     classDef invisible fill:transparent,stroke:transparent;
     style CodeGen-MegaService stroke:#000000
-
     %% Subgraphs %%
-    subgraph CodeGen-MegaService["CodeGen MegaService "]
+    subgraph CodeGen-MegaService["CodeGen-MegaService"]
         direction LR
-        LLM([LLM MicroService]):::blue
+        EM([Embedding<br>MicroService]):::blue
+        RET([Retrieval<br>MicroService]):::blue
+        RER([Agents]):::blue
+        LLM([LLM<br>MicroService]):::blue
     end
-    subgraph UserInterface[" User Interface "]
+    subgraph User Interface
         direction LR
-        a([User Input Query]):::orchid
-        UI([UI server<br>]):::orchid
+        a([Submit Query Tab]):::orchid
+        UI([UI server]):::orchid
+        Ingest([Manage Resources]):::orchid
     end

+    CLIP_EM{{Embedding<br>service}}
+    VDB{{Vector DB}}
+    V_RET{{Retriever<br>service}}
+    Ingest{{Ingest data}}
+    DP([Data Preparation]):::blue
+    LLM_gen{{TGI Service}}
+    GW([CodeGen GateWay]):::orange

-    LLM_gen{{LLM Service <br>}}
-    GW([CodeGen GateWay<br>]):::orange
-
+    %% Data Preparation flow
+    %% Ingest data flow
+    direction LR
+    Ingest[Ingest data] --> UI
+    UI --> DP
+    DP <-.-> CLIP_EM

     %% Questions interaction
     direction LR
     a[User Input Query] --> UI
     UI --> GW
     GW <==> CodeGen-MegaService
+    EM ==> RET
+    RET ==> RER
+    RER ==> LLM


     %% Embedding service flow
     direction LR
+    EM <-.-> CLIP_EM
+    RET <-.-> V_RET
     LLM <-.-> LLM_gen

+    direction TB
+    %% Vector DB interaction
+    V_RET <-.-> VDB
+    DP <-.-> VDB
 ```

 ## 🤖 Automated Terraform Deployment using Intel® Optimized Cloud Modules for **Terraform**
````
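Read end to end, the updated diagram says: Data Preparation ingests documents through the embedding service into the vector DB; at query time the gateway runs embedding, retrieval, agents, and finally the LLM. A minimal Python sketch of that control flow, with toy stand-ins for every service (none of this is the real microservice code):

```python
from dataclasses import dataclass, field

@dataclass
class VectorDB:
    # Stand-in for the Vector DB node: (embedding, text) rows.
    rows: list[tuple[tuple[float, ...], str]] = field(default_factory=list)

    def add(self, vec: tuple[float, ...], text: str) -> None:
        self.rows.append((vec, text))

    def nearest(self, vec: tuple[float, ...]) -> str:
        # Squared L2 distance over the toy vectors.
        return min(self.rows,
                   key=lambda r: sum((a - b) ** 2 for a, b in zip(r[0], vec)))[1]

def embedding_service(text: str) -> tuple[float, ...]:
    # Toy embedding: (length, lowercase-vowel count).
    return (float(len(text)), float(sum(c in "aeiou" for c in text)))

def ingest(db: VectorDB, docs: list[str]) -> None:
    # Data Preparation -> Embedding service -> Vector DB.
    for d in docs:
        db.add(embedding_service(d), d)

def codegen_gateway(db: VectorDB, query: str) -> str:
    vec = embedding_service(query)      # Embedding MicroService
    context = db.nearest(vec)           # Retrieval MicroService + Vector DB
    plan = f"use {context}"             # Agents (placeholder decision step)
    return f"# generated with: {plan}"  # LLM MicroService (placeholder)

db = VectorDB()
ingest(db, ["docA", "docB longer"])
out = codegen_gateway(db, "docA")
```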
````diff
@@ -94,12 +116,12 @@ Currently we support two ways of deploying ChatQnA services with docker compose:

 By default, the LLM model is set to a default value as listed below:

-| Service      | Model                                                                                   |
-| ------------ | --------------------------------------------------------------------------------------- |
-| LLM_MODEL_ID | [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) |
+| Service      | Model                                                                                     |
+| ------------ | ----------------------------------------------------------------------------------------- |
+| LLM_MODEL_ID | [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) |

-[Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) may be a gated model that requires submitting an access request through Hugging Face. You can replace it with another model.
-Change the `LLM_MODEL_ID` below for your needs, such as: [deepseek-ai/deepseek-coder-6.7b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct)
+[Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) may be a gated model that requires submitting an access request through Hugging Face. You can replace it with another model.
+Change the `LLM_MODEL_ID` below for your needs, such as: [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct), [deepseek-ai/deepseek-coder-6.7b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct)

 If you choose to use `meta-llama/CodeLlama-7b-hf` as LLM model, you will need to visit [here](https://huggingface.co/meta-llama/CodeLlama-7b-hf), click the `Expand to review and access` button to ask for model access.

````
````diff
@@ -134,22 +156,44 @@ To set up environment variables for deploying ChatQnA services, follow these ste

 #### Deploy CodeGen on Gaudi

-Find the corresponding [compose.yaml](./docker_compose/intel/hpu/gaudi/compose.yaml).
+Find the corresponding [compose.yaml](./docker_compose/intel/hpu/gaudi/compose.yaml). Users can start CodeGen with either the TGI or the vLLM service:

 ```bash
 cd GenAIExamples/CodeGen/docker_compose/intel/hpu/gaudi
-docker compose up -d
+```
+
+TGI service:
+
+```bash
+docker compose --profile codegen-gaudi-tgi up -d
+```
+
+vLLM service:
+
+```bash
+docker compose --profile codegen-gaudi-vllm up -d
 ```

 Refer to the [Gaudi Guide](./docker_compose/intel/hpu/gaudi/README.md) to build docker images from source.

 #### Deploy CodeGen on Xeon

-Find the corresponding [compose.yaml](./docker_compose/intel/cpu/xeon/compose.yaml).
+Find the corresponding [compose.yaml](./docker_compose/intel/cpu/xeon/compose.yaml). Users can start CodeGen with either the TGI or the vLLM service:

 ```bash
 cd GenAIExamples/CodeGen/docker_compose/intel/cpu/xeon
-docker compose up -d
+```
+
+TGI service:
+
+```bash
+docker compose --profile codegen-xeon-tgi up -d
+```
+
+vLLM service:
+
+```bash
+docker compose --profile codegen-xeon-vllm up -d
 ```

 Refer to the [Xeon Guide](./docker_compose/intel/cpu/xeon/README.md) for more instructions on building docker images from source.
````
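The four deployment commands above differ only in the Compose profile name, which follows a `codegen-<platform>-<backend>` pattern. A small hypothetical helper that makes the pattern explicit (the function itself is illustrative, not part of the repo):

```python
def compose_command(platform: str, backend: str) -> str:
    """Build the `docker compose` invocation for a platform/backend pair."""
    if platform not in {"gaudi", "xeon"} or backend not in {"tgi", "vllm"}:
        raise ValueError(f"unsupported combination: {platform}/{backend}")
    # Profiles follow the codegen-<platform>-<backend> naming used in this README.
    return f"docker compose --profile codegen-{platform}-{backend} up -d"
```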
````diff
@@ -170,6 +214,15 @@ Two ways of consuming CodeGen Service:
     -d '{"messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."}'
 ```

+If you want a CodeGen service that uses RAG and Agents grounded in dedicated documentation, add the `agents_flag` and `index_name` fields:
+
+```bash
+curl http://localhost:7778/v1/codegen \
+    -H "Content-Type: application/json" \
+    -d '{"agents_flag": "True", "index_name": "my_API_document", "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."}'
+```
+
 2. Access via frontend

 To access the frontend, open the following URL in your browser: http://{host_ip}:5173.
````