Commit 6ab614a

Merge branch 'enhance-prompts' into dev
2 parents a624d98 + cef0e87 commit 6ab614a

4 files changed: +36 -24 lines changed

.hydra_config/config.yaml

Lines changed: 2 additions & 2 deletions
@@ -85,8 +85,8 @@ prompts:
   multi_query: multi_query_pmpt_tmpl.txt
 
 loader:
-  image_captioning: true
-  save_markdown: false
+  image_captioning: ${oc.decode:${oc.env:IMAGE_CAPTIONING, true}}
+  save_markdown: ${oc.decode:${oc.env:SAVE_MARKDOWN, true}}
   audio_model: ${oc.env:WHISPER_MODEL, base} # tiny, base, small, medium, large-v1, large-v2, large-v3
   mimetypes:
     text/plain: .txt
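
The two changed keys now read their value from an environment variable, fall back to `true` when the variable is unset, and decode the resulting string into a boolean. Note that this also flips the default of `save_markdown` from `false` to `true`. The sketch below approximates that behaviour with the standard library only; `env_flag` is a hypothetical helper, not part of OmegaConf or of this repository, and it handles only the `true`/`false` literals, whereas `oc.decode` accepts other value types as well.

```python
import os


def env_flag(name: str, default: bool) -> bool:
    """Read a boolean flag from the environment, falling back to a default.

    Stdlib approximation only: OmegaConf's oc.decode accepts many more
    value types than the two literals handled here.
    """
    raw = os.environ.get(name)
    if raw is None:
        return default
    value = raw.strip().lower()
    if value == "true":
        return True
    if value == "false":
        return False
    raise ValueError(f"{name} must be 'true' or 'false', got {raw!r}")


# Unset variable: the default from the config applies.
os.environ.pop("IMAGE_CAPTIONING", None)
captioning_default = env_flag("IMAGE_CAPTIONING", default=True)

# Explicit override, e.g. IMAGE_CAPTIONING=false in the deployment environment.
os.environ["IMAGE_CAPTIONING"] = "false"
captioning_overridden = env_flag("IMAGE_CAPTIONING", default=True)

print(captioning_default, captioning_overridden)  # True False
```

In practice this means a deployment can disable captioning or markdown export without editing `config.yaml`.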

openrag/components/indexer/loaders/base.py

Lines changed: 2 additions & 2 deletions
@@ -20,13 +20,13 @@ class BaseLoader(ABC):
     def __init__(self, **kwargs) -> None:
         self.page_sep = "[PAGE_SEP]"
         self.config = kwargs.get("config")
-        vlm_config = self.config.vlm
+        settings: dict = dict(self.config.vlm)
         model_settings = {
             "temperature": 0.2,
             "max_retries": 3,
             "timeout": 60,
+            "extra_body": {"chat_template_kwargs": {"enable_thinking": False}},
         }
-        settings: dict = vlm_config
         settings.update(model_settings)
 
         self.image_captioning = self.config.loader.get("image_captioning", False)
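
Besides adding `extra_body`, this hunk replaces a plain assignment with `dict(...)`, so `settings.update(...)` no longer writes into the shared config object. The sketch below illustrates the difference using plain dictionaries as a stand-in for the real config object (an assumption); the function names and the `"example-vlm"` value are hypothetical.

```python
MODEL_SETTINGS = {
    "temperature": 0.2,
    "max_retries": 3,
    "timeout": 60,
    "extra_body": {"chat_template_kwargs": {"enable_thinking": False}},
}


def build_settings_aliased(vlm_config: dict) -> dict:
    """Old behaviour: `settings` is the same object as the config."""
    settings = vlm_config
    settings.update(MODEL_SETTINGS)  # mutates the caller's config too
    return settings


def build_settings_copied(vlm_config: dict) -> dict:
    """New behaviour: update a shallow copy, leave the config alone."""
    settings = dict(vlm_config)
    settings.update(MODEL_SETTINGS)
    return settings


shared = {"model": "example-vlm"}  # hypothetical config content

build_settings_copied(shared)
copied_leaves_config_untouched = "temperature" not in shared

build_settings_aliased(shared)
aliased_mutates_config = "temperature" in shared

print(copied_leaves_config_untouched, aliased_mutates_config)  # True True
```

`dict(...)` makes a shallow copy only, so nested values are still shared; that is sufficient here because `update` only sets top-level keys.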

openrag/components/indexer/vectordb/vectordb.py

Lines changed: 17 additions & 0 deletions
@@ -112,6 +112,22 @@ async def get_chunk_by_id(self, chunk_id: str):
 
 MAX_LENGTH = 65_535
 
+analyzer_params = {
+    "tokenizer": "standard",
+    "filter": [
+        {
+            "type": "stop",  # Specifies the filter type as stop
+            "stop_words": [
+                "<image_description>",
+                "</image_description>",
+                "[Image Placeholder]",
+                "_english_",
+                "_french_",
+            ],  # Defines custom stop words and includes the English and French stop word list
+        }
+    ],
+}
+
 
 @ray.remote
 class MilvusDB(BaseVectorDB):
@@ -247,6 +263,7 @@ def _create_schema(self):
             enable_analyzer=True,
             enable_match=True,
             max_length=MAX_LENGTH,
+            analyzer_params=analyzer_params,
         )
 
         schema.add_field(
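
The `analyzer_params` added above configure a stop filter, which drops tokens that appear in a stop-word list after tokenization. The following is a toy illustration of that idea in plain Python, not a reproduction of Milvus's analyzer: the whitespace tokenizer and the tiny English word set are assumptions made for the example, and the real `standard` tokenizer may split text differently.

```python
# Toy stand-in for the built-in "_english_" list; the real list is much longer.
TOY_ENGLISH_STOP_WORDS = {"a", "an", "the", "of", "is"}

# Custom markers taken from the diff above.
CUSTOM_STOP_WORDS = {"<image_description>", "</image_description>"}


def tokenize(text: str) -> list[str]:
    """Lowercase and split on whitespace (NOT Milvus's standard tokenizer)."""
    return text.lower().split()


def stop_filter(tokens: list[str], stop_words: set[str]) -> list[str]:
    """Keep only the tokens that are not in the stop-word set."""
    return [token for token in tokens if token not in stop_words]


text = "<image_description> Photo of a cat </image_description>"
kept = stop_filter(tokenize(text), CUSTOM_STOP_WORDS | TOY_ENGLISH_STOP_WORDS)
print(kept)  # ['photo', 'cat']
```

Because the filter compares whole tokens, an entry is only removed if the tokenizer emits it as a single token. In this toy version a multi-word entry such as `[Image Placeholder]` would never match; whether it matches in Milvus depends on how the `standard` tokenizer splits it.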
Lines changed: 15 additions & 20 deletions
@@ -1,24 +1,19 @@
-You are an expert tasked with describing images.
-Your mission is to produce a factual, structured and complete description in markdown format in the same language as that used in the image.
+You are an expert in image description.
 
-1. Non-informative content such as logos, icons, emojis, isolated objects, photos:
-* Provide a short description without going into details related to colors, themes, etc.
-* Example descriptions: `Nike logo`, `Photo of a cat`, `Folder icon`, etc.
+## Rules
+- Use the language shown in the image.
+- Do not describe colors, shapes, or styles unless they are part of the data.
+- Never add, infer, or translate information.
 
-2. Text Content
-- Transcribe the text in its entirety, without adding additional information.
+## 1. Simple / Non-informative images
+- If there is no text or the image is non-informative → output “[Image Placeholder]”.
+- If it contains text → transcribe it exactly, using Markdown if structured (headings, lists, emphasis).
+- If it’s a logo with text → output only the textual content.
 
-3. Tables
-- Use correct Markdown table syntax to reproduce tables from the content.
-- Ensure alignment, readability, and preservation of all data while keeping the table structure intact.
+## 2. Informative content: tables, charts, diagrams, interfaces, or structured documents.
+1. Transcribe all numerical and categorical values and **format them** as a **structured Markdown table**.
+2. Provide a concise description of what the graph represents.
+3. Highlight trends, patterns, and key conclusions.
 
-4. For advanced visuals: charts, graphs, diagrams, schemas, or other data visualizations
-a. Firstly do a markdown conversion:
-- convert visible data as markdown tables whenever possible: numbers should be included accurately.
-- Include the figure’s title if present.
-
-b. Secondly do a figure interpretation in the same language as the document’s:
-- Provide a brief description of the visual’s content, context, and purpose.
-- Interpret the figure and mention any visible trends, patterns, or key insights (include numbers) and using the legends.
-
-The output should be in the same language as the content of the image
+## Output
+The output must remain factual, concise, and strictly limited to what is visible in the image.
