refactor(rag): update rag params #765
Walkthrough
This pull request refactors the RAG (Retrieval-Augmented Generation) parameters: it lowers the similarity threshold used in knowledge search and changes the chunk size and overlap in the split configuration. The changes aim to improve how data is retrieved and processed.
@@ -32,7 +32,7 @@ async def search_knowledge(
 space_id_list=[bot_id, repo_name],
 question=query,
 embedding_model_name=EmbeddingModelEnum.OPENAI,
-similarity_threshold=0.65,
+similarity_threshold=0.6,
Lowering the similarity threshold from 0.65 to 0.6 may increase the number of results returned, potentially including less relevant matches. Ensure this change aligns with the intended retrieval quality.
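To make the trade-off in this comment concrete, here is a minimal sketch of threshold filtering. The `Match` type, the `filter_by_threshold` helper, the candidate texts, and their scores are all hypothetical and do not come from this repository; the sketch also assumes results are kept when their score is greater than or equal to the threshold, which the diff does not confirm.

```python
from typing import List, NamedTuple


class Match(NamedTuple):
    """Hypothetical retrieval result: a text snippet and its similarity score."""
    text: str
    similarity: float


def filter_by_threshold(matches: List[Match], threshold: float) -> List[Match]:
    """Keep only matches whose similarity is at least `threshold` (assumed semantics)."""
    return [m for m in matches if m.similarity >= threshold]


# Made-up scores chosen so that one candidate falls between 0.6 and 0.65.
candidates = [
    Match("install guide", 0.82),
    Match("api reference", 0.67),
    Match("changelog entry", 0.62),
    Match("unrelated issue", 0.41),
]

strict = filter_by_threshold(candidates, 0.65)  # threshold before this PR
loose = filter_by_threshold(candidates, 0.6)    # threshold after this PR

print([m.text for m in strict])
print([m.text for m in loose])
```

Any candidate scoring in the band between the two thresholds is newly admitted, so checking what typically lands in that band for real queries is one way to judge whether the change helps or hurts retrieval quality.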
@@ -60,8 +60,8 @@ async def reload_repo(
 repo_name=request.repo_name, auth_token=user.access_token
 ),
 split_config=KnowledgeSplitConfig(
-chunk_size=500,
-chunk_overlap=100,
+chunk_size=1000,
Increasing the chunk size and overlap may affect memory usage and processing time. Ensure that the system can handle these changes without performance degradation.
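As a rough way to reason about the cost mentioned here, the sketch below splits a text with a plain character-based sliding window and compares chunk counts. This is an assumption for illustration only: the project's actual splitter behind `KnowledgeSplitConfig` may count tokens or split on separators. The `split_text` helper is hypothetical, and because the diff excerpt does not show the new `chunk_overlap` value, `ASSUMED_NEW_OVERLAP` is a placeholder rather than the value in the PR.

```python
from typing import List


def split_text(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split `text` into windows of `chunk_size` characters that overlap by `chunk_overlap`."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return chunks


# Hypothetical placeholder: the new overlap is not visible in the diff excerpt.
ASSUMED_NEW_OVERLAP = 200

document = "x" * 10_000  # stand-in for a 10,000-character file

old_chunks = split_text(document, chunk_size=500, chunk_overlap=100)
new_chunks = split_text(document, chunk_size=1000, chunk_overlap=ASSUMED_NEW_OVERLAP)

print(len(old_chunks), sum(len(c) for c in old_chunks))
print(len(new_chunks), sum(len(c) for c in new_chunks))
```

Under these assumptions, larger chunks mean fewer embedding calls per document but more text per call and per retrieved result, so the net effect on memory and latency depends on where the pipeline spends its time.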
Codecov Report
All modified and coverable lines are covered by tests ✅
No description provided.