It is necessary to upgrade the weaviate client. #8034
Comments
I found a similar discussion regarding the Weaviate database version and its support in Dify. You might find it relevant: Why doesn't Weaviate database use the latest version, still using version 1.19, and does Dify currently support higher versions of Weaviate? [1]. Currently, the Dify codebase uses Weaviate client version For Chinese full-text search, the current implementation includes configurations for a Chinese lexer and uses Jieba for text segmentation [5]. The Upgrading to Weaviate version
I agree that we should upgrade the Weaviate version. Version 1.2x supports different compression methods and gRPC calls, which can significantly improve performance.
I agree with this suggestion.
    def _default_schema(self, index_name: str) -> dict:
        return {
            "class": index_name,
            "properties": [
                {
                    "name": "text",
                    "dataType": ["text"],
                    "tokenization": "gse",  # <- added this
                }
            ],
        }

When implementing, we also need to create a UI that switches the tokenizer depending on whether the document to be registered is in a language suitable for GSE.
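As a rough sketch of the tokenizer switch described above, the choice could also be made automatically from a sample of the document text instead of (or as a default for) a UI setting. The helper names `contains_cjk`, `pick_tokenization`, and `build_schema` are hypothetical and not part of Dify; the check only looks at the basic CJK Unified Ideographs range, and the fallback value assumes Weaviate's default `word` tokenization for `text` properties.

```python
def contains_cjk(text: str) -> bool:
    """Return True if the text has at least one character in the
    basic CJK Unified Ideographs block (U+4E00 to U+9FFF)."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


def pick_tokenization(sample_text: str) -> str:
    """Hypothetical helper: choose "gse" for Chinese text, else "word"."""
    return "gse" if contains_cjk(sample_text) else "word"


def build_schema(index_name: str, sample_text: str) -> dict:
    """Build a class schema like _default_schema, with the tokenization
    chosen from a sample of the document being registered."""
    return {
        "class": index_name,
        "properties": [
            {
                "name": "text",
                "dataType": ["text"],
                "tokenization": pick_tokenization(sample_text),
            }
        ],
    }
```

Because tokenization is set on the property when the class is created, a per-document switch like this would effectively be a per-dataset decision, which is one reason a UI option may still be needed.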
https://weaviate.io/developers/weaviate/search/similarity#named-vectors
Named vectors are also needed.
Hi, @jiandanfeng. I'm Dosu, and I'm helping the Dify team manage their backlog. I'm marking this issue as stale.
Thank you for your understanding and contribution!
I'm focused on #12223.
Self Checks
1. Is this request related to a challenge you're experiencing? Tell me about your story.
The Weaviate client needs to be upgraded. The Weaviate version currently in use does not support Chinese word segmentation, so Chinese full-text search gives poor results. Weaviate should be upgraded to version 1.24, where the gse or trigram tokenization methods can be used to support Chinese word segmentation and improve Weaviate's Chinese full-text search.
weaviate tokenization link: https://weaviate.io/developers/weaviate/config-refs/schema#tokenization
weaviate client update link: https://weaviate.io/developers/weaviate/client-libraries/python/v3_v4_migration#installation
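To illustrate why trigram-style segmentation helps with Chinese, here is a small pure-Python sketch of character trigrams. It is only an illustration of the general idea, not Weaviate's implementation, and the function name `char_trigrams` is hypothetical. Whitespace-based tokenization would treat an unspaced Chinese sentence as a single token, while overlapping trigrams let a query match parts of it.

```python
def char_trigrams(text: str) -> list[str]:
    """Split text into overlapping 3-character windows, ignoring whitespace.

    Texts shorter than three characters are returned as a single token.
    """
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return []
    if len(chars) < 3:
        return ["".join(chars)]
    return ["".join(chars[i:i + 3]) for i in range(len(chars) - 2)]


document_tokens = set(char_trigrams("全文检索引擎"))
query_tokens = set(char_trigrams("检索引擎"))

# Tokens shared between the query and the document drive the keyword match.
shared = document_tokens & query_tokens
```

Here the document yields four trigrams and the query yields two, both of which also occur in the document, so a keyword search over these tokens would find it.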
2. Additional context or comments
No response
3. Can you help us with this feature?