[BUG] Wrong letter case in extracted information #103

Open
2 tasks done
jinghong-6 opened this issue Feb 4, 2024 · 2 comments

Comments

@jinghong-6

Is there an existing issue / discussion for this?

  • I have searched the existing issues / discussions

Is there an existing answer for this in FAQ?

  • I have searched FAQ

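Current Behavior

During information extraction, one item contains English letters that should be all uppercase, but the result has only the first letter capitalized. After several more attempts, at most the first two letters came out uppercase.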
[Screenshot]

Expected Behavior

[Screenshot]
The source file is a Word document.
The text may have been treated as an ordinary word; I don't know how to fix this.

Environment

- OS: Ubuntu 23.10
- NVIDIA Driver:
- CUDA:
- Docker Compose:
- NVIDIA GPU Memory: 16GB

QAnything logs

No response

Steps To Reproduce

No response

Anything else?

No response

@jinghong-6
Author

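After a while it worked again. I checked the data preview under the data source, and the text there is indeed the three uppercase letters SOB.
So the uppercase-to-lowercase change must be something the LLM did on its own, right? Also, it later recognized the letter O as the digit 0.
How should this be solved?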
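
One possible way to at least detect this kind of change is to compare the value in the answer against the source text shown in the data preview. The snippet below is only a minimal sketch of that idea, not part of QAnything: the function names, the look-alike mapping (digit 0 versus letter O), and the sample source string are all assumptions made for illustration.

```python
# Hypothetical post-check: does an extracted value appear verbatim in the source?
# Nothing here is a QAnything API; all names are illustrative.

# Assumed look-alike mapping: treat the digit 0 like the letter o after case folding.
CONFUSABLE = str.maketrans({"0": "o"})


def normalize(text: str) -> str:
    """Fold case and map look-alike characters so near-matches compare equal."""
    return text.casefold().translate(CONFUSABLE)


def check_against_source(value: str, source: str) -> str:
    """Classify how an extracted value relates to the source text.

    Returns "exact" if the value appears verbatim, "altered" if it only
    matches after case folding and look-alike mapping, "missing" otherwise.
    """
    if value in source:
        return "exact"
    if normalize(value) in normalize(source):
        return "altered"
    return "missing"


# Made-up source chunk containing the three uppercase letters from this issue.
source_chunk = "Code: SOB"
for candidate in ("SOB", "Sob", "S0B", "XYZ"):
    print(candidate, check_against_source(candidate, source_chunk))
```

A value reported as "altered" could then be flagged or replaced with the matching span from the source instead of trusting the model's spelling. This does not prevent the model from changing the text; it only makes the change visible.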

@jinghong-6
Author

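The current model is a 7B one. When I previously used ChatGLM3-6B + langchain-chatchat, this did not happen: with a Word file, however much the model hallucinated, it never went so far as to alter the source content. Also, would it be possible to let users choose the LLM or Embedding model themselves, and to add support for more models?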
