Can the table recognition pipeline output Markdown? #2750

Open
DietDietDiet opened this issue Dec 31, 2024 · 5 comments

@DietDietDiet

from paddlex import create_pipeline

# Build the table recognition pipeline
pipeline = create_pipeline(pipeline="table_recognition")

output = pipeline.predict("01.png")
for res in output:
    res.print()  # Print the structured prediction result
    res.save_to_img("./output/")  # Save the result as an image
    res.save_to_xlsx("./output/")  # Save the result as a spreadsheet
    res.save_to_html("./output/")  # Save the result as HTML

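With the table recognition pipeline, is it possible to get the result in Markdown format, or as a copyable string like the one written to Excel, so that it can be returned from a service? Thanks!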

@Bobholamovic
Member

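Hi, returning Markdown results is not supported for now, but many Markdown rendering engines support inline HTML. Would the HTML saved by save_to_html meet your needs? In fact, in the serving deployment solution officially provided by PaddleX, the table recognition pipeline service also returns an HTML string.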
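If a Markdown pipe table is still needed, the HTML string can be converted after the fact. The following is only a minimal sketch using the Python standard library, not something PaddleX provides: `html_table_to_markdown` and `_TableParser` are hypothetical names, the first row is assumed to be the header, and merged cells (`colspan`/`rowspan`) are not handled.

```python
from html.parser import HTMLParser


class _TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row being built, or None outside <tr>
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Collapse whitespace and escape pipes so the cell stays on one line.
            text = " ".join("".join(self._cell).split())
            self._row.append(text.replace("|", "\\|"))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_markdown(html: str) -> str:
    """Convert a simple HTML table to a Markdown pipe table.

    Assumption: the first row is treated as the header row.
    Returns an empty string if no table rows are found.
    """
    parser = _TableParser()
    parser.feed(html)
    parser.close()
    rows = [r for r in parser.rows if r]
    if not rows:
        return ""
    # Pad short rows so every row has the same number of columns.
    width = max(len(r) for r in rows)
    rows = [r + [""] * (width - len(r)) for r in rows]
    lines = ["| " + " | ".join(rows[0]) + " |",
             "| " + " | ".join(["---"] * width) + " |"]
    lines += ["| " + " | ".join(r) + " |" for r in rows[1:]]
    return "\n".join(lines)
```

For example, `html_table_to_markdown(open("./output/01.html", encoding="utf-8").read())` would convert a saved file, where the file name is only a guess at what save_to_html writes.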

@DietDietDiet
Author

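What is the structure of this output? Are all the table results under the table_result key? And is that value a list in order to support multiple input images, or because a single image can contain multiple tables?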

@Bobholamovic
Member

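pipeline.predict itself returns an iterator. The number of elements in the iterator equals the number of input images, and each element contains the tables (possibly several) found in the corresponding image.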
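The two levels of nesting described above can be sketched as follows. This is an illustration under assumptions rather than the documented result schema: each element is assumed to behave like a dict, `table_result` is the key mentioned in the question, the per-table `html` field is an assumed name, and `fake_predict` is a hypothetical stand-in so the snippet runs without PaddleX installed.

```python
from typing import Any, Dict, Iterable, Iterator, List


def collect_table_html(results: Iterable[Dict[str, Any]]) -> List[List[str]]:
    """Return one list of HTML strings per input image.

    Outer list: one entry per image (one element of the iterator).
    Inner list: one entry per table detected in that image.
    """
    per_image = []
    for res in results:
        # Assumed key from the question above; missing key means no tables.
        tables = res.get("table_result", [])
        # "html" is an assumed field name for each table entry.
        per_image.append([t.get("html", "") for t in tables])
    return per_image


def fake_predict(paths: List[str]) -> Iterator[Dict[str, Any]]:
    """Hypothetical stand-in for pipeline.predict: yields one dict per image.

    Image i is given i + 1 placeholder tables so the nesting is visible.
    """
    for i, _ in enumerate(paths):
        yield {
            "table_result": [
                {"html": f"<table>{i}-{j}</table>"} for j in range(i + 1)
            ]
        }


tables_by_image = collect_table_html(fake_predict(["a.png", "b.png"]))
```

With the real pipeline, `collect_table_html(pipeline.predict([...]))` would be the equivalent call, provided the assumed key names match the installed PaddleX version.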

@DietDietDiet
Author

Got it, thanks! One last question: is the default table_recognition configuration already the most accurate one, or is there a slower but more accurate configuration?

@Bobholamovic
Member
