[Question]: pipeline #13491

@Kyrie5e

Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (Language Policy).
  • Non-English title submissions will be closed directly (Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

Describe your problem

I want to customize a parser that converts PDF, TXT, and Word files to Markdown, then performs header-based chunking, and finally embeds the data. Is this workflow possible? Or are there other approaches to automatically convert all files to Markdown before parsing?
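
To make the header-based chunking step concrete, here is a minimal sketch in plain Python. It assumes the PDF, TXT, and Word files have already been converted to Markdown text by some upstream tool, and it leaves the embedding call out. The names `Chunk` and `chunk_markdown_by_headers` are hypothetical and are not part of any existing parser API.

```python
import re
from dataclasses import dataclass

# ATX-style headings only ("# Title" through "###### Title").
HEADING_RE = re.compile(r"^(#{1,6})[ \t]+(.+?)\s*$")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


@dataclass
class Chunk:
    path: list   # heading trail, e.g. ["Guide", "Install"]
    text: str    # body text found under that heading


def chunk_markdown_by_headers(markdown: str) -> list:
    """Split Markdown into one chunk per heading section (hypothetical helper)."""
    chunks = []
    trail = []       # stack of (level, title) for the current heading path
    body = []        # lines collected since the last heading
    in_fence = False

    def flush():
        content = "\n".join(body).strip()
        if content:
            chunks.append(Chunk([title for _, title in trail], content))
        body.clear()

    for line in markdown.splitlines():
        # Track fenced code so "#" comments inside code are not read as headings.
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            body.append(line)
            continue
        match = None if in_fence else HEADING_RE.match(line)
        if match:
            flush()  # close the section that was open before this heading
            level = len(match.group(1))
            # Drop headings at the same or a deeper level before descending.
            while trail and trail[-1][0] >= level:
                trail.pop()
            trail.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return chunks
```

Each resulting chunk keeps its heading trail, so the trail can be prepended to the text before embedding if that context is useful. Sections with no body text are skipped in this sketch.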
