@dpomykala

Change Summary

The `filecontent` filter now uses a specialized extractor function if one is registered for the given format (currently PDF and DOCX). For all other formats, the `extract_txt` function is used as a fallback.

This allows the `filecontent` filter to be used for all plain-text files, regardless of the file extension.
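
The dispatch described above could look roughly like the sketch below. It is a minimal illustration, not the code in this PR: only `extract_txt` is named in the change; `EXTRACTORS`, `extract_content`, and the two stub extractors are hypothetical names, and the stubs stand in for the real PDF and DOCX parsers.

```python
from pathlib import Path
from typing import Callable, Dict


def extract_txt(path: Path) -> str:
    # Fallback: treat the file as plain text, whatever its extension.
    return path.read_text(encoding="utf-8", errors="replace")


def _extract_pdf_stub(path: Path) -> str:
    # Hypothetical stand-in for a real PDF extractor.
    return "<pdf text>"


def _extract_docx_stub(path: Path) -> str:
    # Hypothetical stand-in for a real DOCX extractor.
    return "<docx text>"


# Hypothetical registry: lowercase file suffix -> specialized extractor.
EXTRACTORS: Dict[str, Callable[[Path], str]] = {
    ".pdf": _extract_pdf_stub,
    ".docx": _extract_docx_stub,
}


def extract_content(path: Path) -> str:
    # Use the registered extractor if there is one, else fall back to extract_txt.
    extractor = EXTRACTORS.get(path.suffix.lower(), extract_txt)
    return extractor(path)
```

With this shape, a file such as `notes.log` or `config.ini` is read through `extract_txt`, while `.pdf` and `.docx` files are still routed to their dedicated extractors.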

Related issue number

Resolves #464.

Checklist

  • Tests for the changes exist and pass on CI
  • Documentation reflects the changes where applicable
  • Change is documented in CHANGELOG.md (if applicable)
  • My PR is ready to review

Use a specialized extractor function if registered for the given format
(currently PDF and DOCX). For all other formats use the `extract_txt`
function as a fallback.

Resolves tfeldmann#464.
Development

Successfully merging this pull request may close these issues.

Add support for other text formats in the file content filter