Make an automated regex generator given just emails #49

Open
@Divide-By-0

Description

If I attach two emails (i.e. raw .eml bodies), either two from the same template or one from an old template and one from a new template, we should be able to auto-generate a regex by:

  • detecting the maximum overlap in the raw text, and constraining those parts to match exactly
  • detecting the parts that differ and constraining each one to the correct character type (e.g. a contiguous sequence with a single @ is constrained via an email address regex, and it autodetects decimal, hex, ASCII, floats, mixes of these, etc.)
  • allowing the user to highlight which match groups they want to reveal (e.g. via zkregex.com/tool), and loosening the constraints on the rest of the unmatched text
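
To make the first two steps concrete, here is a minimal Python sketch, not a proposed implementation. It assumes whitespace tokenization and uses the standard library's `difflib.SequenceMatcher` to find the overlap; the names `CANDIDATE_CLASSES`, `classify`, `tokenize` and `generate_regex`, the list of character classes, and the sample emails are all hypothetical. Every differing segment becomes a capture group, so choosing which groups to reveal (the third step) is left to the user-facing tool.

```python
import re
from difflib import SequenceMatcher

# Hypothetical character classes, tried from most to least specific.
CANDIDATE_CLASSES = [
    r"[0-9]+",                             # decimal integer
    r"[0-9]+\.[0-9]+",                     # float
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+",   # simplified email address
    r"[A-Za-z]+",                          # letters only
    r"(?:0x)?[0-9a-fA-F]+",                # hex, optional 0x prefix
    r"[A-Za-z0-9]+",                       # alphanumeric
    r"[^\r\n]+",                           # anything on one line
]


def classify(samples):
    """Return the first candidate class that fully matches every sample."""
    for pattern in CANDIDATE_CLASSES:
        if all(re.fullmatch(pattern, s) for s in samples):
            return pattern
    return r"[\s\S]*"  # fallback: anything, including empty or multi-line


def tokenize(text):
    """Split into alternating whitespace / non-whitespace runs."""
    return re.findall(r"\s+|\S+", text)


def generate_regex(a, b):
    """Build a regex from two sample texts of the same template."""
    ta, tb = tokenize(a), tokenize(b)
    matcher = SequenceMatcher(None, ta, tb, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Shared text is matched literally.
            parts.append(re.escape("".join(ta[i1:i2])))
        else:
            # Differing text is constrained to an inferred class.
            samples = ["".join(ta[i1:i2]), "".join(tb[j1:j2])]
            parts.append("(" + classify(samples) + ")")
    return "".join(parts)


# Made-up sample bodies, for illustration only.
email_a = "From: alice@example.com\nAmount: 12.50 USD\nTx: 0x1f3a"
email_b = "From: bob@example.org\nAmount: 7.25 USD\nTx: 0x9bc4"
pattern = generate_regex(email_a, email_b)
print(pattern)
```

Two samples cannot always disambiguate types (a value like `cafe` is both letters and valid hex), so a real tool would likely need more samples or user confirmation of each inferred class.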

This will be critical for partners like zkp2p, both to support new emails and to adapt rapidly to template changes. Because this is a complete, integrated project, it carries a much larger bounty.
