ran OCR software on them to identify the text of individual entries
ran NER software on them (based on a trained dataset) to extract semantically relevant fields.
This code currently works from the command line for someone who has the images on their local filesystem. However, we expect it to be used by volunteers who have neither that access nor command-line skills, so we need a web-based application (built in Ruby on Rails, to be consistent with the rest of our tech stack) that can manage the process.
It should:
Allow the end user to upload image files and store them on the filesystem
Present the existing image files on the filesystem for the user to browse
Allow the user to launch the OCR/NER process from the user interface
Run the scripts mentioned above (which rely on Tesseract and the Python spaCy library), logging any output and errors
Allow the user to view progress and review quality of the process.
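As a rough illustration of the "run the scripts, logging any output and errors" requirement, here is a minimal sketch in Python of a wrapper that runs one command and appends its output, errors, and exit code to a log file. The function name `run_step` and the log layout are assumptions made for illustration; the actual OCR and NER script names and arguments are not given here, and a Rails application could do the equivalent with its own process-handling tools.

```python
import subprocess
from pathlib import Path


def run_step(command, log_path):
    """Run one pipeline step and append its output and errors to a log file.

    `command` is a list such as ["python", "some_script.py", "image.png"];
    the script name here is a placeholder, not the project's real script.
    Returns the exit code so the caller can mark the step as failed.
    """
    # capture_output collects stdout and stderr instead of printing them.
    result = subprocess.run(command, capture_output=True, text=True)

    # Append, so that successive steps build up one log per job.
    with Path(log_path).open("a", encoding="utf-8") as log:
        log.write(f"$ {' '.join(command)}\n")
        log.write(f"[stdout]\n{result.stdout}\n")
        log.write(f"[stderr]\n{result.stderr}\n")
        log.write(f"[exit code] {result.returncode}\n")

    return result.returncode
```

A job runner could call `run_step` once for the OCR step and once for the NER step, stop if either returns a non-zero exit code, and show the log file's contents on the progress page.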
To develop a tool based on the process created in the last SoC.