'Flattening' Annotation text into the searchable text layer of a document #1181
petertennis asked this question in Looking for help (Unanswered)
Here is an example for reference. Notice how the table at the bottom of the document has lots of annotations, but you cannot easily search that text via Ctrl-F etc.
Hm, I understand.
I am processing a diverse set of architectural documents with the aim of making them more searchable in PDF viewers. Some have native PDF text, some don't, and many have only partial native text. So I OCR each page and add the text elements that are not already present on it (I check for overlaps with the existing text to decide this).
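For illustration, the overlap check could look roughly like the sketch below. It assumes words are tuples starting with `(x0, y0, x1, y1, text)`, which is the layout PyMuPDF returns from `page.get_text("words")`; the function names and the 0.5 threshold are hypothetical choices, not anything taken from PyMuPDF itself.

```python
def overlap_ratio(a, b):
    """Fraction of box a's area that is covered by box b.

    Boxes are (x0, y0, x1, y1) with x0 < x1 and y0 < y1.
    """
    ix = min(a[2], b[2]) - max(a[0], b[0])  # width of the intersection
    iy = min(a[3], b[3]) - max(a[1], b[1])  # height of the intersection
    if ix <= 0 or iy <= 0:
        return 0.0
    area = (a[2] - a[0]) * (a[3] - a[1])
    return (ix * iy) / area if area > 0 else 0.0


def missing_words(ocr_words, native_words, threshold=0.5):
    """Return the OCR words that no native word covers by `threshold` or more."""
    return [
        w for w in ocr_words
        if all(overlap_ratio(w[:4], n[:4]) < threshold for n in native_words)
    ]
```

Only the words returned by `missing_words` would then be written to the page, so text that already exists natively is not duplicated.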
Now I notice that some of my documents have annotations that effectively contain the text for a particular part of the page. The quality of this content is often better than my OCR results, and the annotation bounding boxes seem to line up with the visual text.
So, I would like to push this text into the natively searchable layer, ideally without removing the annotations themselves. Is there a function in PyMuPDF that does this automatically? Or any other thoughts on what I am trying to accomplish?
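To make the idea concrete, here is a minimal sketch of what I have in mind, not a built-in feature. It assumes the annotation text is stored in `annot.info["content"]` (which may not hold for every annotation type) and writes it with `page.insert_text(..., render_mode=3)`, which in PyMuPDF produces invisible but searchable text. The function name `flatten_annotation_text` and the fixed font size are hypothetical, and the placement inside the annotation rectangle is only approximate.

```python
def flatten_annotation_text(page, fontsize=8):
    """Copy each annotation's text onto the page as invisible text.

    `page` is expected to be a PyMuPDF page. The annotations themselves
    are left untouched. Returns the number of text items written.
    """
    added = 0
    # Collect the annotations first, because the page is modified in the loop.
    for annot in list(page.annots() or []):
        text = (annot.info.get("content") or "").strip()
        if not text:
            continue  # nothing usable in this annotation
        rect = annot.rect
        # Start the first baseline one font size below the top-left corner.
        start = (rect.x0, rect.y0 + fontsize)
        # render_mode=3: text is neither filled nor stroked (invisible).
        page.insert_text(start, text, fontsize=fontsize, render_mode=3)
        added += 1
    return added


# Possible usage (file names are placeholders):
#   import fitz  # PyMuPDF
#   doc = fitz.open("input.pdf")
#   for page in doc:
#       flatten_annotation_text(page)
#   doc.save("output.pdf")
```

A fixed font size will not match the visual text width, so search highlights may be offset; scaling the size to the annotation rectangle would be a natural refinement.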
Thanks for reading!