Bugfix/issue1230 > HtmlParser cannot recognize base64-encoded images - Fixed #1271

Open: 11 commits to merge into base branch master

@maayanb180 maayanb180 commented Feb 24, 2025

Description of the new Feature/Bugfix

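Fixes HtmlParser so that it recognises base64-encoded images, i.e. img tags whose src attribute is a base64 data URI (data:image/png;base64 or data:image/jpeg;base64).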

Related Issue: #1230
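
For readers unfamiliar with the format, here is a minimal sketch of how a base64 data URI can be split into its media type and decoded bytes. It is not the code from this PR: the class `DataUriSketch` and its methods `mediaTypeOf` and `decode` are hypothetical names used only for illustration, and the sketch relies solely on `java.util.Base64` from the JDK.

```java
import java.util.Base64;

// Illustrative only: hypothetical helper, not part of HtmlParser or this PR.
public class DataUriSketch {

    private static final String SCHEME = "data:";
    private static final String BASE64_MARKER = ";base64";

    /** Returns the media type (e.g. "image/png"), or null if src is not a base64 data URI. */
    public static String mediaTypeOf(String src) {
        int comma = (src == null) ? -1 : src.indexOf(',');
        if (comma < 0 || !src.startsWith(SCHEME)) {
            return null;
        }
        // Everything between "data:" and the first comma is the header.
        String header = src.substring(SCHEME.length(), comma);
        if (!header.endsWith(BASE64_MARKER)) {
            return null;
        }
        return header.substring(0, header.length() - BASE64_MARKER.length());
    }

    /** Decodes the payload after the comma, or returns null if src is not a base64 data URI. */
    public static byte[] decode(String src) {
        if (mediaTypeOf(src) == null) {
            return null;
        }
        String payload = src.substring(src.indexOf(',') + 1).trim();
        return Base64.getDecoder().decode(payload);
    }

    public static void main(String[] args) {
        // The payload here is just the 8-byte PNG signature, not a full image.
        String src = "data:image/png;base64,iVBORw0KGgo=";
        System.out.println(mediaTypeOf(src) + ", " + decode(src).length + " bytes");
        // prints "image/png, 8 bytes"
    }
}
```

A regular URL such as `https://example.com/a.png` makes both methods return null, so a caller could fall back to its normal image-loading path in that case.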

Unit-Tests for the new Feature/Bugfix

  • ImageTest.shouldReturnImageForBase64DataPNG
  • ImageTest.shouldReturnImageForBase64DataJPEG

Compatibility Issues

Your real name

Maayan Bin Noon

Testing details

Any other details about how to test the new feature or bugfix?

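// Imports below are an assumption: the com.lowagie.text packages are the ones used by OpenPDF.
// Adjust them if your version of the library uses different package names.
import com.lowagie.text.Document;
import com.lowagie.text.DocumentException;
import com.lowagie.text.html.HtmlParser;
import com.lowagie.text.pdf.PdfWriter;
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.xml.sax.InputSource;

public class Sandbox {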

    public static void main(String[] args) throws IOException {
        createFile("""
                <html><head><title></title></head><body><img alt="" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==" style="width:36pt;height:36pt" /></body></html>
                """, 1);

        createFile("""
                <html><head><title></title></head><body>
                <img src="data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAZABkAAD/2wCEABQQEBkSGScXFycyJh8mMi4mJiYmLj41NTU1NT5EQUFBQUFBREREREREREREREREREREREREREREREREREREREQBFRkZIBwgJhgYJjYmICY2RDYrKzZERERCNUJERERERERERERERERERERERERERERERERERERERERERERERERERP/AABEIAAEAAQMBIgACEQEDEQH/xABMAAEBAAAAAAAAAAAAAAAAAAAABQEBAQAAAAAAAAAAAAAAAAAABQYQAQAAAAAAAAAAAAAAAAAAAAARAQAAAAAAAAAAAAAAAAAAAAD/2gAMAwEAAhEDEQA/AJQA9Yv/2Q=="/>
                </body></html>
                """, 2);
    }

    private static void createFile(String htmlContent, int i) {
        try (Document document = new Document()) {
            PdfWriter.getInstance(document, Files.newOutputStream(Paths.get("parseBase64Image" + i + ".pdf")));
            document.open();
            HtmlParser.parse(document, new InputSource(new StringReader(htmlContent)));
        } catch (DocumentException | IOException de) {
            System.err.println(de.getMessage());
        }
    }
}

@maayanb180 maayanb180 changed the title Bugfix/issue1230 Bugfix/issue1230 > HtmlParser cannot recognise base64-encoded images - Fixed Feb 25, 2025
@maayanb180 maayanb180 changed the title Bugfix/issue1230 > HtmlParser cannot recognise base64-encoded images - Fixed Bugfix/issue1230 > HtmlParser cannot recognize base64-encoded images - Fixed Feb 25, 2025