PDFs: Switch to DomPDF & use Noto Sans (plus KR, SC, and Arabic) for Invoices & Docs. #1552

Open
dd32 wants to merge 20 commits into production from utf8-invoices

dd32 commented Oct 31, 2025

Fixes #1433

It seems that the only things missing for non-western characters to work on Invoices are that a) the system doesn't have any Chinese/Korean/Japanese fonts installed and b) the template isn't specifying a webfont that supports these languages.

Screenshots

Attempting to render #1433 (comment)

[Screenshots: Before (2025-10-31, 5:09:18 pm) | After (2025-10-31, 5:01:08 pm)]

The Simplified Chinese still has a few missing characters, but it's much better than before. I'm not sure why these are missing, as the font appears to support those Unicode characters. It may be something in how I'm testing it.

dd32 added the [Type] Bug, [Component] Docs (Generate official docs for WordCamps: visa letters, sponsorship agreements), and [Component] CampTix Attendee Invoices labels Oct 31, 2025

dd32 commented Oct 31, 2025

Noting that there's also Noto Sans Arabic. Perhaps we should switch to Noto Sans for western characters too.


dd32 commented Oct 31, 2025

Maybe Arabic works fine as-is with this PR (those Chinese characters are still annoying me, which is why this isn't being merged at 6pm on a Friday):

Screenshot 2025-10-31 at 5 31 37 pm

<meta charset="UTF-8">
<link href="http://fonts.googleapis.com/css?family=Open+Sans:300,600,700" rel="stylesheet" type="text/css" />
<?php // phpcs:ignore WordPress.WP.EnqueuedResources.NonEnqueuedStylesheet -- This is rendered by wkhtmltopdf, so additional libraries are included directly. ?>
<link href="https://fonts.googleapis.com/css2?family=Open+Sans:wght@300..700&family=Noto+Sans+KR:wght@300..700&family=Noto+Sans+SC:wght@300..700" rel="stylesheet" type="text/css" />

@dd32

Thanks for your PR to resolve my issue!

How about swapping the font resolution order?

/* Current... */
font-family: 'Noto Sans KR', 'Noto Sans SC', 'Open Sans', sans-serif;

/* to this */
font-family: 'Noto Sans SC', 'Noto Sans KR', 'Open Sans', sans-serif;

Noto Sans KR might cover some of the same code points as the Chinese font.

Note: I cannot reproduce the same missing characters (we call them "tofu").


ryelle commented Oct 31, 2025

It looks like this isn't loading Open Sans at all now, because the Notos have latin characters, so it never falls back to Open Sans. I'd recommend updating the font stack to 'Noto Sans', 'Noto Sans SC', 'Noto Sans KR', sans-serif.

… Actually, that won't work either. I don't think wkhtmltopdf supports multiple fonts. If you do that, everything is "tofu". The tofu in Simplified Chinese goes away if you put Noto Sans SC first, but then Korean breaks. So it appears to only load the first font.

[Screenshots: Noto Sans first | Noto Sans SC first | Noto Sans KR first]

Maybe switching to something like dompdf would fix that; they have addressed CJK font support in this issue.


fumikito commented Nov 1, 2025

I tried this on my local machine, but wkhtmltopdf failed to download the Google fonts because of a network error.

QSslSocket: cannot resolve SSL_load_error_strings
Exit with code 1 due to network error: UnknownNetworkError

I also tried another approach — installing fonts-noto-cjk inside the Docker container — and that resolved the garbled character issue.

sudo apt install fonts-noto-cjk
Screenshot 2025-11-01 17 11 30

The complete diff of the Dockerfile is below:

before__Dockerfile.php-fpm | Dockerfile.php-fpm – 2 Additions.pdf


dd32 commented Nov 4, 2025

> It looks like this isn't loading Open Sans at all now, because the Notos have latin characters,

I thought that might be the case...

> … Actually, that won't work either. I don't think wkhtmltopdf supports multiple fonts.

This too, except that as far as I could tell it definitely does because KR doesn't work without the KR font, and Chinese didn't work without the SC font, but with both they did load...

> Maybe switching to something like dompdf would fix that,

I did consider that, but I was hopeful that a quick font-change here would suffice :)

> The complete diff of the Dockerfile is below:

@fumikito Unfortunately we don't use the included Dockerfile in production, so it's a little more involved than just the change here :(
While I can get systems alterations made, I was thinking if that's the case, might be worth moving to one of the PHP-based projects linked above instead.


ryelle commented Nov 4, 2025

> This too, except that as far as I could tell it definitely does because KR doesn't work without the KR font, and Chinese didn't work without the SC font, but with both they did load...

Both SC and KR fonts have Latin characters included, and KR probably has a decent set of glyphs for Chinese characters (they're used in Korean, but rarely). That's why most of the SC text loads when KR font is first. But the SC font, while it includes Latin, does not include the Hangul (Korean characters) — so the KR text does not load.

This explains the random "tofu" in SC — this is a screenshot of the KR font, and the characters in blue are Noto Sans KR, the rest are system fallbacks (not included in the font).

Screenshot 2025-11-03 at 8 33 01 PM

(I'm not sure why 区 works when it shouldn't, maybe a font version difference)
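
The coverage argument above can be modelled in a few lines. This is only a sketch in Python under simplified assumptions: the coverage tables are hypothetical (real fonts cover far more code points, and the Han subset assigned to Noto Sans KR here is invented for illustration), and `resolve` is not how wkhtmltopdf or any browser actually selects fonts. It just contrasts per-character fallback with a renderer that only consults the first font in the stack.

```python
from typing import List, Optional

# Simplified, assumed coverage tables (not real font data).
LATIN = range(0x0020, 0x007F)        # printable ASCII
HAN = range(0x4E00, 0xA000)          # CJK Unified Ideographs block
HANGUL = range(0xAC00, 0xD7A4)       # Hangul Syllables
PARTIAL_HAN = range(0x4E00, 0x6000)  # hypothetical Han subset shipped in the KR font

COVERAGE = {
    "Noto Sans SC": [LATIN, HAN],
    "Noto Sans KR": [LATIN, HANGUL, PARTIAL_HAN],
    "Open Sans": [LATIN],
}


def covers(font: str, ch: str) -> bool:
    """Return True if the (assumed) coverage table for `font` includes `ch`."""
    return any(ord(ch) in block for block in COVERAGE[font])


def resolve(text: str, stack: List[str], first_only: bool = False) -> List[Optional[str]]:
    """For each character, return the font that would render it, or None for tofu.

    With first_only=True, only the first font in the stack is consulted,
    which models the behaviour observed in the screenshots.
    """
    candidates = stack[:1] if first_only else stack
    return [next((f for f in candidates if covers(f, ch)), None) for ch in text]


# Latin "A", Han U+533A, Han U+8BAE, Hangul U+D55C
SAMPLE = "A\u533a\u8bae\ud55c"
```

With per-character fallback and the stack `['Noto Sans KR', 'Noto Sans SC', 'Open Sans']`, every character in `SAMPLE` resolves to one of the Noto fonts and Open Sans is never reached, because both Notos already include Latin. With `first_only=True`, putting KR first leaves U+8BAE as tofu, and putting SC first leaves the Hangul syllable as tofu.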


dd32 commented Nov 4, 2025

🤷 I guess I was very wrong, or improperly tested :)

Let's try dompdf, and if that doesn't work maybe we can look at these alternatives.


ryelle commented Nov 4, 2025

After the subsetting work, I'm pretty familiar with these fonts 😅

dd32 changed the title from "Load Noto Sans KR and Noto Sans SC for Invoices & Docs." to "PDFs: Switch to DomPDF & use Noto Sans (plus KR, SC, and Arabic) for Invoices & Docs." Nov 4, 2025
dd32 changed the title from "PDFs: Switch to DomPDF & use Noto Sans (plus KR, SC, and Arabic) for Invoices & Docs." to "PDFs: Switch to DomPDF & use Noto Sans (plus KR, SC, JP, and Arabic) for Invoices & Docs." Nov 4, 2025
dd32 changed the title from "PDFs: Switch to DomPDF & use Noto Sans (plus KR, SC, JP, and Arabic) for Invoices & Docs." to "PDFs: Switch to DomPDF & use Noto Sans (plus KR, SC, and Arabic) for Invoices & Docs." Nov 4, 2025

dd32 commented Nov 4, 2025

With the PR as it is...

Screenshot 2025-11-04 at 3 48 44 pm

I've found that with DomPDF anything non-ASCII really needs a font loaded, as the default font will render everything outside western Latin as "?".

DomPDF does bundle the DejaVu Sans font, but that didn't seem to provide any benefits here.

Initially I also had the JP and Devanagari fonts loaded, but I realised I had done so incorrectly. Those fonts might still be worth loading.

DomPDF relies on a local font-cache path, so webfonts aren't fetched on every PDF render, only on the first one. The downside is that every declared font is cached locally even if the document doesn't use it.
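
The fetch-once behaviour described here follows a common caching pattern, sketched below in Python. This is not DomPDF's implementation: `cached_font`, the `.font` suffix, the hash-based file name, and the injected `fetch` callable are all hypothetical names chosen for illustration.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_font(url: str, cache_dir: Path, fetch: Callable[[str], bytes]) -> Path:
    """Return a local path for the font at `url`, downloading it only once.

    `fetch` is an injected, hypothetical downloader so the sketch stays
    self-contained; a real renderer would perform an HTTP request here.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Key the cache entry on the URL so repeat renders hit the same file.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".font"
    path = cache_dir / name
    if not path.exists():
        # Only the first render pays the download cost.
        path.write_bytes(fetch(url))
    return path
```

Calling `cached_font` twice with the same URL invokes `fetch` only on the first call. It also mirrors the downside noted above: a font is written to the cache as soon as it is requested, whether or not any glyph from it ends up in the document.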


dd32 commented Nov 17, 2025

Took forever to figure out why this wasn't working with invoices. It turns out that if an embedded image is broken (for example, a 404), DomPDF seems to enter an endless loop. It seems that the broken-image fallback is itself broken.
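
One possible guard against this, sketched in Python, is to drop unreachable images from the HTML before it reaches the renderer. This is an assumption about how the problem could be sidestepped, not what this PR does. `drop_broken_images` and the injected `is_reachable` predicate are hypothetical, and a regular expression is only a rough stand-in for a real HTML parser (it would also match attributes such as `data-src`).

```python
import re
from typing import Callable

# Matches an <img> tag and captures the quoted value of its src attribute.
IMG_TAG = re.compile(
    r"""<img\b[^>]*?\bsrc\s*=\s*(["'])(.*?)\1[^>]*>""",
    re.IGNORECASE | re.DOTALL,
)


def drop_broken_images(html: str, is_reachable: Callable[[str], bool]) -> str:
    """Remove <img> tags whose src fails the `is_reachable` check.

    `is_reachable` is injected so the sketch has no network dependency;
    in practice it might issue a request and check for a 200 response.
    """
    def keep_or_drop(match: "re.Match[str]") -> str:
        src = match.group(2)
        return match.group(0) if is_reachable(src) else ""

    return IMG_TAG.sub(keep_or_drop, html)
```

Pre-filtering like this adds a network check per image, so it would add to the render time that the profiling concern raised in this comment is about.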

Additionally, before merging, some performance profiling should be done to figure out whether this is going to be too slow.
For Docs this is fine, since they're generated interactively. For tickets, this happens during the checkout flow.

For invoices, I suspect the better solution will be to remove invoice generation from the checkout flow and instead make the invoice downloadable from the successful-purchase landing page (à la #1555).

Labels

[Component] CampTix Attendee Invoices [Component] Docs Generate official docs for WordCamps (visa letters, sponsorship agreements) [Type] Bug

Development

Successfully merging this pull request may close these issues.

Invoice PDF drops non-western characters

3 participants