@SukkaW
Member

SukkaW commented May 31, 2025

check list

  • Add test cases for the changes.
  • Passed the CI test.

Description

I recently created fast-escape-html, the fastest HTML escaping library written in plain JavaScript. On real-world HTML it is even faster than its Rust counterpart.

hexo-util can't use fast-escape-html directly, since hexo-util needs to escape more characters (to work with the swig/mustache.js/pug template languages) and also needs to avoid double escaping.

I have adapted fast-escape-html for hexo-util, and it still yields a significant performance improvement. I have also slightly modified unescapeHTML to use some of the performance techniques from fast-escape-html.
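
To make the two extra requirements concrete (a wider character set and no double escaping), here is a minimal sketch. It is not the code in this PR: the function name escapeHtmlSketch, the exact character map, and the named-entity pattern are assumptions for illustration, and it uses a plain regex rather than the faster scanning approach. The numeric-entity limits are the ones quoted later in this thread.

```javascript
// Illustrative character map; the exact set hexo-util escapes is an assumption here.
const ENTITY_MAP = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
  '`': '&#96;',
  '/': '&#x2F;',
  '=': '&#x3D;'
};

// An "&" that already starts a complete entity (named, decimal, or hex,
// terminated by ";") is skipped via negative look-ahead, so escaping an
// already escaped string changes nothing. Every other listed character
// is always replaced.
const ESCAPE_RE =
  /&(?!(?:#\d{1,7}|#[Xx][a-fA-F0-9]{1,6}|[a-zA-Z][a-zA-Z0-9]*);)|[<>"'`/=]/g;

// Hypothetical helper name, not part of hexo-util's API.
function escapeHtmlSketch(str) {
  if (typeof str !== 'string') throw new TypeError('str must be a string');
  return str.replace(ESCAPE_RE, (ch) => ENTITY_MAP[ch]);
}
```

With this sketch, escapeHtmlSketch('a & b < c') gives 'a &amp; b &lt; c', and passing that result through the function again returns it unchanged. An "&" followed by "#0" with no closing ";" is not treated as an entity, so the "&" is escaped.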

The escapeHTML benchmark results are as follows.

TL;DR: due to the extra checks and look-ahead, the new escapeHTML will never beat my fast-escape-html, but it is 145% faster than what we currently have, and it also beats lodash.escape and escape-goat (which don't even check for double escaping and escape fewer symbols).

clk: ~3.11 GHz
cpu: Apple M2 Max
runtime: node 22.15.1 (arm64-darwin)

benchmark                   avg (min … max) p75 / p99    (min … top 1%)
------------------------------------------- -------------------------------
• skk.moe
------------------------------------------- -------------------------------
hexo-util (old)              712.23 µs/iter 707.13 µs  ▃█
                      (639.04 µs … 2.73 ms)   1.15 ms  ██▃
                    (  1.09 mb …   1.24 mb)   1.19 mb ▆███▅▂▁▁▁▁▁▁▁▁▂▁▂▁▁▁▁
                  4.79 ipc (  2.19% stalls)  98.94% L1 data cache
          2.43M cycles  11.63M instructions  35.83% retired LD/ST (  4.17M)

hexo-util (new)              303.18 µs/iter 305.17 µs ▂█▃
                    (278.54 µs … 563.63 µs) 502.21 µs ███▂
                    (396.88 kb … 872.90 kb) 568.87 kb ████▅▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
                  4.45 ipc (  1.18% stalls)  98.66% L1 data cache
          1.05M cycles   4.65M instructions  19.20% retired LD/ST (893.48k)

fast-escape-html             226.87 µs/iter 227.71 µs ▄█▃
                      (208.75 µs … 1.03 ms) 375.92 µs ███
                    (484.91 kb … 532.95 kb) 484.95 kb ████▅▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
                  4.61 ipc (  0.74% stalls)  99.53% L1 data cache
        777.10k cycles   3.58M instructions  21.25% retired LD/ST (761.79k)

escape-html                  228.08 µs/iter 230.00 µs ▃█
                    (211.75 µs … 537.58 µs) 366.29 µs ███
                    ( 75.42 kb … 769.93 kb) 484.94 kb ████▅▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁
                  4.57 ipc (  0.77% stalls)  99.45% L1 data cache
        789.15k cycles   3.60M instructions  21.07% retired LD/ST (759.37k)

@napi-rs/escape              249.93 µs/iter 251.88 µs  █
                      (226.96 µs … 1.08 ms) 566.04 µs ▃█
                    (207.75 kb … 207.77 kb) 207.77 kb ███▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
                  4.50 ipc ( 17.16% stalls)  83.34% L1 data cache
        863.18k cycles   3.88M instructions  23.64% retired LD/ST (917.79k)

html-escaper                 382.64 µs/iter 380.04 µs  █
                      (347.13 µs … 1.68 ms) 842.04 µs ▄█
                    (540.52 kb … 576.24 kb) 574.07 kb ██▇▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
                  4.21 ipc (  2.96% stalls)  97.82% L1 data cache
          1.31M cycles   5.51M instructions  34.04% retired LD/ST (  1.88M)

lodash.escape                391.07 µs/iter 391.00 µs  █
                      (353.50 µs … 1.16 ms) 862.46 µs  █
                    (540.52 kb … 576.36 kb) 574.15 kb ███▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
                  4.18 ipc (  2.89% stalls)  97.87% L1 data cache
          1.35M cycles   5.64M instructions  34.12% retired LD/ST (  1.92M)

escape-goat                  513.57 µs/iter 503.92 µs  █
                      (452.79 µs … 1.38 ms)   1.12 ms  █
                    (  1.22 mb …   1.22 mb)   1.22 mb ███▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
                  3.53 ipc (  2.86% stalls)  97.52% L1 data cache
          1.76M cycles   6.21M instructions  30.85% retired LD/ST (  1.92M)

• github.com (incognito)
------------------------------------------- -------------------------------
hexo-util (old)                2.00 ms/iter   1.94 ms  █▅
                        (1.77 ms … 4.01 ms)   3.04 ms  ██▂
                    (  2.90 mb …   2.97 mb)   2.90 mb ▇███▂▁▁▁▁▁▁▁▁▁▁▁▂▂▄▂▂
                  4.59 ipc (  1.96% stalls)  99.02% L1 data cache
          6.74M cycles  30.95M instructions  35.47% retired LD/ST ( 10.98M)

hexo-util (new)              981.66 µs/iter   1.00 ms  ▄█▇▆
                      (915.50 µs … 1.37 ms)   1.20 ms ▂████▅▂
                    (  1.20 mb …   1.49 mb)   1.20 mb ████████▇▄▂▃▄▄▂▃▁▂▁▁▁
                  4.30 ipc (  0.91% stalls)  98.74% L1 data cache
          3.36M cycles  14.41M instructions  17.00% retired LD/ST (  2.45M)

fast-escape-html             762.76 µs/iter 765.58 µs   █▅
                      (708.46 µs … 1.06 ms)   1.01 ms  ███▆
                    (  1.19 mb …   1.19 mb)   1.19 mb █████▆▂▂▁▁▁▁▁▁▂▂▃▂▁▂▁
                  4.58 ipc (  0.89% stalls)  99.17% L1 data cache
          2.63M cycles  12.06M instructions  20.31% retired LD/ST (  2.45M)

escape-html                  771.41 µs/iter 772.92 µs  ▅█▅
                      (717.83 µs … 2.70 ms)   1.01 ms ▄███▄
                    (  1.19 mb …   1.19 mb)   1.19 mb █████▇▃▂▁▁▁▁▁▁▂▂▂▂▂▂▁
                  4.55 ipc (  0.87% stalls)  99.20% L1 data cache
          2.65M cycles  12.07M instructions  20.17% retired LD/ST (  2.44M)

@napi-rs/escape              886.43 µs/iter 903.08 µs    ▃█▇
                      (799.21 µs … 1.40 ms)   1.16 ms   ▇███▇▂
                    (697.81 kb … 697.81 kb) 697.81 kb ▂▆██████▄▄▃▂▁▂▁▁▁▁▁▁▁
                  4.64 ipc ( 18.92% stalls)  82.14% L1 data cache
          3.01M cycles  13.97M instructions  23.41% retired LD/ST (  3.27M)

html-escaper                   1.13 ms/iter   1.14 ms  ▆▇█▄
                        (1.04 ms … 1.64 ms)   1.52 ms  ████
                    (  1.37 mb …   1.43 mb)   1.39 mb ▇█████▃▂▂▂▃▂▂▂▁▂▂▂▂▂▁
                  3.98 ipc (  2.54% stalls)  98.00% L1 data cache
          3.88M cycles  15.44M instructions  33.38% retired LD/ST (  5.15M)

lodash.escape                  1.15 ms/iter   1.15 ms   ▂█
                        (1.04 ms … 1.88 ms)   1.57 ms  ███▅
                    (  1.37 mb …   1.43 mb)   1.39 mb ▃████▇▃▂▁▁▁▁▁▁▁▂▂▂▂▃▁
                  3.98 ipc (  2.49% stalls)  98.05% L1 data cache
          3.94M cycles  15.66M instructions  33.62% retired LD/ST (  5.26M)

escape-goat                    1.35 ms/iter   1.33 ms   █
                        (1.17 ms … 2.15 ms)   2.00 ms  ▅██
                    (  3.82 mb …   3.82 mb)   3.82 mb ▂████▂▁▁▁▁▂▂▂▂▂▂▃▂▂▂▁
                  3.51 ipc (  3.34% stalls)  97.22% L1 data cache
          4.64M cycles  16.31M instructions  30.47% retired LD/ST (  4.97M)

• stackoverflow.com (incognito)
------------------------------------------- -------------------------------
hexo-util (old)                2.12 ms/iter   2.12 ms   █▆
                        (1.93 ms … 3.43 ms)   2.96 ms  ▅██▂
                    (  3.35 mb …   3.46 mb)   3.46 mb ▅████▃▁▁▁▁▂▃▃▂▂▂▁▁▁▁▁
                  4.45 ipc (  2.45% stalls)  98.75% L1 data cache
          7.22M cycles  32.18M instructions  35.53% retired LD/ST ( 11.43M)

hexo-util (new)              849.81 µs/iter 843.25 µs  ▂█
                      (783.04 µs … 1.26 ms)   1.19 ms  ██▆
                    (  1.00 mb …   1.44 mb)   1.39 mb ████▆▂▁▁▁▁▁▁▁▁▁▂▂▃▂▁▁
                  3.85 ipc (  0.65% stalls)  99.44% L1 data cache
          2.93M cycles  11.31M instructions  19.18% retired LD/ST (  2.17M)

fast-escape-html             668.95 µs/iter 664.88 µs  ▇█
                      (611.96 µs … 1.29 ms) 975.38 µs ▃███
                    (  1.18 mb …   1.18 mb)   1.18 mb ████▆▂▁▁▁▁▁▁▁▁▁▂▂▂▂▂▁
                  4.43 ipc (  0.78% stalls)  99.43% L1 data cache
          2.29M cycles  10.16M instructions  21.21% retired LD/ST (  2.16M)

escape-html                  683.08 µs/iter 673.71 µs  █▆
                      (622.29 µs … 2.98 ms)   1.00 ms ▃██▅
                    (  1.18 mb …   1.18 mb)   1.18 mb ████▅▂▁▁▁▁▁▁▁▁▂▂▂▂▂▁▁
                  4.35 ipc (  0.73% stalls)  99.48% L1 data cache
          2.34M cycles  10.18M instructions  21.03% retired LD/ST (  2.14M)

@napi-rs/escape              749.64 µs/iter 766.83 µs       ▇█
                      (666.42 µs … 1.36 ms) 903.63 µs     ▄████▇▃
                    (586.23 kb … 586.23 kb) 586.23 kb ▂▄▅████████▅▃▂▁▁▁▁▁▁▁
                  4.39 ipc ( 17.59% stalls)  82.46% L1 data cache
          2.53M cycles  11.11M instructions  23.38% retired LD/ST (  2.60M)

html-escaper                   1.11 ms/iter   1.12 ms   ▇█
                        (1.01 ms … 1.59 ms)   1.51 ms  ████
                    (  1.26 mb …   1.32 mb)   1.28 mb ▃█████▄▂▂▃▃▂▂▂▂▂▁▂▂▁▁
                  3.83 ipc (  2.44% stalls)  98.05% L1 data cache
          3.79M cycles  14.50M instructions  33.52% retired LD/ST (  4.86M)

lodash.escape                  1.12 ms/iter   1.12 ms  ▃█▃
                        (1.02 ms … 1.87 ms)   1.62 ms  ███▃
                    (  1.26 mb …   1.32 mb)   1.28 mb ▃████▄▂▁▁▁▁▁▁▂▂▂▂▁▁▁▁
                  3.85 ipc (  2.39% stalls)  98.12% L1 data cache
          3.83M cycles  14.74M instructions  33.74% retired LD/ST (  4.97M)

escape-goat                    1.30 ms/iter   1.26 ms  ▅█
                        (1.11 ms … 2.31 ms)   2.10 ms  ██▃
                    (  3.30 mb …   3.30 mb)   3.30 mb ▄███▅▂▁▁▁▁▁▁▁▂▂▂▂▂▂▂▁
                  3.45 ipc (  2.75% stalls)  97.68% L1 data cache
          4.42M cycles  15.25M instructions  31.28% retired LD/ST (  4.77M)

• www.google.com (incognito)
------------------------------------------- -------------------------------
hexo-util (old)                1.05 ms/iter   1.04 ms  ▇█
                      (945.83 µs … 1.88 ms)   1.80 ms  ██
                    (  1.75 mb …   1.83 mb)   1.76 mb ▅██▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▂
                  4.38 ipc (  2.74% stalls)  98.44% L1 data cache
          3.59M cycles  15.73M instructions  35.03% retired LD/ST (  5.51M)

hexo-util (new)              627.14 µs/iter 625.46 µs  █▄
                      (579.17 µs … 1.42 ms) 993.42 µs ▆██
                    (676.91 kb … 676.97 kb) 676.93 kb ████▂▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▁
                  3.44 ipc (  0.75% stalls)  98.67% L1 data cache
          2.16M cycles   7.44M instructions  17.23% retired LD/ST (  1.28M)

fast-escape-html             397.64 µs/iter 401.04 µs  █▇
                    (370.92 µs … 743.79 µs) 596.79 µs ███▆
                    (513.24 kb … 513.29 kb) 513.27 kb ████▆▂▂▁▁▁▁▁▁▁▁▁▁▂▂▁▁
                  4.76 ipc (  0.85% stalls)  99.01% L1 data cache
          1.37M cycles   6.53M instructions  19.20% retired LD/ST (  1.25M)

escape-html                  400.46 µs/iter 403.54 µs  ▆█
                    (375.17 µs … 901.50 µs) 591.17 µs ███▅
                    (513.24 kb … 513.29 kb) 513.27 kb ████▆▃▂▁▁▁▁▁▁▁▁▁▁▁▁▂▁
                  4.74 ipc (  0.84% stalls)  99.02% L1 data cache
          1.38M cycles   6.53M instructions  19.10% retired LD/ST (  1.25M)

@napi-rs/escape              443.95 µs/iter 464.38 µs  ▄█▂
                      (390.88 µs … 1.42 ms) 658.25 µs  ███▅▇▇
                    (390.05 kb … 390.05 kb) 390.05 kb ▇███████▄▃▂▁▁▁▁▁▁▁▁▁▁
                  4.58 ipc ( 16.59% stalls)  82.85% L1 data cache
          1.51M cycles   6.91M instructions  22.02% retired LD/ST (  1.52M)

html-escaper                 511.33 µs/iter 512.13 µs  ▄█▂
                      (458.33 µs … 1.10 ms) 779.25 µs  ███
                    (732.50 kb … 761.35 kb) 758.76 kb ▄████▅▂▂▁▁▂▁▂▂▂▂▂▁▁▁▁
                  4.05 ipc (  3.08% stalls)  97.58% L1 data cache
          1.74M cycles   7.06M instructions  33.19% retired LD/ST (  2.34M)

lodash.escape                515.14 µs/iter 514.71 µs  █
                      (465.71 µs … 1.19 ms) 952.42 µs  ██
                    (733.70 kb … 825.25 kb) 759.80 kb ▇██▅▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
                  4.08 ipc (  3.00% stalls)  97.68% L1 data cache
          1.76M cycles   7.18M instructions  33.41% retired LD/ST (  2.40M)

escape-goat                  660.67 µs/iter 652.08 µs  █▇
                      (563.75 µs … 2.13 ms)   1.20 ms  ██▅
                    (  1.84 mb …   2.10 mb)   1.97 mb ▄███▅▁▁▁▁▁▂▂▂▂▁▂▂▁▁▁▁
                  3.30 ipc (  4.70% stalls)  95.00% L1 data cache
          2.26M cycles   7.46M instructions  29.01% retired LD/ST (  2.16M)

• about.gitlab.com
------------------------------------------- -------------------------------
hexo-util (old)                1.79 ms/iter   1.73 ms  ▄█
                        (1.59 ms … 2.91 ms)   2.81 ms  ██
                    (  2.76 mb …   2.80 mb)   2.80 mb ▃██▆▁▁▁▁▁▁▁▁▁▁▁▁▁▃▂▂▁
                  4.67 ipc (  2.30% stalls)  98.86% L1 data cache
          6.00M cycles  28.05M instructions  35.44% retired LD/ST (  9.94M)

hexo-util (new)              984.53 µs/iter 984.50 µs  ▄█▄
                      (908.79 µs … 1.60 ms)   1.35 ms  ███▂
                    (  1.09 mb …   1.09 mb)   1.09 mb ▆████▄▂▁▁▁▁▁▁▁▁▂▂▃▂▁▁
                  4.00 ipc (  0.79% stalls)  98.71% L1 data cache
          3.38M cycles  13.54M instructions  16.85% retired LD/ST (  2.28M)

fast-escape-html             745.81 µs/iter 745.46 µs  ▅█
                      (693.33 µs … 1.27 ms)   1.08 ms ▇██▇
                    (917.91 kb … 917.98 kb) 917.93 kb ████▅▂▁▁▁▁▁▁▁▁▁▁▂▂▂▂▁
                  4.80 ipc (  0.74% stalls)  99.27% L1 data cache
          2.57M cycles  12.31M instructions  19.35% retired LD/ST (  2.38M)

escape-html                  757.80 µs/iter 757.46 µs  ▇█
                      (701.75 µs … 1.18 ms)   1.10 ms ▄██▇
                    (917.91 kb … 917.98 kb) 917.93 kb ████▆▂▁▁▁▁▁▁▁▁▁▁▂▂▂▂▁
                  4.74 ipc (  0.71% stalls)  99.31% L1 data cache
          2.60M cycles  12.33M instructions  19.25% retired LD/ST (  2.37M)

@napi-rs/escape              881.45 µs/iter 895.46 µs    █▆
                      (791.96 µs … 2.35 ms)   1.20 ms   ▇███▃
                    (732.54 kb … 732.54 kb) 732.54 kb ▄▆█████▄▃▂▂▁▁▁▁▁▁▁▁▁▁
                  4.95 ipc ( 22.49% stalls)  79.85% L1 data cache
          2.98M cycles  14.72M instructions  23.03% retired LD/ST (  3.39M)

html-escaper                 953.99 µs/iter 957.13 µs  ▆█▂
                      (843.92 µs … 1.75 ms)   1.48 ms  ███
                    (  1.36 mb …   1.42 mb)   1.37 mb ▄████▄▂▃▃▂▂▂▁▂▂▁▁▁▁▁▁
                  4.04 ipc (  3.39% stalls)  97.46% L1 data cache
          3.23M cycles  13.06M instructions  32.83% retired LD/ST (  4.29M)

lodash.escape                944.43 µs/iter 941.96 µs   █▃
                      (856.21 µs … 1.48 ms)   1.35 ms  ███▂
                    (  1.36 mb …   1.42 mb)   1.37 mb ▄████▃▂▁▁▁▁▁▁▁▂▂▂▂▂▁▁
                  4.09 ipc (  3.38% stalls)  97.53% L1 data cache
          3.23M cycles  13.21M instructions  33.10% retired LD/ST (  4.37M)

escape-goat                    1.13 ms/iter   1.15 ms  ▂██
                      (995.13 µs … 3.06 ms)   1.54 ms  ████▃
                    (  3.65 mb …   3.65 mb)   3.65 mb ▄██████▄▃▃▃▅▄▃▂▃▂▂▁▁▁
                  3.43 ipc (  4.99% stalls)  95.23% L1 data cache
          3.84M cycles  13.17M instructions  29.18% retired LD/ST (  3.84M)
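
For readers who want a per-page figure, the speedup can be recomputed from the average times in the benchmark output above. The sketch below assumes that "X% faster" means (old / new - 1) * 100; that definition and the helper name percentFaster are assumptions, not something the benchmark tool reports.

```javascript
// Average time per iteration in microseconds, copied from the
// "hexo-util (old)" and "hexo-util (new)" rows of the benchmark output.
const results = [
  { page: 'skk.moe', oldUs: 712.23, newUs: 303.18 },
  { page: 'github.com', oldUs: 2000, newUs: 981.66 },
  { page: 'stackoverflow.com', oldUs: 2120, newUs: 849.81 },
  { page: 'www.google.com', oldUs: 1050, newUs: 627.14 },
  { page: 'about.gitlab.com', oldUs: 1790, newUs: 984.53 }
];

// Assumed definition: how much more work fits in the same time, as a percentage.
function percentFaster(oldUs, newUs) {
  return (oldUs / newUs - 1) * 100;
}

for (const { page, oldUs, newUs } of results) {
  console.log(`${page}: ${percentFaster(oldUs, newUs).toFixed(0)}% faster`);
}
```

Under that definition the improvement varies by page, from about 67% on www.google.com to about 149% on stackoverflow.com.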

@coveralls

coveralls commented May 31, 2025

Coverage Status

coverage: 97.09% (+0.2%) from 96.875%
when pulling dfcfcdc on SukkaW:escape-html
into d497bc7 on hexojs:master.

@D-Sketon
Member

before PR

escapeHTML('&0') // => &0

after PR

escapeHTML('&0') // => &0

Is this intentional?

@SukkaW
Member Author

SukkaW commented Jun 1, 2025

Is this intentional?

Hmmmm, extra check again.

@SukkaW SukkaW marked this pull request as draft June 1, 2025 08:37
@SukkaW SukkaW marked this pull request as ready for review June 14, 2025 08:43
@SukkaW
Member Author

SukkaW commented Jun 14, 2025

@D-Sketon Could you check again? I added an extra check for ;.

@D-Sketon
Member

D-Sketon commented Jun 14, 2025

Consider malformed entity patterns:
before PR

escapeHTML('�') // => �   (#\d{1,7} ×)
escapeHTML('�') // => �   (#[Xx][a-fA-F0-9]{1,6} ×)

after PR

escapeHTML('�') // => �
escapeHTML('�') // => �
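
The two patterns quoted in the comment above, #\d{1,7} and #[Xx][a-fA-F0-9]{1,6}, limit how long a decimal or hex numeric reference may be. As a minimal sketch of what "well-formed" means under those limits (the helper name isWellFormedNumericEntity is hypothetical, and requiring the closing ";" follows the extra check mentioned earlier in this thread):

```javascript
// Patterns quoted in the review comment, anchored so the whole string
// must be exactly one numeric entity including "&" and the closing ";".
const DECIMAL_ENTITY = /^&#\d{1,7};$/;
const HEX_ENTITY = /^&#[Xx][a-fA-F0-9]{1,6};$/;

// Hypothetical helper: true only for a numeric entity within the length limits.
function isWellFormedNumericEntity(s) {
  return DECIMAL_ENTITY.test(s) || HEX_ENTITY.test(s);
}

console.log(isWellFormedNumericEntity('&#39;'));       // true
console.log(isWellFormedNumericEntity('&#x2F;'));      // true
console.log(isWellFormedNumericEntity('&#12345678;')); // false: 8 decimal digits
console.log(isWellFormedNumericEntity('&#x1234567;')); // false: 7 hex digits
console.log(isWellFormedNumericEntity('&#0'));         // false: no closing ";"
```

An escaper that wants to avoid double escaping would leave the "&" untouched only when a check like this succeeds, and escape it otherwise.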

