Skip to content

Deduplicate VM constants #712

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

triallax
Copy link
Contributor

@triallax triallax commented Apr 7, 2025

(except when defining derived unit constants, as initially identical but temporary values)

Don't see any real performance changes from this (if anything it's probably a minor slowdown due to the added dupe check in add_constant). However, it does reduce the number of constants generated from the prelude from ~730 to ~320, a reduction of 56%. This in theory allows for larger Numbat programs as it's now harder to hit the ~65k constants ceiling. Worth it? I don't know :)

(except when defining derived unit constants, as initially identical but
temporary values)

Don't see any real performance changes from this (if anything it's
probably a minor slowdown due to the added dupe check in
`add_constant`). However, it does reduce the number of constants
generated from the prelude from ~730 to ~320, a reduction of 56%. This
in theory allows for larger Numbat programs as it's now harder to hit
the ~65k constants ceiling. Worth it? I don't know :)
@Goju-Ryu
Copy link
Contributor

Goju-Ryu commented Apr 8, 2025

This looks great.
I’m wondering if using an IndexSet would be possible? If it is I think it could simplify the code a bit and possibly address the (potential) performance impact you mentioned.
I’m not familiar enough with it to know its limitations, so this is mostly just my musings of curiosity.

@triallax
Copy link
Contributor Author

triallax commented Apr 8, 2025

i've already thought of that, unfortunately sets would not work here due to the dummy temporary constants added for derived base units, which are identical until processed at vm runtime; with a set these dummies would be incorrectly duplicated, at least as the code is now

@triallax
Copy link
Contributor Author

triallax commented Apr 8, 2025

the performance impact is nonexistent with my simple testing but i suppose algorithmically speaking the performance wil worsen much faster the more constants you have, so perhaps we can consider somehow eliminating the need for dummies here, it would also solve the pre-existing todo

@Goju-Ryu
Copy link
Contributor

Goju-Ryu commented Apr 8, 2025

Aah, that's too bad. Thank you for the explanation though. I’m glad the impact is negligible, at currently realistic scale. With a linear time complexity, I think the trade off is worth it even at larger scales just due to the current constraint on the number of constants. It would be interesting to test the effects at scales large enough to approach the maximum number of constants without the duplication check.

@triallax
Copy link
Contributor Author

triallax commented Apr 8, 2025

to be clear i meant my previous testing (basically only the prelude)

however now i'm curious to get some results on more pathological cases so i did these rudimentary benchmarks:

$ seq 1 10000 | awk '{ print "let _" $0 " = " $0 }' > f
$ hyperfine " ./src/numbat/numbat-old  f"
Benchmark 1:  ./src/numbat/numbat-old  f
  Time (mean ± σ):      3.149 s ±  0.029 s    [User: 3.128 s, System: 0.018 s]
  Range (min … max):    3.100 s …  3.209 s    10 runs
 
$ hyperfine " ./src/numbat/numbat-better  f"
Benchmark 1:  ./src/numbat/numbat-better  f
  Time (mean ± σ):      3.219 s ±  0.029 s    [User: 3.200 s, System: 0.015 s]
  Range (min … max):    3.182 s …  3.267 s    10 runs
$ # same test but strings
$ seq 1 10000 | awk '{ print "let _" $0 " = \"foo" $0 "\"" }' > f
$ hyperfine " ./src/numbat/numbat-old  f" "./src/numbat/numbat-better f"
Benchmark 1:  ./src/numbat/numbat-old  f
  Time (mean ± σ):      2.008 s ±  0.027 s    [User: 1.991 s, System: 0.015 s]
  Range (min … max):    1.976 s …  2.066 s    10 runs
 
Benchmark 2: ./src/numbat/numbat-better f
  Time (mean ± σ):      2.332 s ±  0.033 s    [User: 2.314 s, System: 0.017 s]
  Range (min … max):    2.296 s …  2.401 s    10 runs
 
Summary
   ./src/numbat/numbat-old  f ran
    1.16 ± 0.02 times faster than ./src/numbat/numbat-better f

so there's a slowdown for sure (more noticeable on the strings version unsurprisingly), but it's not very massive and on a pathological case, but we might be able to do some work here still. still, i'm not sure if this pull request is worth it in the first place, i'm not sure what practical benefits it actually brings

also, i'm surprised the strings version is faster in the first place! i wonder why that is

@triallax
Copy link
Contributor Author

triallax commented Apr 8, 2025

also, tangential side note: piping the awk output directly into numbat is slower by an order of magnitude than reading it from a file:

$ time seq 1 10000 | awk '{ print "let _" $0 " = " $0 }' | numbat

________________________________________________________
Executed in   21.75 secs    fish           external
   usr time   21.66 secs    1.48 millis   21.66 secs
   sys time    0.09 secs    0.00 millis    0.09 secs

definitely worth a look

@Goju-Ryu
Copy link
Contributor

Goju-Ryu commented Apr 9, 2025

That is a pretty minor impact at that scale, especially considering that the tests specifically avoid anything but the worst case scenario.
In a real life scenario, if we assume that the rate of occurrence of individual constant values are similar to the prelude (~40% being numbers 0-9), we would see even lower impact due to finding these values in the beginning of the list instead of having to traverse the whole thing on every constant.

also, i'm surprised the strings version is faster in the first place! i wonder why that is

That one surprised me too. I wonder if choosing representation and handed of units (or the lack there of) is having an impact here?

@triallax
Copy link
Contributor Author

triallax commented Apr 9, 2025

That one surprised me too. I wonder if choosing representation and handed of units (or the lack there of) is having an impact here?

my guess is possibly relating to number parsing, but i'm not sure if that can account for a 1 second difference
i'll take a closer look once i have the time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants