In the encoder lookup, we have the table field (jelly-jvm/core/src/main/java/eu/ostrzyciel/jelly/core/internal/EncoderLookup.java, line 43 in 6612c97): an array of ints, two ints per lookup entry. For a 1024-entry name lookup, that's 8 KB of memory. That doesn't sound bad, but it's accessed pretty frequently (every time we encode an IRI). Using short would restrict the array to 32k entries (which is fine by me; why would you want such a huge lookup anyway?), but it would halve the memory footprint to 4 KB. That could be significant, helping the structure fit in L1 cache.
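As a rough sketch of the trade-off (class and method names here are hypothetical, not the actual EncoderLookup API):

```java
// Hypothetical sketch of the lookup table footprint, not the real EncoderLookup code.
// Two slots per entry: an int[] costs 8 B per entry, a short[] only 4 B.
class LookupTableSketch {
    static final int SLOTS_PER_ENTRY = 2;

    static long intTableBytes(int entries) {
        return (long) entries * SLOTS_PER_ENTRY * Integer.BYTES; // 4 B per slot
    }

    static long shortTableBytes(int entries) {
        return (long) entries * SLOTS_PER_ENTRY * Short.BYTES;   // 2 B per slot
    }
}
```

For 1024 entries this gives 8192 B vs. 4096 B, matching the numbers above; the 32k-entry cap comes from Short.MAX_VALUE being 32767.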
DependentNode currently has one reference field and 4 ints (jelly-jvm/core/src/main/java/eu/ostrzyciel/jelly/core/internal/NodeEncoderImpl.java, line 20 in 6612c97).
That's (on most common JVMs) 12 B for the header, 4 B for the reference, and 16 B for the ints: 32 B total. Shrinking the lookupPointer fields to short would cut 4 B, but that would not reduce the object size, due to 64-bit alignment (unless you are using compact object headers from Java 24!). To get it down by 8 B, you'd also need to fit the serials in short.
Maybe the serials could also be changed from ints to shorts. I'd have to carefully work through, on paper, the scenarios where that could break.
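The back-of-the-envelope arithmetic above can be sketched like this (assuming a 12 B header, compressed oops, and 8 B object alignment; the field sizes are taken from the description, not measured):

```java
// Object-size arithmetic for DependentNode under the assumptions above.
class ObjectSizeSketch {
    static long align8(long size) {
        return (size + 7) & ~7L; // round up to the next 8 B boundary
    }

    // 12 B header + 4 B reference + four 4 B ints = 32 B, already aligned.
    static long dependentNodeWithInts() {
        return align8(12 + 4 + 4 * Integer.BYTES);
    }

    // Shrinking the two lookupPointer fields to short saves 4 B on paper
    // (12 + 4 + 2*2 + 2*4 = 28 B), but alignment pads it back to 32 B.
    static long dependentNodeWithShortPointers() {
        return align8(12 + 4 + 2 * Short.BYTES + 2 * Integer.BYTES);
    }

    // Only with the serials also in short (12 + 4 + 4*2 = 24 B) does the
    // aligned size actually drop, by the 8 B mentioned above.
    static long dependentNodeAllShorts() {
        return align8(12 + 4 + 4 * Short.BYTES);
    }
}
```

A tool like JOL (Java Object Layout) would give the real numbers for a given JVM; this is just the paper version.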
For LookupEntry, the size is currently 12 + 4 + 4 + 1 = 21 B, rounded up to 24 B.
With these changes, it would be 12 + 2 + 2 + 1 = 17 B, which still rounds up to 24 B.
Unless you are using COHs (compact object headers) from Java 24; then it does fit in 16 B.
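The same alignment arithmetic, sketched for LookupEntry (the ~8 B compact header size is an assumption based on JEP 450, not a measurement):

```java
// Size arithmetic for LookupEntry under 8 B alignment.
class LookupEntrySizeSketch {
    static long align8(long n) { return (n + 7) & ~7L; }

    // Current: 12 B header + 4 + 4 + 1 = 21 B -> 24 B aligned.
    static long withInts() { return align8(12 + 4 + 4 + 1); }

    // With shorts: 12 + 2 + 2 + 1 = 17 B -> still 24 B aligned.
    static long withShorts() { return align8(12 + 2 + 2 + 1); }

    // With Java 24 compact object headers (assumed ~8 B):
    // 8 + 2 + 2 + 1 = 13 B -> 16 B aligned.
    static long withShortsAndCompactHeaders() { return align8(8 + 2 + 2 + 1); }
}
```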
I'm wondering if we can get rid of newEntry. This could be done with virtual function calls, but that's "meh". We could use a shared public field in EncoderLookup instead, though. ;)
I tried it, and while it works fine, it's also 5% slower. The likely reason is that the JVM simply doesn't have bytecodes for short arithmetic, and the hardware (in my case, a Ryzen CPU) isn't optimized for 16-bit math either. Any savings from better cache utilization are negated by the increased ALU demand.
This is quite interesting, though. It seems that the encoder's fixed structures are small enough not to make a noticeable difference, and in dense RDF data it's more valuable to have fewer processing steps. Food for thought.
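One way to see the extra ALU work: Java has no 16-bit arithmetic bytecodes, so short operands are promoted to int and the result must be explicitly narrowed back (an i2s after each iadd in bytecode). A minimal illustration:

```java
// short operands are widened to int before arithmetic; the result
// must be narrowed back with an explicit (short) cast, which is an
// extra truncation step that int math simply doesn't have.
class ShortMathSketch {
    static short addShorts(short a, short b) {
        // "a + b" alone has type int; the cast truncates it to 16 bits.
        return (short) (a + b);
    }

    static int addInts(int a, int b) {
        return a + b; // a single iadd, no narrowing step
    }
}
```

The narrowing is also where overflow wraps: 30000 + 10000 as shorts yields -25536, not 40000.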