
Measure performance improvement by "fast double parsing" #970


Closed
cowtowncoder opened this issue Apr 2, 2023 · 10 comments
Labels
performance Issue related to performance problems or enhancements

Comments

@cowtowncoder
Member

Jackson 2.14 added StreamReadFeature.USE_FAST_DOUBLE_PARSER and a backing implementation to speed up decoding of floating-point values from JSON (see #577). But we haven't published results (or actually done much in the way of measurements).
Let's change that.
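
For reference, enabling the feature looks roughly like this (a minimal sketch; the class name and sample input are just illustrative):

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.StreamReadFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

public class FastDoubleExample {
    public static void main(String[] args) throws Exception {
        // Factory with fast double parsing enabled (the feature is off by default)
        JsonFactory factory = JsonFactory.builder()
                .enable(StreamReadFeature.USE_FAST_DOUBLE_PARSER)
                .build();
        ObjectMapper mapper = new ObjectMapper(factory);

        // Floating-point values in the input are now decoded via FastDoubleParser
        double[] values = mapper.readValue("[1.25, 3.5e-2, 0.125]", double[].class);
        System.out.println(values.length);
    }
}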

@cowtowncoder
Member Author

Preliminary results suggest similar performance characteristics between Jackson 2.14 and 2.15(.0-rc2):

  • with Java 8 there is a speed-up of 10-15%
  • with Java 17 there is a negligible speed-up of 1-2%

This is a bit disappointing -- although there may be data for which the benefit would be larger (arrays of floating-point values; bigger magnitudes and/or longer fractional parts?).
But at least jackson-benchmarks now has support for testing this.

@wrandelshofer

wrandelshofer commented Apr 4, 2023

Would it be possible to have separate JMH tests of ReaderBasedJsonParser.nextFieldName() and of DoubleParser.deserialize()?

When I look at the test runs of JsonStringReadVanilla.readCurrencyPojoFast() in VisualVM, I see that roughly half of the time is spent in each of these methods.
It is not clear to me which of them causes the slow-down on Java 17.

When I measure FastDoubleParser on its own, I see the same performance speedup on both Java 8 and Java 17.

com.fasterxml.jackson.databind.deser.std.MapDeserializer._readAndBindStringKeyMap() 93.3%
com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextFieldName() 46.4%
com.fasterxml.jackson.databind.deser.std.NumberDeserializers$DoubleDeserializer.deserialize() 36.6%
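
A minimal JMH sketch of such an isolated comparison could look like this (assuming the upstream ch.randelshofer:fastdoubleparser dependency and its JavaDoubleParser.parseDouble(CharSequence) entry point rather than Jackson's shaded copy; the class name and sample literal are arbitrary):

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

import ch.randelshofer.fastdoubleparser.JavaDoubleParser;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Fork(1)
public class DoubleParsingBench {
    // Arbitrary sample literal; real runs should vary magnitude and fraction length
    private final String input = "123456.78901234567";

    @Benchmark
    public double jdkParseDouble() {
        return Double.parseDouble(input);
    }

    @Benchmark
    public double fastDoubleParser() {
        return JavaDoubleParser.parseDouble(input);
    }
}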

@cowtowncoder
Member Author

@wrandelshofer Ah, one quick note: please disregard JsonStringReadVanilla -- it tests reading from String input instead of a byte-backed source.

Still, that should not cause major issues. The method in question matches keys (of JSON Objects) for a Map target, and it is pretty odd that it should interfere with optimization on JDK 17.

On one hand it'd be good to also have more FP-heavy data (arrays); on the other hand it's sort of impressive that reading FP values is not that much heavier than String decoding.
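
A quick sketch of such an FP-heavy input, read from a byte-backed source (synthetic data; the sizes, values, and class name are arbitrary):

import java.nio.charset.StandardCharsets;

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.core.StreamReadFeature;

public class FpHeavyRead {
    public static void main(String[] args) throws Exception {
        // Build a large JSON array of floating-point values to stress number decoding
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < 100_000; i++) {
            if (i > 0) sb.append(',');
            sb.append(i * 0.333333333);
        }
        byte[] json = sb.append(']').toString().getBytes(StandardCharsets.UTF_8);

        JsonFactory factory = JsonFactory.builder()
                .enable(StreamReadFeature.USE_FAST_DOUBLE_PARSER)
                .build();

        double sum = 0;
        try (JsonParser p = factory.createParser(json)) {
            while (p.nextToken() != JsonToken.END_ARRAY) {
                if (p.currentToken() == JsonToken.VALUE_NUMBER_FLOAT) {
                    sum += p.getDoubleValue();
                }
            }
        }
        System.out.println(sum);
    }
}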

@cowtowncoder
Member Author

And yes, it'd be good to have more targeted test(s). I was hoping to use something close to "real world" data to give users some idea (I have a half-written blog post about this case). But there are definitely different uses for tests too, including verifying specific low-level decoding performance.

@wrandelshofer

The benchmark results of JsonStringReadVanilla are really unexpected.
It has lower throughput on Java 17 than on Java 8 on my computer (Intel Mac Mini 2018).

I tried with the CompactStrings feature disabled/enabled and with/without an inline hint for the JIT, but it had no effect on performance.

@Fork(jvmArgsAppend = {
        // "-XX:+UnlockDiagnosticVMOptions", "-XX:+PrintInlining",
        "-XX:CompileCommand=inline,java/lang/String.charAt",
        "-XX:CompileCommand=inline,java/lang/CharSequence.charAt",
        "-XX:-CompactStrings"
})

@cowtowncoder
Member Author

@wrandelshofer that is odd indeed.

@cowtowncoder
Member Author

Closing since I have created the benchmark & published preliminary results via:

https://cowtowncoder.medium.com/jackson-2-15-yet-faster-floating-point-reads-fd6d5a0b769a

so future work can proceed under separate GH issues.

@pjfanning
Member

It might be worth testing with the Java command-line flag mentioned in https://github.com/wrandelshofer/FastDoubleParser#performance-tuning

A further optimisation that could be considered is not using Strings to cache numbers in the JsonParser implementations, but using char arrays instead. FastDoubleParser supports parsing char arrays. When falling back to JDK number parsing, we would still need to create Strings (because JDK number parsing generally doesn't support parsing char arrays).

One thing to watch out for is that jackson-core recycles char arrays, so when we cache numbers as char arrays, we would need to clone the array first. This char array caching might speed up the fast parsing path, but at the risk of slowing down the main path.
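
A rough sketch of that caching idea (the class and method names are hypothetical, not existing jackson-core API; it assumes FastDoubleParser's char[] parseDouble overload as documented in its README):

import java.util.Arrays;

import ch.randelshofer.fastdoubleparser.JavaDoubleParser;

public class CharArrayNumberCache {
    // Defensive copy, because the parser's char[] buffers are recycled
    private char[] cachedDigits;

    // Hypothetical hook: called while the number token is still in the recyclable parse buffer
    public void cacheNumber(char[] parseBuffer, int offset, int length) {
        // Clone the relevant slice so later recycling of parseBuffer cannot corrupt it
        cachedDigits = Arrays.copyOfRange(parseBuffer, offset, offset + length);
    }

    public double asDouble() {
        // FastDoubleParser can parse directly from char[] without building a String;
        // with JDK parsing we would need new String(cachedDigits) here instead
        return JavaDoubleParser.parseDouble(cachedDigits);
    }
}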

See one reason that Strings may be a bit of a problem:
https://github.com/wrandelshofer/FastDoubleParser#performance-tuning

@cowtowncoder
Member Author

Right. In most cases we can share a reference to the number as decoded in the char[] used internally, but not always (buffer boundaries). Part of the challenge may just be that, because we previously relied on the JDK alone, no attempt was made to avoid String construction for FP/Big-number values.

But there is also the real problem of having to defer some parsing: for example, pretty much all Kotlin/Scala usage is via Object types that use a Constructor; this requires buffering, and buffering with char[] and offsets gets gnarly (and negates any benefit over String).

I am not 100% sure how much overhead Strings really add in the context of overall performance gains (as opposed to raw numbers for decoding from different sources), but it could definitely be one remaining area of interest for optimization.

@cowtowncoder
Member Author

Looking at profiler output, it does look like about 33% of the time is spent in decoding (from String to double), and another 15% in decoding the JSON number token itself. But that 33% seems to be split between sun.misc.FloatingDecimal.parseDouble (un-optimized) and AbstractJavaFloatingPointBitsFromCharSequence.parseFloatingPointLiteral (FDP) -- although I guess the profiler could skew timings.

Unfortunately, looking at the handling of _numberString, I think the changes we made in 2.15 for better accuracy (avoiding conversions between double and BigDecimal by deferring processing) made it much more difficult (not impossible, but impractical) to avoid String allocation.

As to the command-line flag, I am not sure: unless users run with the same settings, there is not much point in improving the benchmark numbers themselves; it seems most valuable to show the kinds of numbers users are likely to see.
Although I guess it does make sense to at least check whether it makes a big difference.
