Skip to content

add NonBlockingByteBufferJsonParser (and refactor some NonBlockingJsonParser code to make it more extensible) #795

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 24 commits into from
Jul 25, 2022

Conversation

pjfanning
Copy link
Member

@pjfanning pjfanning commented Jul 12, 2022

Relates to #478

  • make it feasible to have a separate ByteBuffer based parser - it's a pity to see NonBlockingJsonParser implements ByteBufferFeeder #483 die on the vine - but I would favour having separate byte[] and ByteBuffer parsers to avoid the overhead of wrapping the byte[] in buffers
  • aim would be to move on most NonBlockingJsonParser code to a superclass and that a new NonBlockingByteBufferJsonParser would subclass this superclass and just override some methods like the new getNextByteFromBuffer and getByteFromBuffer methods (would NonBlockingJsonParserBase be ok for this or would a new intermediate class be better?)
  • I have run some perf tests and the refactored NonBlockingJsonParser seems to have almost exactly the same perf as before (so far, it seems a tad faster)

@pjfanning pjfanning marked this pull request as draft July 12, 2022 16:22
@cowtowncoder
Copy link
Member

Yeah, I am specifically worried about performance aspects since byte access is, well, once (or more) per input byte.
If this can be shown to have only marginal effect with modern JVMs (say, <10%) I am ok with it.
I don't know if there are any specific tricks we can use to help JVM speculatively inline access for single implementation type.

Alternatively if we could find out hot spots -- maybe a small portion of accesses could be left for concrete classes to implement -- we could perhaps have hybrid model to balance need to re-implement only a small set of methods (with hard-coded access). I haven't checked code to see if that is at all feasible, will have a look next.

It definitely would be nice to support ByteBuffer, as well as potentially other library/framework-provided byte-buffer/byte-source abstractions.

@pjfanning
Copy link
Member Author

@cowtowncoder this still far from complete and needs testing and more specifically, perf testing.

I have pushed on and created NonBlockingUtf8JsonParserBase which has most of the code from NonBlockingJsonParser. NonBlockingByteBufferJsonParser extends NonBlockingUtf8JsonParserBase.

The idea is to have NonBlockingJsonParser also extend NonBlockingUtf8JsonParserBase but I've left that for a few days so that it is visible how much code has changed in NonBlockingJsonParser in order to not have the methods work directly on the _inputBuffer (and to use the getters instead).

@pjfanning
Copy link
Member Author

pjfanning commented Jul 14, 2022

@cowtowncoder does this look a reasonable perf test?

https://github.com/pjfanning/jackson-number-parse-bench/blob/main/src/jmh/java/org/example/jackson/bench/AsyncParserBench.java

I need to try it with more permutations but on Zulu JDK 1.8.0_332, I get these results:

v2.14 without refactor:
AsyncParserBench.bigArrayOfFloatPrimitives thrpt 5 26115.264 ± 626.839 ops/s

v2.14 with refactor:
AsyncParserBench.bigArrayOfFloatPrimitives thrpt 5 26964.013 ± 201.662 ops/s

v2.13.3:
AsyncParserBench.bigArrayOfFloatPrimitives thrpt 5 25964.235 ± 515.105 ops/s

So far, it looks like the refactor has not has a noticeable impact on performance - at least in this test case.

@cowtowncoder
Copy link
Member

@pjfanning Thanks, I need to have a look. It sounds legit, although ideally test would exercise different token types and access content. But then again, there are so many different aspects (in most tests the whole content is fed as a big chunk, vs. smaller pieces).

But I think at least we can rule out obvious drastic negative effects.

Fwtw, jackson-benchmarks:

https://github.com/FasterXML/jackson-benchmarks/

also has non-blocking/async test variants so maybe I could find time to test it as well.

@pjfanning
Copy link
Member Author

Thanks for pointing me towards jackson-benchmarks. I ran one of the existing benchmarks and got these numbers (Zulu JDK 1.8.0_332):

2.13.3
Benchmark Mode Cnt Score Error Units
JsonStdReadAsync.readPojoMediaItem thrpt 8 383982.674 ± 5241.035 ops/s

2.14 without refactor
Benchmark Mode Cnt Score Error Units
JsonStdReadAsync.readPojoMediaItem thrpt 8 386167.559 ± 6744.119 ops/s

2.14 with refactor
Benchmark Mode Cnt Score Error Units
JsonStdReadAsync.readPojoMediaItem thrpt 8 402986.292 ± 5160.014 ops/s

@cowtowncoder
Copy link
Member

Ah ha! Well now that is interesting. :)

I think I am fine proceeding with this PR then; perhaps we could get ByteBuffer backed version in 2.14?

@cowtowncoder
Copy link
Member

@pjfanning As per above, just let me know when you think this should be ready and I can review with more thought (it all looks good so far!) and get it merged.

Not sure how easy it is to extend existing byte[]-backed tests to also do variant on ByteBuffer but if that's doable that'd be great. I understand that with relevant code shared there shouldn't be much difference but it'd still be nice to double-check as long as there isn't tons of test code duplication.

@pjfanning pjfanning marked this pull request as ready for review July 15, 2022 16:41
@pjfanning pjfanning changed the title refactor some NonBlockingJsonParser code to make it more extensible add NonBlockingByteBufferJsonParser (and refactor some NonBlockingJsonParser code to make it more extensible) Jul 15, 2022
@pjfanning
Copy link
Member Author

pjfanning commented Jul 15, 2022

@cowtowncoder this PR has the byte buffer code too - see https://github.com/FasterXML/jackson-core/pull/795/files#diff-1cce0198eb8bfe42fcfac3e075d09d858c2bf8b198b777297f943ab4f1731361

I also added createNonBlockingByteBufferParser() to JsonFactory.

I took a few of the byte[] array tests and made them run with the byte buffer too.

@cowtowncoder
Copy link
Member

I will try to get this reviewed and merged soon (this week has been hectic but next week I'll have much more time for Jackson again!)

@cowtowncoder
Copy link
Member

@pjfanning Ok, I probably should have merged PRs in different order, but looks like there's a conflict. Could you resolve that? I'll merge this ASAP when that is done.

@pjfanning
Copy link
Member Author

@cowtowncoder thanks for merging the other PRs - the async number parsing test one was always going to conflict with this - I think I've got those changes merged in here now to fix the conflict.

@cowtowncoder
Copy link
Member

Thank you @pjfanning ! Yeah, minor conflicts are to be expected. Will now merge.

@cowtowncoder cowtowncoder merged commit 6f6a30a into FasterXML:2.14 Jul 25, 2022
@cowtowncoder
Copy link
Member

Ugh. This will require some care merging to 3.0 :)

@pjfanning pjfanning deleted the byte-buffer branch July 25, 2022 19:44
@cowtowncoder
Copy link
Member

Took a while but merging successful. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants