Skip to content

Optimize skip_slice by recording serialized length #1836

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 21, 2017

Conversation

Gankra
Copy link
Contributor

@Gankra Gankra commented Oct 10, 2017

For @jrmuizel to test.


This change is Reviewable

@glennw
Copy link
Member

glennw commented Oct 12, 2017

@gankro do we want to review / merge this?

1 similar comment
@glennw
Copy link
Member

glennw commented Oct 12, 2017

@gankro do we want to review / merge this?

@Gankra
Copy link
Contributor Author

Gankra commented Oct 13, 2017

Still need to do perf eval, gotten distracted the last few days.

@jrmuizel
Copy link
Collaborator

I tried performance testing this. I couldn't get solid numbers. WebRender seems to have some weird bimodal timing behaviour that makes measuring header.

@glennw
Copy link
Member

glennw commented Oct 19, 2017

@gankro I can try to do some profiling with this tomorrow, if that would be helpful?

@glennw
Copy link
Member

glennw commented Oct 20, 2017

I did some simple profiling on one of our benchmarks, with the following:

vblank_mode=0 ../target/release/wrench -r show benchmarks/text-rendering.yaml

This uses -r to replay the display list every frame, and vblank_mode=0 disables vsync on Intel GPUs. The benchmark draws 64 reasonably long text runs of varying colors and font sizes, so it contains a lot of glyphs in the display list.

Without: ~325 fps.
With: ~385 fps.

So, at least in this test case, it's a very significant win. It'd be interesting to see if others can reproduce similar results.

With this patch, this is what the profile graph looks look:

profile

The two areas I've marked with a red dashed line are time spent inside deserialization functions / iterators, so hopefully there's still quite a lot of wins to be found there. There's certainly a couple of other big targets in that profile I can look at performance-wise in the future too.

@Gankra
Copy link
Contributor Author

Gankra commented Oct 20, 2017

Wow that's really surprising since text is one of the slices that shouldn't actually need this optimization (with a Sufficiently Smart Compiler).

Seems pretty convincing to me it's a good win. Let's ship it.

@glennw
Copy link
Member

glennw commented Oct 20, 2017

@bors-servo r+

@bors-servo
Copy link
Contributor

📌 Commit 13bfa65 has been approved by glennw

@bors-servo
Copy link
Contributor

⌛ Testing commit 13bfa65 with merge 517ef53...

bors-servo pushed a commit that referenced this pull request Oct 20, 2017
Optimize skip_slice by recording serialized length

For @jrmuizel to test.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/1836)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

☀️ Test successful - status-appveyor, status-travis
Approved by: glennw
Pushing 517ef53 to master...

@bors-servo bors-servo merged commit 13bfa65 into servo:master Oct 21, 2017
@jrmuizel
Copy link
Collaborator

serde-rs/serde#855 is probably the next place where we're going to get deserialization wins. @gankro will look at it when he gets a chance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants