Skip to content

8352504: RISC-V: implement and enable CMoveI/L #24490

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

Hamlin-Li
Copy link

@Hamlin-Li Hamlin-Li commented Apr 7, 2025

Hi,
Can you help to review this patch?
On riscv, CMoveI/L already were implemented, but there are some gap:

  1. CMoveI/L does not support comparison with float/double, corresponding tests are not turn on either.
  2. Some optimization of C2 is not turned on, e.g. Phi -> CMove -> min_max.
  3. lack of some corresponding performance tests.

Also there are some issue with current Zicond:

  1. UseZicond is turned on automatically by hwprobe, but jmh tests show that it's not always bring benefit, in some situation it even brings regression, the reason is the generated code by Zicond is much larger than branch version, in particular when it's in a loop and unrolled.

This patch on riscv is to:

  1. add CMoveI/L comparing float/double, and corresponding tests,
  2. enable more C2 optimization,
  3. add more benchmark tests,
  4. turn off UseZicond by default.

Thanks!

Performance

MinMax

test data

Benchmark Mode Cnt Score - master Score - master+UseZbb Score - -master+UseZicond Score - master+UseZicond+UseZbb Score - cmovei Score - cmovei+UseZbb Score - cmovei+UseZicond Score - cmovei+UseZicond+UseZbb Error Units Opt (master/cmovei)
o.o.b.vm.compiler.IfMinMax.testReductionInt avgt 40 17152.075 17216.592 17272.493 17296.89 17127.844 17036.605 17299.333 17250.566 73.179 ns/op 1.001
o.o.b.vm.compiler.IfMinMax.testReductionLong avgt 40 19770.828 19967.578 20268.905 20166.165 20065.552 20059.095 20161.914 20151.295 131.428 ns/op 0.985
o.o.b.vm.compiler.IfMinMax.testSingleInt avgt 40 114.734 114.402 114.887 114.384 114.4 110.631 112.162 110.915 0.333 ns/op 1.003
o.o.b.vm.compiler.IfMinMax.testSingleLong avgt 40 121.53 121.711 120.91 121.665 121.309 120.57 118.639 119.373 0.451 ns/op 1.002
o.o.b.vm.compiler.IfMinMax.testVectorInt avgt 40 60130.165 60062.303 61839.776 61895.194 15887.398 15924.502 15874.835 15667.936 101.94 ns/op 3.785
o.o.b.vm.compiler.IfMinMax.testVectorLong avgt 40 63855.379 63091.646 59838.996 60520.389 31537.961 32637.81 32123.349 32081.008 645.805 ns/op 2.025

Comparison

test data

Benchmark Mode Cnt Score - master Score - master+UseZicond Score - master+UseZicond+UseZbb Score - cmovei Score - cmovei+UseZicond Score - cmovei+UseZicond+UseZbb Error Units Opt (master/cmovei) Opt(master+Zicond/cmovei) Opt(cmovei+Zicond/cmovei)
o.o.b.java.lang.ClassComparison.equalClass avgt 10 2284.019 2281.241 2315.425 2282.523 2278.389 2284.672 16.058 ns/op 1.001 0.999 0.998
o.o.b.java.lang.ClassComparison.notEqualClass avgt 10 2302.202 2277.478 2282.76 2293.061 2277.493 2276.458 26.359 ns/op 1.004 0.993 0.993
o.o.b.java.lang.ClassComparison.notEqualClassResLong avgt 10 2285.068 2297.511 2286.712 2303.156 2288.345 2291.178 6.928 ns/op 0.992 0.998 0.994
o.o.b.java.lang.FPComparison.equalDouble avgt 10 7751.011 7449.124 7485.006 6838.039 8844.953 8869.223 65.813 ns/op 1.134 1.089 1.293
o.o.b.java.lang.FPComparison.equalDoubleResLong avgt 10 7636.465 7862.124 8081.343 7922.318 9495.343 9497.181 58.296 ns/op 0.964 0.992 1.199
o.o.b.java.lang.FPComparison.equalFloat avgt 10 7477.231 7408.795 7504.052 6915.752 8365.907 8369.777 32.23 ns/op 1.081 1.071 1.21
o.o.b.java.lang.FPComparison.equalFloatResLong avgt 10 8137.348 7994.636 8095.031 7736.668 9653.807 9701.424 76.501 ns/op 1.052 1.033 1.248
o.o.b.java.lang.FPComparison.greaterDouble avgt 10 6622.403 6592.067 6662.954 6599.242 6586.418 6637.898 4.94 ns/op 1.004 0.999 0.998
o.o.b.java.lang.FPComparison.greaterDoubleResLong avgt 10 7028.532 7023.135 7031.696 7021.176 7067.086 7015.199 24.72 ns/op 1.001 1 1.007
o.o.b.java.lang.FPComparison.greaterEqualDouble avgt 10 6914.366 6863.646 6918.237 6508.894 8845.28 8850.056 29.055 ns/op 1.062 1.055 1.359
o.o.b.java.lang.FPComparison.greaterEqualDoubleResLong avgt 10 7334.34 7280.929 7540.855 7370.538 9498.004 9547.522 46.422 ns/op 0.995 0.988 1.289
o.o.b.java.lang.FPComparison.greaterEqualFloat avgt 10 6847.878 6935.033 6957.449 6166.204 8370.344 8366.467 22.627 ns/op 1.111 1.125 1.357
o.o.b.java.lang.FPComparison.greaterEqualFloatResLong avgt 10 7547.798 7562.664 7604.377 7285.338 9696.006 9650.559 50.975 ns/op 1.036 1.038 1.331
o.o.b.java.lang.FPComparison.greaterFloat avgt 10 6576.027 6574.319 6577.453 6555.531 6563.859 6535.003 11.263 ns/op 1.003 1.003 1.001
o.o.b.java.lang.FPComparison.greaterFloatResLong avgt 10 7041.001 7132.696 7069.311 7047.881 7022.631 7099.971 36.966 ns/op 0.999 1.012 0.996
o.o.b.java.lang.FPComparison.isFiniteDouble avgt 10 8557.84 8724.791 8512.481 8506.203 9093.229 9092.069 41.254 ns/op 1.006 1.026 1.069
o.o.b.java.lang.FPComparison.isFiniteFloat avgt 10 8400.921 8350.133 8533.231 8581.424 9008.54 9014.456 39.074 ns/op 0.979 0.973 1.05
o.o.b.java.lang.FPComparison.isInfiniteDouble avgt 10 8527.842 8577.015 8700.897 8638.058 9097.99 9096.666 35.878 ns/op 0.987 0.993 1.053
o.o.b.java.lang.FPComparison.isInfiniteFloat avgt 10 8558.592 8606.225 8551.063 8578.587 9013.324 9008.85 42.064 ns/op 0.998 1.003 1.051
o.o.b.java.lang.FPComparison.isNanDouble avgt 10 6111.541 6104.05 6098.749 5557.46 7814.096 7809.632 9.636 ns/op 1.1 1.098 1.406
o.o.b.java.lang.FPComparison.isNanFloat avgt 10 6046.987 6082.279 6024.986 5342.567 7721.771 7725.586 31.425 ns/op 1.132 1.138 1.445
o.o.b.java.lang.FPComparison.lessDouble avgt 10 6640.937 6763.582 6562.656 6599.726 6648.382 6585.565 24.238 ns/op 1.006 1.025 1.007
o.o.b.java.lang.FPComparison.lessDoubleResLong avgt 10 6995.677 7141.626 7034.113 7051.956 7031.507 7053.338 11.337 ns/op 0.992 1.013 0.997
o.o.b.java.lang.FPComparison.lessEqualDouble avgt 10 6946.535 6862.784 6942.766 6565.301 8851.829 8844.927 118.823 ns/op 1.058 1.045 1.348
o.o.b.java.lang.FPComparison.lessEqualDoubleResLong avgt 10 7479.341 7298.585 7443.018 7403.882 9497.064 9489.129 83.851 ns/op 1.01 0.986 1.283
o.o.b.java.lang.FPComparison.lessEqualFloat avgt 10 6833.53 6896.586 6828.014 6222.209 8366.64 8365.264 29.59 ns/op 1.098 1.108 1.345
o.o.b.java.lang.FPComparison.lessEqualFloatResLong avgt 10 7557.333 7447.031 7567.908 7273.652 9708.117 9668.452 46.145 ns/op 1.039 1.024 1.335
o.o.b.java.lang.FPComparison.lessFloat avgt 10 6575.524 6573.098 6718.588 6566.502 6572.681 6577.016 30.16 ns/op 1.001 1.001 1.001
o.o.b.java.lang.FPComparison.lessFloatResLong avgt 10 7124.032 7057.012 7056.314 7057.596 7017.617 7035.338 51.102 ns/op 1.009 1 0.994
o.o.b.java.lang.IntegerComparison.equalInteger avgt 10 2977.815 2930.94 2756.197 2916.888 2928.148 2755.655 33.693 ns/op 1.021 1.005 1.004
o.o.b.java.lang.IntegerComparison.greaterEqualInteger avgt 10 4334.763 4268.061 4316.77 3545.193 4281.587 4300.772 23.535 ns/op 1.223 1.204 1.208
o.o.b.java.lang.IntegerComparison.greaterEqualIntegerResLong avgt 10 4424.988 4590.648 4363.45 4674.027 5806.324 5809.183 37.389 ns/op 0.947 0.982 1.242
o.o.b.java.lang.IntegerComparison.greaterInteger avgt 10 4381.518 4295.115 4454.828 3464.863 4296.909 4288.919 17.829 ns/op 1.265 1.24 1.24
o.o.b.java.lang.IntegerComparison.lessEqualInteger avgt 10 4235.303 4177.298 4255.521 3456.853 4285.183 4285.157 27.386 ns/op 1.225 1.208 1.24
o.o.b.java.lang.IntegerComparison.lessEqualIntegerResLong avgt 10 4192.465 4562.188 4763.328 4316.861 5185.214 5164.703 45.798 ns/op 0.971 1.057 1.201
o.o.b.java.lang.IntegerComparison.lessInteger avgt 10 4242.534 4412.681 4178.328 3293.75 4280.072 4280.075 36.882 ns/op 1.288 1.34 1.299
o.o.b.java.lang.IntegerComparison.notEqualInteger avgt 10 2924.391 2931.704 2756.555 2756.098 2757.606 2922.638 2.542 ns/op 1.061 1.064 1.001
o.o.b.java.lang.IntegerComparison.notEqualIntegerResLong avgt 10 3949.056 3889.019 3914.821 2395.163 3904.716 3905.64 48.017 ns/op 1.649 1.624 1.63
o.o.b.java.lang.LongComparison.equalLong avgt 10 2282.733 2928.679 2282.324 2928.452 2933.275 2979.228 4.929 ns/op 0.78 1 1.002
o.o.b.java.lang.LongComparison.greaterEqualLong avgt 10 4153.027 4223.16 4184.996 4026.854 5001.524 4998.017 19.336 ns/op 1.031 1.049 1.242
o.o.b.java.lang.LongComparison.greaterEqualLongResLong avgt 10 4737.485 4805.149 4504.361 4580.053 5813.558 5801.765 24.003 ns/op 1.034 1.049 1.269
o.o.b.java.lang.LongComparison.greaterLong avgt 10 4304.267 4230.545 4128.306 3873.584 4999.937 4996.964 64.775 ns/op 1.111 1.092 1.291
o.o.b.java.lang.LongComparison.lessEqualLong avgt 10 4122.308 4290.472 4140.419 3903.772 4999.428 5001.281 35.468 ns/op 1.056 1.099 1.281
o.o.b.java.lang.LongComparison.lessEqualLongResLong avgt 10 4844.578 5051.687 4756.432 3822.864 5160.452 5163.626 34.126 ns/op 1.267 1.321 1.35
o.o.b.java.lang.LongComparison.lessLong avgt 10 4187.81 4383.717 4185.369 4013.66 5001.902 5024.348 59.811 ns/op 1.043 1.092 1.246
o.o.b.java.lang.LongComparison.notEqualLong avgt 10 2928.515 2930.576 2929.512 2941.967 2281.817 2928.116 4.607 ns/op 0.995 0.996 0.776
o.o.b.java.lang.LongComparison.notEqualLongResLong avgt 10 2317.146 4044.612 4041.729 2294.701 4046.609 2307.176 93.16 ns/op 1.01 1.763 1.763
o.o.b.java.lang.PointerComparison.equalObject avgt 10 2289.141 2276.513 2294.071 2278.868 2276.323 2358.348 29.932 ns/op 1.005 0.999 0.999
o.o.b.java.lang.PointerComparison.notEqualObject avgt 10 2278.075 2329.446 2277.318 2327.955 2276.14 2276.485 6.149 ns/op 0.979 1.001 0.978
o.o.b.java.lang.PointerComparison.notEqualObjectResLong avgt 10 2296.779 2308.352 2367.808 2309.312 2325.761 2323.996 15.061 ns/op 0.995 1 1.007

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Error

 ⚠️ Found copyright format issue for oracle in [test/hotspot/jtreg/compiler/vectorapi/TestVectorTest.java]

Issues

  • JDK-8352504: RISC-V: implement and enable CMoveI/L (Enhancement - P4)
  • JDK-8346786: RISC-V: Reconsider ConditionalMoveLimit when adding conditional move (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24490/head:pull/24490
$ git checkout pull/24490

Update a local copy of the PR:
$ git checkout pull/24490
$ git pull https://git.openjdk.org/jdk.git pull/24490/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 24490

View PR using the GUI difftool:
$ git pr show -t 24490

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24490.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Apr 7, 2025

👋 Welcome back mli! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Apr 7, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 7, 2025
@openjdk
Copy link

openjdk bot commented Apr 7, 2025

@Hamlin-Li The following labels will be automatically applied to this pull request:

  • core-libs
  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@mlbridge
Copy link

mlbridge bot commented Apr 7, 2025

Copy link
Member

@RealFYang RealFYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @Hamlin-Li , Thanks for looking at this part. I once created JBS https://bugs.openjdk.org/browse/JDK-8346786 about ConditionalMoveLimit on RISC-V. I have two questions after a cursory look.

@Hamlin-Li
Copy link
Author

/solves JDK-8346786

@openjdk
Copy link

openjdk bot commented Apr 8, 2025

@Hamlin-Li
Adding additional issue to solves list: 8346786: RISC-V: Reconsider ConditionalMoveLimit when adding conditional move.

@luhenry
Copy link
Member

luhenry commented Apr 8, 2025

the reason is the generated code by Zicond is much larger than branch version

I'm curious about this one. It's surprising to me that we see bigger code generated with Zicond. Do you know why that is the case?

@Hamlin-Li
Copy link
Author

because zicond code is bigger, e.g.

void MacroAssembler::cmov_eq(Register cmp1, Register cmp2, Register dst, Register src) {
  if (UseZicond) {
    xorr(t0, cmp1, cmp2);
    czero_eqz(dst, dst, t0);
    czero_nez(t1 , src, t0);
    orr(dst, dst, t1);
    return;
  }
  Label no_set;
  bne(cmp1, cmp2, no_set);
  mv(dst, src);
  bind(no_set);
}

@openjdk
Copy link

openjdk bot commented Apr 10, 2025

@Hamlin-Li this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout cmoveil-v1
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Apr 10, 2025
@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Apr 10, 2025
Copy link
Member

@RealFYang RealFYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates. Several minor comments remain :-)

Copy link
Member

@feilongjiang feilongjiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks!

Copy link
Member

@RealFYang RealFYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated change LGTM. Thanks.

@Hamlin-Li
Copy link
Author

Thank you @feilongjiang @RealFYang !

@Hamlin-Li
Copy link
Author

/integrate

@openjdk
Copy link

openjdk bot commented Apr 15, 2025

@Hamlin-Li This pull request has not yet been marked as ready for integration.

@@ -1,5 +1,5 @@
/*
* Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2022, 2025,Oracle and/or its affiliates. All rights reserved.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The jcheck is complaining about a copyright format issue here. There should be a space after 2025,.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants