multi-rails hang io #10420
Comments
@ivanallen can you pls post the output with UCX_PROTO_INFO=y and multi-rail?
@yosefe Hi, for a detailed log with UCX_LOG_LEVEL=data, see ucx.log.txt.
Can you pls try adding UCX_IB_ROCE_LOCAL_SUBNET=y?
@yosefe When I added this configuration, the client crashed.
@ivanallen can you pls set the same UCX env vars for the server and the client?
In this environment, I have executed
server:
client:
@ivanallen can you pls try without the "-e" parameter?
@yosefe It works! Why?
Describe the bug
When I use multi-rail, ucx_perf hangs. When ucx_perf is run with 'UCX_NET_DEVICES=mlx5_bond_0:1' or 'UCX_NET_DEVICES=mlx5_bond_1:1' alone, it works properly.
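For illustration, the three device selections being compared could be written out as below. This is only a sketch: the device names come from the report, but the comma-separated multi-rail list is an assumption about how both rails were selected, and the script prints the settings rather than launching the benchmark because the exact command lines are not given here.

```shell
#!/bin/sh
# Sketch only: device names are taken from the report; the comma-separated
# multi-rail value is an assumption about how both rails were selected.
SINGLE_RAIL_A="mlx5_bond_0:1"
SINGLE_RAIL_B="mlx5_bond_1:1"
MULTI_RAIL="${SINGLE_RAIL_A},${SINGLE_RAIL_B}"

# Print the environment setting for each case instead of running the
# benchmark, since its arguments are not part of the report.
for devices in "$SINGLE_RAIL_A" "$SINGLE_RAIL_B" "$MULTI_RAIL"; do
    echo "UCX_NET_DEVICES=${devices}"
done
```

The first two settings correspond to the single-device runs reported as working; the last one stands in for the multi-rail run that hangs.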
Steps to Reproduce
Setup and versions
MLNX_OFED_LINUX-5.8-4.1.5.0
ibstat or ibv_devinfo -vv command
Additional information (depending on the issue)
client
server
client log:
ucx.log.txt