Kuscia returns an error when querying job status #391

Open
z00174311 opened this issue Jul 24, 2024 · 1 comment
Comments

@z00174311

Issue Type

Api Usage

Search for existing issues similar to yours

Yes

Kuscia Version

0.7.0b0

Link to Relevant Documentation

No response

Question Details

After creating a job on the alice side, I called the job query API to check the job status. It reported failed, with the following error:
"err_msg": "The remaining no-failed party task counts 1 are less than the threshold 2 that meets the conditions for task success. pending party[], running party[alice-partner], successful party[], failed party[bob-partner]",
How can I find the cause of the failure on the bob side?
@lanyy9527

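Hello. The log you provided shows that the job failed because the task on the bob side failed. You can troubleshoot with the following steps:

  1. Inside the kuscia container on the bob side, run kubectl get kt job-name -n cross-domain -o yaml to view the error information for the task;
  2. Check that the bob side has sufficient resources (memory, CPU, disk space);
  3. Use docker stats to check whether the memory limit of the bob-side kuscia container is greater than 6G; if it is not, adjust it with docker update --memory;
  4. Check whether the kuscia container holds a large number of pods in the error state; such pods can consume resources and should be cleaned up promptly.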
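
The checks above can be collected into a small shell sketch. This is only an illustration, not an official Kuscia tool: the helper names `run`, `failed_parties`, and `triage_party` are hypothetical, the container name and task name must be supplied by you, and it assumes `kubectl` can be invoked inside the kuscia container through `docker exec`. Setting `DRY_RUN=1` prints the commands instead of executing them.

```shell
# Triage sketch for a failed party. Helper names are illustrative assumptions.

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Extract the contents of "failed party[...]" from an err_msg string.
failed_parties() {
  printf '%s\n' "$1" | sed -n 's/.*failed party[[:space:]]*\[\([^]]*\)].*/\1/p'
}

# Usage: triage_party CONTAINER TASK
# CONTAINER is the bob-side kuscia container, TASK is the task name.
triage_party() {
  container="$1"
  task="$2"
  # Step 1: dump the task resource, including its status and error messages.
  run docker exec "$container" kubectl get kt "$task" -n cross-domain -o yaml
  # Steps 2-3: one-shot snapshot of the container's CPU and memory usage and limit.
  run docker stats --no-stream "$container"
  # Step 4: list pods in the Failed phase across all namespaces.
  run docker exec "$container" kubectl get pods -A --field-selector=status.phase=Failed
}
```

To see which party to investigate, pass the err_msg text to `failed_parties`. Then, on that party's host, call `triage_party` with the container name and task name; run it with `DRY_RUN=1` first to review the commands before executing them.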
