Is the node running on a self-built kuscia image? #432

Open
guanglaiguo opened this issue Sep 25, 2024 · 6 comments
@guanglaiguo

Issue Type

Others

Search for existing issues similar to yours

Yes

Kuscia Version

kuscia 0.10.0b0

Link to Relevant Documentation

https://www.secretflow.org.cn/zh-CN/docs/kuscia/v0.11.0b0/deployment/Docker_deployment_kuscia/deploy_p2p_cn#alice

Question Details

Following the official guide for building a kuscia image yourself (https://www.secretflow.org.cn/zh-CN/docs/kuscia/v0.11.0b0/development/build_kuscia_cn), I built a kuscia image with make image (secretflow/kuscia:v0.9.0.dev240508-13-g68280d4-20240919180037). To verify that the build succeeded, I then deployed the alice node from this image (https://www.secretflow.org.cn/zh-CN/docs/kuscia/v0.11.0b0/deployment/Docker_deployment_kuscia/deploy_p2p_cn), running the following commands in order:
export KUSCIA_IMAGE=secretflow/kuscia:v0.9.0.dev240508-13-g68280d4-20240919180037
export SECRETFLOW_IMAGE=secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/secretflow-lite-anolis8:1.7.0b0
docker run --rm $KUSCIA_IMAGE cat /home/kuscia/scripts/deploy/kuscia.sh > kuscia.sh && chmod u+x kuscia.sh
docker run -it --rm ${KUSCIA_IMAGE} kuscia init --mode autonomy --domain "alice" > autonomy_alice.yaml 2>&1 || cat autonomy_alice.yaml
./kuscia.sh start -c autonomy_alice.yaml -p 11080 -k 11081
The following log was then printed:
KUSCIA_IMAGE=secretflow/kuscia:v0.9.0.dev240508-13-g68280d4-20240919180037
SECRETFLOW_IMAGE=secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/secretflow-lite-anolis8:1.7.0b0
DATAPROXY_IMAGE=secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/dataproxy:0.1.0b1
ROOT=/
DOMAIN_ID=alice
DOMAIN_WORK_DIR=//root-kuscia-autonomy-alice
DOMAIN_LOG_DIR=//root-kuscia-autonomy-alice/logs
DOMAIN_DATA_DIR=//root-kuscia-autonomy-alice/data
DOMAIN_K3S_DB_DIR=//root-kuscia-autonomy-alice/k3s
DOMAIN_HOST_PORT=11080
DOMAIN_HOST_INTERNAL_PORT=13081
KUSCIAAPI_HTTP_PORT=11081
KUSCIAAPI_GRPC_PORT=13083
METRICS_PORT=13084
Starting container root-kuscia-autonomy-alice ...
k3s data already exists //root-kuscia-autonomy-alice/k3s...
Whether to retain k3s data?(y/n): y
root-kuscia-autonomy-alice-containerd
domain_hostname=root-kuscia-autonomy-alice-localhost-localdomain
network=kuscia-exchange
2dba37c5d7736bf2b9e36c6ceee88c2653ef49ba7ca559d657730dd8bbf4e6e4
Probe datamesh successfully
Image 'secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/secretflow-lite-anolis8:1.7.0b0' already exists in container root-kuscia-autonomy-alice
appimage.kuscia.secretflow/secretflow-image unchanged
appimage.kuscia.secretflow/secretflow-nsjail-image unchanged
Create secretflow app image done
Found the engine image 'secretflow/kuscia:v0.9.0.dev240508-13-g68280d4-20240919180037' on host
Start importing image 'secretflow/kuscia:v0.9.0.dev240508-13-g68280d4-20240919180037' Please be patient...
error: secretflow/kuscia:v0.9.0.dev240508-13-g68280d4-20240919180037 import failed
appimage.kuscia.secretflow/diagnose-image configured
Create diagnose app image done
autonomy domain 'alice' deployed successfully

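The log above shows that importing the kuscia image failed, yet the alice node still appears to have started successfully. I am not sure whether alice is actually running on my self-built kuscia image.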
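
Because the last line of the deploy output reports success even when an earlier step failed, it can help to scan the captured output for failure lines. The following is a minimal sketch, assuming the output was saved to a file; the function name `find_deploy_errors` and the file name `deploy.log` are hypothetical, and the patterns are taken only from the log shown above, so other failure messages would not be caught.

```shell
# Hypothetical helper (not part of kuscia.sh): read deploy output on stdin and
# print, with line numbers, every line that starts with "error:" or contains
# "import failed". Exit status follows grep: 0 if a match was found, 1 otherwise.
find_deploy_errors() {
  grep -n -i -E '^error:|import failed'
}

# Example usage (deploy.log is an assumed file name):
#   ./kuscia.sh start -c autonomy_alice.yaml -p 11080 -k 11081 2>&1 | tee deploy.log
#   find_deploy_errors < deploy.log
```

A non-empty result only shows that a step reported a failure; whether the failure matters still has to be checked against the running container.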
@zimu-yuxi

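1. Run docker ps to check whether the container started normally. If it did, enter the kuscia container and run kuscia -v to see the version number.
2. If it did not start normally, try running the uninstall.sh script first to uninstall, then run sh -x kuscia.sh start -c autonomy_alice.yaml -p 11080 -k 11081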

@guanglaiguo

guanglaiguo commented Sep 25, 2024

@zimu-yuxi The container starts. The kuscia version is as follows:
[root@localhost /]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
2dba37c5d773 secretflow/kuscia:v0.9.0.dev240508-13-g68280d4-20240919180037 "tini -- bin/kuscia …" 3 hours ago Up 4 minutes 0.0.0.0:13081->80/tcp, :::13081->80/tcp, 0.0.0.0:11080->1080/tcp, :::11080->1080/tcp, 0.0.0.0:11081->8082/tcp, :::11081->8082/tcp, 0.0.0.0:13083->8083/tcp, :::13083->8083/tcp, 0.0.0.0:13084->9091/tcp, :::13084->9091/tcp root-kuscia-autonomy-alice
3a8024334b4a moby/buildkit:buildx-stable-1 "buildkitd --allow-i…" 21 hours ago Up 4 minutes buildx_buildkit_kuscia0
[root@localhost /]# docker exec -it 2dba37c5d773 /bin/bash
bash-5.2# kuscia -v
kuscia version v0.9.0.dev240508-13-g68280d4

@zimu-yuxi

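Judging by the version number, this is the image you built yourself, so the error probably has no real impact. We would like to know which branch of the source code the image was built from.
In addition, you could try deploying another node to see whether the same problem appears, then set up routing and try running a task.
If you run into problems, please keep reporting them; we will continue to follow this issue.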
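
The comparison described above (the version printed by kuscia -v against the tag of the self-built image) can be sketched as a small shell check. This is only an illustration: the function name `version_matches_image` is hypothetical, and it assumes the image tag begins with the version string, as it does for the image in this thread.

```shell
# Hypothetical helper: succeed if the tag of an image reference starts with
# the version reported by `kuscia -v`.
#   $1: image reference, e.g. secretflow/kuscia:<tag>
#   $2: one line of `kuscia -v` output, e.g. "kuscia version <version>"
# Assumption: the reference contains a tag after its last colon.
version_matches_image() {
  local image version_line version tag
  image="$1"
  version_line="$2"
  version="${version_line##* }"   # last space-separated field
  tag="${image##*:}"              # text after the last colon
  [ -n "$version" ] || return 1
  case "$tag" in
    "$version"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage against the running container (name taken from the log above).
# `docker exec` is run without -t so the output carries no trailing carriage return.
#   image=$(docker inspect --format '{{.Config.Image}}' root-kuscia-autonomy-alice)
#   version_line=$(docker exec root-kuscia-autonomy-alice kuscia -v)
#   version_matches_image "$image" "$version_line" && echo "running the self-built image"
```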

@zimu-yuxi zimu-yuxi self-assigned this Sep 25, 2024
@guanglaiguo

guanglaiguo commented Sep 25, 2024

@zimu-yuxi I downloaded the source code from here:
(Screenshot Snipaste_2024-09-25_14-43-45.png; the image upload did not complete.)

@zimu-yuxi

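Thanks! You can try deploying another node and then run a task to see whether any problems occur. We suggest not naming it bob; try a custom name instead, for example: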
docker run -it --rm ${KUSCIA_IMAGE} kuscia init --mode autonomy --domain "bob-test" > autonomy_bob-test.yaml 2>&1 || cat autonomy_bob-test.yaml
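
For a second node, the same two steps used for alice (generate the config, then start the node) apply with a different domain name and different host ports. The sketch below only prints the commands so they can be reviewed before running. The function name `second_node_commands` is hypothetical, and the ports 21080 and 21081 in the usage line are assumed placeholders; any free host ports that do not collide with the ones alice already uses would do.

```shell
# Hypothetical helper: print the init and start commands for an autonomy node.
#   $1: domain name, $2: host port for -p, $3: host port for -k
# The commands mirror the ones used for alice earlier in this thread.
second_node_commands() {
  local domain host_port api_port config
  domain="$1"
  host_port="$2"
  api_port="$3"
  config="autonomy_${domain}.yaml"
  printf '%s\n' \
    "docker run -it --rm \${KUSCIA_IMAGE} kuscia init --mode autonomy --domain \"${domain}\" > ${config} 2>&1 || cat ${config}" \
    "./kuscia.sh start -c ${config} -p ${host_port} -k ${api_port}"
}

# Example usage (ports are assumptions, not values from this thread):
#   second_node_commands bob-test 21080 21081
```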

@guanglaiguo

@zimu-yuxi Thank you very much. I will ask again if I run into further problems.
