[BUG] docker logs keeps reporting that Triton is starting #36

Open
2 tasks done
misslxs opened this issue Jan 19, 2024 · 5 comments

Comments

@misslxs

misslxs commented Jan 19, 2024

Is there an existing issue / discussion for this?

  • I have searched the existing issues / discussions

Is there an existing answer for this in FAQ?

  • I have searched FAQ

Current Behavior

Judging from the logs, all services started successfully, but the readiness check (curl -s -w "%{http_code}" http://localhost:10000/v2/health/ready -o /dev/null) never passes. After the check times out and the container stops, the log file /model_repos/QAEnsemble_base/QAEnsemble_base.log does not exist either.
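
For anyone reproducing this outside the startup script, the readiness check above can be approximated in Python. This is only a sketch, not the project's own script: the function names, the retry count, and the interval are assumptions chosen for illustration. It mirrors curl's behaviour of reporting 000 when no HTTP response arrives by returning 0 in that case, which helps tell "Triton is not listening at all" apart from "Triton answers but is not ready".

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout_s=2.0):
    """Return the HTTP status code for url, or 0 if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, timeout, name resolution failure, ...


def wait_until_ready(url, attempts=30, interval_s=2.0,
                     probe=http_status, sleep=time.sleep):
    """Poll url until probe returns 200.

    Returns True as soon as a probe succeeds, False once all attempts are used.
    probe and sleep are injectable so the loop can be exercised without a server.
    """
    for attempt in range(attempts):
        if probe(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(interval_s)
    return False
```

Calling wait_until_ready("http://localhost:10000/v2/health/ready") inside the container would then return False if the endpoint never reports 200 within the (assumed) 30 attempts, at which point the Triton log is the next place to look.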

[Screenshot: iShot_2024-01-19_09 30 13]

Expected Behavior

No response

Environment

- OS: Ubuntu 22.04 x86
- NVIDIA Driver: 535.146.02
- CUDA: 12.2
- Docker Compose: v2.24.0-birthday.10
- NVIDIA GPU Memory: 16GB

QAnything logs

root@f1376869a3c5:/workspace/qanything_local# cat api.log
UPLOAD_ROOT_PATH: /workspace/qanything_local/QANY_DB/content
rerank_port: 10001
embed_port: 10001
[2024-01-19 09:56:17 +0800] [91] [INFO] Sanic v23.6.0
[2024-01-19 09:56:17 +0800] [91] [INFO] Goin' Fast @ http://0.0.0.0:8777
[2024-01-19 09:56:17 +0800] [91] [INFO] mode: production, w/ 4 workers
[2024-01-19 09:56:17 +0800] [91] [INFO] server: sanic, HTTP/1.1
[2024-01-19 09:56:17 +0800] [91] [INFO] python: 3.10.12
[2024-01-19 09:56:17 +0800] [91] [INFO] platform: Linux-6.5.0-14-generic-x86_64-with-glibc2.35
[2024-01-19 09:56:17 +0800] [91] [INFO] packages: sanic-routing==23.12.0, sanic-ext==23.6.0
UPLOAD_ROOT_PATH: /workspace/qanything_local/QANY_DB/content
rerank_port: 10001
embed_port: 10001
[2024-01-19 09:56:27 +0800] [658] [INFO] Sanic Extensions:
[2024-01-19 09:56:27 +0800] [658] [INFO] > injection [0 dependencies; 0 constants]
[2024-01-19 09:56:27 +0800] [658] [INFO] > openapi [http://0.0.0.0:8777/docs]
[2024-01-19 09:56:27 +0800] [658] [INFO] > http
[2024-01-19 09:56:27 +0800] [658] [INFO] > templating [jinja2==3.1.3]
UPLOAD_ROOT_PATH: /workspace/qanything_local/QANY_DB/content
rerank_port: 10001
embed_port: 10001
[2024-01-19 09:56:27 +0800] [657] [INFO] Sanic Extensions:
[2024-01-19 09:56:27 +0800] [657] [INFO] > injection [0 dependencies; 0 constants]
[2024-01-19 09:56:27 +0800] [657] [INFO] > openapi [http://0.0.0.0:8777/docs]
[2024-01-19 09:56:27 +0800] [657] [INFO] > http
[2024-01-19 09:56:27 +0800] [657] [INFO] > templating [jinja2==3.1.3]
UPLOAD_ROOT_PATH: /workspace/qanything_local/QANY_DB/content
rerank_port: 10001
embed_port: 10001
[2024-01-19 09:56:27 +0800] [659] [INFO] Sanic Extensions:
[2024-01-19 09:56:27 +0800] [659] [INFO] > injection [0 dependencies; 0 constants]
[2024-01-19 09:56:27 +0800] [659] [INFO] > openapi [http://0.0.0.0:8777/docs]
[2024-01-19 09:56:27 +0800] [659] [INFO] > http
[2024-01-19 09:56:27 +0800] [659] [INFO] > templating [jinja2==3.1.3]
init local_doc_qa in local
init local_doc_qa in local
UPLOAD_ROOT_PATH: /workspace/qanything_local/QANY_DB/content
rerank_port: 10001
embed_port: 10001
[2024-01-19 09:56:27 +0800] [660] [INFO] Sanic Extensions:
[2024-01-19 09:56:27 +0800] [660] [INFO] > injection [0 dependencies; 0 constants]
[2024-01-19 09:56:27 +0800] [660] [INFO] > openapi [http://0.0.0.0:8777/docs]
[2024-01-19 09:56:27 +0800] [660] [INFO] > http
[2024-01-19 09:56:27 +0800] [660] [INFO] > templating [jinja2==3.1.3]
init local_doc_qa in local
init local_doc_qa in local
[2024-01-19 09:56:27 +0800] [658] [INFO] Starting worker [658]
[2024-01-19 09:56:27 +0800] [657] [INFO] Starting worker [657]
[2024-01-19 09:56:27 +0800] [659] [INFO] Starting worker [659]
[2024-01-19 09:56:27 +0800] [660] [INFO] Starting worker [660]

Steps To Reproduce

No response

Anything else?

No response

@misslxs misslxs changed the title from "[BUG] <title>" to "[BUG] docker logs keeps reporting that Triton is starting" on Jan 19, 2024
@YinSonglin1997

To add to this: I have the same problem as the original poster. Here is my QAEnsemble.log.
I0119 02:05:18.197207 86 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f9e5c000000' with size 268435456
I0119 02:05:18.201188 86 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0119 02:05:18.208520 86 model_lifecycle.cc:462] loading: rerank:1
I0119 02:05:18.208561 86 model_lifecycle.cc:462] loading: embed:1
I0119 02:05:18.208588 86 model_lifecycle.cc:462] loading: base:1
I0119 02:05:18.211636 86 onnxruntime.cc:2504] TRITONBACKEND_Initialize: onnxruntime
I0119 02:05:18.211702 86 onnxruntime.cc:2514] Triton TRITONBACKEND API version: 1.12
I0119 02:05:18.211721 86 onnxruntime.cc:2520] 'onnxruntime' TRITONBACKEND API version: 1.12
I0119 02:05:18.211736 86 onnxruntime.cc:2550] backend configuration:
{"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}}
I0119 02:05:18.277019 86 onnxruntime.cc:2608] TRITONBACKEND_ModelInitialize: rerank (version 1)
I0119 02:05:18.277589 86 onnxruntime.cc:2608] TRITONBACKEND_ModelInitialize: embed (version 1)
I0119 02:05:18.277767 86 onnxruntime.cc:666] skipping model configuration auto-complete for 'rerank': inputs and outputs already specified
I0119 02:05:18.278371 86 onnxruntime.cc:2651] TRITONBACKEND_ModelInstanceInitialize: rerank (GPU device 0)
I0119 02:05:18.278735 86 onnxruntime.cc:666] skipping model configuration auto-complete for 'embed': inputs and outputs already specified
I0119 02:05:18.280363 86 onnxruntime.cc:2651] TRITONBACKEND_ModelInstanceInitialize: embed (GPU device 0)
I0119 02:05:18.758885 86 libfastertransformer.cc:459] Before Loading Weights:
terminate called after throwing an instance of 'std::length_error'
what(): basic_string::_M_create
[d46a4f8365f8:00086] *** Process received signal ***
[d46a4f8365f8:00086] Signal: Aborted (6)
[d46a4f8365f8:00086] Signal code: (-6)
[d46a4f8365f8:00086] [ 0] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f9eab095520]
[d46a4f8365f8:00086] [ 1] /usr/lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7f9eab0e99fc]
[d46a4f8365f8:00086] [ 2] /usr/lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7f9eab095476]
[d46a4f8365f8:00086] [ 3] /usr/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7f9eab07b7f3]
[d46a4f8365f8:00086] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xa2b9e)[0x7f9eab31db9e]
[d46a4f8365f8:00086] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xae20c)[0x7f9eab32920c]
[d46a4f8365f8:00086] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xae277)[0x7f9eab329277]
[d46a4f8365f8:00086] [ 7] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xae4d8)[0x7f9eab3294d8]
[d46a4f8365f8:00086] [ 8] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZSt20__throw_length_errorPKc+0x40)[0x7f9eab320449]
[d46a4f8365f8:00086] [ 9] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x14bc69)[0x7f9eab3c6c69]
[d46a4f8365f8:00086] [10] /opt/tritonserver/backends/qa_ensemble/libqa-shared.so(+0xa6ba3c)[0x7f9e1dbf2a3c]
[d46a4f8365f8:00086] [11] /opt/tritonserver/backends/qa_ensemble/libqa-shared.so(_ZN17fastertransformer21loadWeightFromBinFuncI6__halfS1_EEiPT_St6vectorImSaImEENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x187)[0x7f9e1dc0b227]
[d46a4f8365f8:00086] [12] /opt/tritonserver/backends/qa_ensemble/libqa-shared.so(_ZN17fastertransformer17loadWeightFromBinI6__halfEEiPT_St6vectorImSaImEENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_14FtCudaDataTypeE+0x282)[0x7f9e1dc0ed12]
[d46a4f8365f8:00086] [13] /opt/tritonserver/backends/qa_ensemble/libqa-shared.so(_ZN17fastertransformer11LlamaWeightI6__halfE16loadEncryptModelENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x184)[0x7f9e1d7cb0b4]
[d46a4f8365f8:00086] [14] /opt/tritonserver/backends/qa_ensemble/libqa-shared.so(_ZN16LlamaTritonModelI6__halfE19createSharedWeightsEii+0x2ad)[0x7f9e1d7b219d]
[d46a4f8365f8:00086] [15] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xdc253)[0x7f9eab357253]
[d46a4f8365f8:00086] [16] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x94ac3)[0x7f9eab0e7ac3]
[d46a4f8365f8:00086] [17] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x126660)[0x7f9eab179660]
[d46a4f8365f8:00086] *** End of error message ***

@yydxlv

yydxlv commented Jan 19, 2024

The Triton service also fails to start for me. Inside the container, /model_repos/QAEnsemble_base/QAEnsemble_base.log contains: nohup: failed to run command '/opt/tritonserver/bin/tritonserver': No such file or directory

@xixihahaliu
Collaborator

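> The Triton service also fails to start for me. Inside the container, /model_repos/QAEnsemble_base/QAEnsemble_base.log contains: nohup: failed to run command '/opt/tritonserver/bin/tritonserver': No such file or directory

Could you post the complete log file? That would make troubleshooting easier. Also, take a look at FAQ_zh.md, which may help.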

@xixihahaliu
Collaborator

> To add to this: I have the same problem as the original poster. Here is my QAEnsemble.log.
> [...]
> I0119 02:05:18.758885 86 libfastertransformer.cc:459] Before Loading Weights:
> terminate called after throwing an instance of 'std::length_error'
> what(): basic_string::_M_create
> [d46a4f8365f8:00086] *** Process received signal ***
> [d46a4f8365f8:00086] Signal: Aborted (6)
> [...]

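  • Cause 2: If GPU memory turns out to be sufficient, the failure is due to the new model version being incompatible with some GPU models.
  • Solution: Switch to a compatible model and image. Manually download the model files, unzip them, replace the models directory, and then restart the service.
    • In docker-compose-xxx.yaml, change freeren/qanyxxx:v1.0.9 to freeren/qanyxxx:v1.0.8
    • git clone https://www.wisemodel.cn/Netease_Youdao/qanything.git
    • cd qanything
    • git reset --hard 79b3da3bbb35406f0b2da3acfcdb4c96c2837faf
    • unzip models.zip
    • Replace the existing models directory

You can try the solution above. Also, some GPU models do not support the current model, so please check in advance. Provided there is enough GPU memory, the GPUs confirmed to be supported so far are the NVIDIA 2080 Ti, 30 series, 40 series, A30, A40, and A100.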
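
The first step of the workaround (pinning the image from v1.0.9 back to v1.0.8) can also be done programmatically. The sketch below is illustrative only: the function name is made up, and the pattern deliberately matches any image under freeren/ because the FAQ excerpt elides the exact image name. It operates on the text of a compose file and leaves everything else untouched; the git and unzip steps still have to be run as listed.

```python
import re

# Matches "freeren/<name>:v1.0.9". <name> is kept generic on purpose, since the
# exact image name is elided in the thread. The lookahead avoids touching longer
# tags such as a hypothetical "v1.0.9.1" or "v1.0.90".
_IMAGE_TAG = re.compile(r"(freeren/[\w.\-]+):v1\.0\.9(?![\w.])")


def pin_to_v1_0_8(compose_text):
    """Return compose_text with every freeren/*:v1.0.9 image switched to v1.0.8."""
    return _IMAGE_TAG.sub(r"\1:v1.0.8", compose_text)
```

To apply it, read the compose file into a string, pass it through pin_to_v1_0_8, and write the result back, keeping a backup of the original file.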

@xixihahaliu
Collaborator

> Judging from the logs, all services started successfully, but the readiness check (curl -s -w "%{http_code}" http://localhost:10000/v2/health/ready -o /dev/null) never passes. After the check times out and the container stops, the log file /model_repos/QAEnsemble_base/QAEnsemble_base.log does not exist either.

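Currently, the log file location differs between single-GPU and dual-GPU startup, because in single-GPU mode the multiple tritonserver services are started together, which saves GPU memory. From what I can see, you started in single-GPU mode, so please post the full contents of /model_repos/QAEnsemble/QAEnsemble.log; it should contain more information.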
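
Since the log can be in either location, a small helper can check both paths and pull out the lines that matter. This is a sketch under assumptions: the two paths are the ones mentioned in this thread (that QAEnsemble_base belongs to the dual-GPU layout is inferred, not confirmed), the marker strings are taken from the failures quoted above, and the function names are hypothetical.

```python
from pathlib import Path

# Log locations mentioned in this thread; which one exists depends on the
# startup mode. Associating QAEnsemble_base with dual-GPU startup is an assumption.
CANDIDATE_LOGS = (
    "/model_repos/QAEnsemble/QAEnsemble.log",
    "/model_repos/QAEnsemble_base/QAEnsemble_base.log",
)

# Substrings taken from the failures quoted in this thread.
FATAL_MARKERS = (
    "terminate called",
    "Signal: Aborted",
    "No such file or directory",
)


def fatal_lines(log_text, markers=FATAL_MARKERS):
    """Return the lines of log_text that contain any of the markers."""
    return [line for line in log_text.splitlines()
            if any(marker in line for marker in markers)]


def scan_logs(paths=CANDIDATE_LOGS):
    """Map each existing log path to its suspicious lines; missing files are skipped."""
    report = {}
    for name in paths:
        path = Path(name)
        if path.is_file():
            report[name] = fatal_lines(path.read_text(errors="replace"))
    return report
```

Running scan_logs() inside the container returns an empty dict when neither file exists, which is itself a hint that tritonserver never got far enough to write a log.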

4 participants