
[Connectivity] Deployed ollama + UI_TARS, how to use it #309

Open
supperdsj opened this issue Jan 22, 2025 · 24 comments

@supperdsj

Browser extension configuration:
MIDSCENE_USE_VLM_UI_TARS=1
OPENAI_BASE_URL=http://***:11434/
MIDSCENE_MODEL_NAME=hf.co/bytedance-research/UI-TARS-7B-gguf:latest
OPENAI_API_KEY=111

Entered in the browser extension's action box:
click '介绍'

Resulting error:
403 status code (no body)
Error: 403 status code (no body)
at e.generate (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:3090002)
at f.makeStatusError (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:3078477)
at f.makeRequest (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:3079593)
at async F (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4243928)
at async __commonJS.../midscene/dist/lib/chunk-CERQVVPJ.js.e.vlmPlanning (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4261672)
at async Object.executor (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4319942)
at async __commonJS.../midscene/dist/lib/index.js.e.Executor.flush (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4271106)
at async G.actionToGoal (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4321443)
at async Y.aiAction (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4326763)
at async Object.onClick (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:5609123)

@supperdsj
Author

Ran the ebay demo code with the following environment variables:
export MIDSCENE_USE_VLM_UI_TARS="1" && export OPENAI_BASE_URL="http://****:11434" && export OPENAI_API_KEY="1" && export MIDSCENE_MODEL_NAME="hf.co/bytedance-research/UI-TARS-7B-gguf:latest" && node midscene.js

Resulting error:
Error: 404 404 page not found
Error: 404 404 page not found
at APIError.generate (/Users/user/fe/server/node_modules/@midscene/core/node_modules/openai/error.js:54:20)
at OpenAI.makeStatusError (/Users/user/fe/server/node_modules/@midscene/core/node_modules/openai/core.js:275:33)
at OpenAI.makeRequest (/Users/user/fe/server/node_modules/@midscene/core/node_modules/openai/core.js:318:30)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async call (/Users/user/fe/server/node_modules/@midscene/core/dist/lib/chunk-CERQVVPJ.js:2253:20)
at async vlmPlanning (/Users/user/fe/server/node_modules/@midscene/core/dist/lib/chunk-CERQVVPJ.js:2647:15)
at async Object.executor (/Users/user/fe/server/node_modules/@midscene/web/dist/lib/puppeteer.js:966:28)
at async Executor.flush (/Users/user/fe/server/node_modules/@midscene/core/dist/lib/index.js:119:25)
at async PageTaskExecutor.actionToGoal (/Users/user/fe/server/node_modules/@midscene/web/dist/lib/puppeteer.js:1070:22)
at async PuppeteerAgent.aiAction (/Users/user/fe/server/node_modules/@midscene/web/dist/lib/puppeteer.js:1511:28)
at PuppeteerAgent.aiAction (/Users/user/fe/server/node_modules/@midscene/web/dist/lib/puppeteer.js:1518:15)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async /Users/user/fe/server/playground/midscene.js:29:5

Node.js v18.20.4

@yuyutaotao
Collaborator

Try appending /v1 to the URL: OPENAI_BASE_URL=http://***:11434/v1

If that still doesn't work, please leave a comment with your version numbers. (The browser extension version is shown at the very bottom of the sidebar.)
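The point of the suggestion above is that ollama exposes its OpenAI-compatible API under the /v1 path. A small hypothetical helper (not part of Midscene) to normalize the base URL before handing it to the SDK:

```javascript
// Hypothetical helper: ollama's OpenAI-compatible endpoints live under /v1,
// so normalize OPENAI_BASE_URL to end with exactly that path.
function toOpenAICompatibleBaseUrl(url) {
  const trimmed = url.replace(/\/+$/, ""); // drop trailing slashes
  return trimmed.endsWith("/v1") ? trimmed : `${trimmed}/v1`;
}

console.log(toOpenAICompatibleBaseUrl("http://127.0.0.1:11434/")); // → http://127.0.0.1:11434/v1
```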

@supperdsj
Author

Environment variables:
MIDSCENE_USE_VLM_UI_TARS=1
OPENAI_BASE_URL=http://127.0.0.1:11434/v1
MIDSCENE_MODEL_NAME=hf.co/bytedance-research/UI-TARS-7B-gguf:latest
OPENAI_API_KEY=111

Version:
Midscene.js Chrome Extension v0.24 (SDK v0.10.0)

Error:
403 status code (no body)
Error: 403 status code (no body)
at e.generate (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:3090002)
at f.makeStatusError (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:3078477)
at f.makeRequest (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:3079593)
at async F (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4243928)
at async e.vlmPlanning (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4261672)
at async Object.executor (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4319942)
at async e.Executor.flush (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4271106)
at async G.actionToGoal (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4321443)
at async Y.aiAction (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4326763)
at async Object.onClick (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:5609123)

@st01cs

st01cs commented Jan 23, 2025


Same problem here. Extension version:
Midscene.js Chrome Extension v0.24 (SDK v0.10.0)

Ran the connectivity-test; results:
❯ tests/connectivity.test.ts (4) 80535ms
❯ Use OpenAI SDK directly (2) 60040ms
× basic call with ui-tars:latest 30022ms
× image input with ui-tars:latest 30016ms
❯ Use Midscene wrapped OpenAI SDK (1) 20494ms
× call to get json object 20494ms
↓ Azure OpenAI Service by ADT Credential (1) [skipped]
↓ basic call [skipped]

ui-tars-desktop can connect to the ollama server, but after a few HTTP 200 responses it ends with an HTTP 500.

@yuyutaotao
Collaborator

@st01cs Is it deployed on a local ollama?

The connectivity-test problem is easy to fix: judging by the error, it's just a timeout. Search the code for timeout, set it to at least 240 * 1000, and try again.
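As a sketch of that change (assumption: the connectivity test builds its client with the openai Node SDK, which accepts a per-client `timeout` option in milliseconds):

```javascript
// Sketch: raise the per-request timeout for a slow local model.
// 240 * 1000 ms = 240 s, as suggested above.
const TIMEOUT_MS = 240 * 1000;

// Hypothetical client construction; the openai Node SDK takes `timeout` in ms:
// const openai = new OpenAI({
//   baseURL: process.env.OPENAI_BASE_URL,
//   apiKey: process.env.OPENAI_API_KEY,
//   timeout: TIMEOUT_MS,
// });

console.log(TIMEOUT_MS); // → 240000
```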

We are still trying to reproduce the browser extension issue.

@st01cs

st01cs commented Jan 23, 2025

After raising the timeout to 240 * 1000, ollama returns 500 in both tests:

Error: 500 an error was encountered while running the model: GGML_ASSERT(sections.v[0] > 0 || sections.v[1] > 0 || sections.v[2] > 0) failed
❯ Function.generate node_modules/openai/src/error.ts:98:14
❯ OpenAI.makeStatusError node_modules/openai/src/core.ts:397:21
❯ OpenAI.makeRequest node_modules/openai/src/core.ts:460:24
❯ tests/connectivity.test.ts:38:22
36| baseURL: process.env.OPENAI_BASE_URL,
37| });
38| const response = await openai.chat.completions.create({
| ^
39| model: model,
40| messages: [{ role: "user", content: "Hello, how are you?" }],

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[1/2]⎯

FAIL tests/connectivity.test.ts > Use Midscene wrapped OpenAI SDK > call to get json object
Error: 500 an error was encountered while running the model: GGML_ASSERT(sections.v[0] > 0 || sections.v[1] > 0 || sections.v[2] > 0) failed
❯ Function.generate node_modules/openai/src/error.ts:98:14
❯ OpenAI.makeStatusError node_modules/openai/src/core.ts:397:21
❯ OpenAI.makeRequest node_modules/openai/src/core.ts:460:24
❯ call node_modules/@midscene/core/dist/lib/chunk-CERQVVPJ.js:2253:20
❯ Proxy.callToGetJSONObject node_modules/@midscene/core/dist/lib/chunk-CERQVVPJ.js:2327:20
❯ tests/connectivity.test.ts:76:20
74| describe("Use Midscene wrapped OpenAI SDK", () => {
75| it("call to get json object", async () => {
76| const result = await callToGetJSONObject(
| ^
77| [
78| {

@yuyutaotao
Collaborator

The 500 in the first test shouldn't happen. Can you see any error logs?

The "Use Midscene wrapped OpenAI SDK" test can be ignored; VLM mode no longer uses JSON.

@yuyutaotao
Collaborator

Found the cause of the browser extension's 403: cross-origin access has to be enabled.

OLLAMA_HOST="0.0.0.0" OLLAMA_ORIGINS="*" ollama serve

@st01cs

st01cs commented Jan 23, 2025


That solved the 403; now a new error comes back:

Cannot read properties of undefined (reading 'thought')
TypeError: Cannot read properties of undefined (reading 'thought')
at Object.executor (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4320200)
at async e.Executor.flush (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4271106)
at async G.actionToGoal (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4321443)
at async Y.aiAction (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4326763)
at async Object.onClick (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:5609123)

@st01cs

st01cs commented Jan 23, 2025


The ollama gguf build has quite a few problems, and I can't see any error logs.

@yuyutaotao
Collaborator

Hmm, that looks like a problem with the model's output...

What is your runtime environment and machine configuration? We're working out deployment best practices to get the best performance out of the model.

@yuyutaotao
Collaborator

For logs, try adding OLLAMA_DEBUG=1.

@st01cs

st01cs commented Jan 23, 2025


Previous experimental setup:
ollama deployed on a machine with one 4090 (24 GB)

I'll try a vllm server on a machine with multiple GPUs; downloading the model now.

@st01cs

st01cs commented Jan 23, 2025


OK.

By the way, what tool is used for the Android scene tests?

@supperdsj
Author


How was the 403 solved? By setting the environment variables on the machine running ollama?

@st01cs

st01cs commented Jan 23, 2025

Yes, add the OLLAMA_ORIGINS environment variable.

@Etherdrake


I have the same problem here. I tried using hf.co/bartowski/UI-TARS-7B-DPO-GGUF:Q8_0, and the issue does indeed seem to be with the model. The team said they took down the GGUF models because of quantization errors; right now only the full-size HF Safetensors models work.

@zhoushaw
Member

@st01cs @Etherdrake @supperdsj Using the ollama build of UI-TARS is not recommended for now (this will be improved later). See the instructions at https://github.com/bytedance/UI-TARS; the current recommendation is to deploy the model with vllm directly.

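For reference, a launch along the lines of the linked instructions might look like this (a sketch, not from the thread: flag availability depends on the vLLM version; `--served-model-name` sets the model id clients must request, and `--tensor-parallel-size 2` matches a two-GPU setup):

```shell
# Sketch: serve UI-TARS via vLLM's OpenAI-compatible server (flags may vary by version)
python -m vllm.entrypoints.openai.api_server \
  --model bytedance-research/UI-TARS-7B-DPO \
  --served-model-name ui-tars \
  --tensor-parallel-size 2 \
  --port 8000
```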

@st01cs

st01cs commented Jan 26, 2025


@zhoushaw

Tested: vllm doesn't produce the 500 error, and the 7B model can be deployed on two 4090 (24 GB) GPUs.

How do I use the Android interface?

@KabakaWilliam

KabakaWilliam commented Jan 26, 2025


I'm using VLLM to deploy the model (bytedance-research/UI-TARS-7B-DPO), but I still get the following error:

404 status code (no body)
Error: 404 status code (no body)
    at e.generate (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:3090188)
    at f.makeStatusError (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:3078640)
    at f.makeRequest (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:3079756)
    at async F (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4244091)
    at async __commonJS.../midscene/dist/lib/chunk-CERQVVPJ.js.e.vlmPlanning (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4261835)
    at async Object.executor (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4320105)
    at async __commonJS.../midscene/dist/lib/index.js.e.Executor.flush (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4271269)
    at async G.actionToGoal (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4321606)
    at async Y.aiAction (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:4326926)
    at async Object.onClick (chrome-extension://gbldofcpkknbggpkmbdaefngejllnief/lib/popup.js:1:5609153)

The logs from VLLM show:

ERROR 01-26 15:33:42 serving_chat.py:114] Error with model object='error' message='The model `gpt-4o-2024-08-06` does not exist.' type='NotFoundError' param=None code=404

My browser config is:
OPENAI_API_KEY="sk-xxxxxxxxx"
OPENAI_BASE_URL="http://localhost:8000/v1"
MIDSCENE_USE_VLM_UI_TARS="1"
OPENAI_MODEL_NAME="ui-tars"

@yooooo00

yooooo00 commented Jan 27, 2025


Hi, has this new error been resolved? @st01cs

@yooooo00

yooooo00 commented Jan 27, 2025


Looks like you didn't choose the right model. Try setting MIDSCENE_MODEL_NAME="ui-tars" in your env instead of OPENAI_MODEL_NAME="ui-tars". @KabakaWilliam
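Applying that fix, the browser config from the comment above becomes (only the last variable name changes):

```shell
OPENAI_API_KEY="sk-xxxxxxxxx"
OPENAI_BASE_URL="http://localhost:8000/v1"
MIDSCENE_USE_VLM_UI_TARS="1"
MIDSCENE_MODEL_NAME="ui-tars"  # was OPENAI_MODEL_NAME, which is not read
```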

@KabakaWilliam


Thanks, that got it to work.

@zhoushaw
Member


The Android functionality is currently implemented with appium: https://github.com/web-infra-dev/midscene/tree/main/packages/web-integration/src/appium. There is no documentation yet, but most features work; you can try running https://github.com/web-infra-dev/midscene/tree/main/packages/web-integration/tests/ai/native/appium

Contributions are welcome if you have ideas.
