TornadoTaskRuntimeException when using Phi-3-mini-4k-instruct-fp16.gguf during TornadoVM initialization #72

@yrq0208

Description

Describe the bug
Tokenizer: Phi3Tokenizer
Loading model weights in TornadoVM format (loading F16)

Starting TornadoVM initialization...
TornadoVM GPU execution plan creation: 619.22 ms
Java to GPU JIT compiler warmup: 6147.02 ms
Exception in thread "main" uk.ac.manchester.tornado.api.exceptions.TornadoTaskRuntimeException: Parameter #4 uk.ac.manchester.tornado.api.types.arrays.HalfFloatArray@ebaa6cb from task not specified either in transferToDevice or transferToHost functions
at [email protected]/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph.checkAllArgumentsPerTask(TornadoTaskGraph.java:1516)
at [email protected]/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph.execute(TornadoTaskGraph.java:1599)
at [email protected]/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph.execute(TornadoTaskGraph.java:1626)
at [email protected]/uk.ac.manchester.tornado.api.TaskGraph.execute(TaskGraph.java:804)
at [email protected]/uk.ac.manchester.tornado.api.ImmutableTaskGraph.execute(ImmutableTaskGraph.java:50)
at [email protected]/uk.ac.manchester.tornado.api.TornadoExecutor.lambda$execute$0(TornadoExecutor.java:49)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
at [email protected]/uk.ac.manchester.tornado.api.TornadoExecutor.execute(TornadoExecutor.java:49)
at [email protected]/uk.ac.manchester.tornado.api.TornadoExecutionPlan.execute(TornadoExecutionPlan.java:181)
at org.beehive.gpullama3.tornadovm.TornadoVMMasterPlan.forceCopyInReadOnlyDataLayered(TornadoVMMasterPlan.java:190)
at org.beehive.gpullama3.tornadovm.TornadoVMMasterPlan.initializeTornadoVMPlan(TornadoVMMasterPlan.java:67)
at org.beehive.gpullama3.model.Model.runInstructOnce(Model.java:205)
at org.beehive.gpullama3.LlamaApp.runSingleInstruction(LlamaApp.java:18)
at org.beehive.gpullama3.LlamaApp.main(LlamaApp.java:44)
Error: Command failed with return code 1
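For context, the exception indicates TornadoVM's contract that every array a task reads or writes must be declared in the task graph's transferToDevice or transferToHost calls before execution. A minimal sketch of that contract follows; the task graph name, kernel, and array sizes are illustrative only and are not taken from GPULlama3.java (it cannot run without a TornadoVM installation and a supported device):

```java
import uk.ac.manchester.tornado.api.ImmutableTaskGraph;
import uk.ac.manchester.tornado.api.TaskGraph;
import uk.ac.manchester.tornado.api.TornadoExecutionPlan;
import uk.ac.manchester.tornado.api.annotations.Parallel;
import uk.ac.manchester.tornado.api.enums.DataTransferMode;
import uk.ac.manchester.tornado.api.types.arrays.HalfFloatArray;

public class TransferContractSketch {

    // Illustrative kernel: copies half-precision values element-wise.
    static void copy(HalfFloatArray in, HalfFloatArray out) {
        for (@Parallel int i = 0; i < in.getSize(); i++) {
            out.set(i, in.get(i));
        }
    }

    public static void main(String[] args) {
        HalfFloatArray weights = new HalfFloatArray(1024);
        HalfFloatArray output = new HalfFloatArray(1024);

        // Both kernel parameters are declared for transfer here. Omitting
        // either declaration reproduces the error above: TornadoVM throws
        // TornadoTaskRuntimeException at execute() time for any task
        // parameter missing from transferToDevice/transferToHost.
        TaskGraph graph = new TaskGraph("layer0")
            .transferToDevice(DataTransferMode.FIRST_EXECUTION, weights)
            .task("copy", TransferContractSketch::copy, weights, output)
            .transferToHost(DataTransferMode.EVERY_EXECUTION, output);

        ImmutableTaskGraph itg = graph.snapshot();
        new TornadoExecutionPlan(itg).execute();
    }
}
```

Given the stack trace, the undeclared parameter is a HalfFloatArray reached via forceCopyInReadOnlyDataLayered, which suggests a model-specific (Phi-3) weight buffer is missing from the transfer declarations in the Phi-3 task-graph setup.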

To Reproduce
./llama-tornado --gpu --verbose-init --opencl --model Phi-3-mini-4k-instruct-fp16.gguf --prompt "tell me a joke"

Expected behavior
The model should initialize successfully and tell me a joke.

Desktop (please complete the following information):

  • OS: Ubuntu
  • Version: 24.04.3 LTS

Additional context
I am using the latest builds of TornadoVM and GPULlama3.java. Other models, such as beehive-llama-3.2-1b-instruct-fp16.gguf, beehive-llama-3.2-3b-instruct-fp16.gguf, DeepSeek-R1-Distill-Qwen-1.5B-F16.gguf, Qwen2.5-0.5B-Instruct-f16.gguf, qwen2.5-1.5b-instruct-fp16.gguf, and Qwen3-0.6B-f16.gguf, all work for me.

Metadata

Labels: bug (Something isn't working)