Bug DMLFusedNode_0_0 on second token in 0.5.2 (DML) (Wrong tensor shape) #1112

Open
elephantpanda opened this issue Dec 3, 2024 · 1 comment

elephantpanda commented Dec 3, 2024

I updated to 0.5.2 DirectML mode. Quadro P5000 GPU. Windows. C# DirectML 1.15.4. Model: microsoft/Phi-3-mini-4k-instruct-onnx

I get the following bug. (It only occurs in DML mode, not CPU mode, and it worked in version 0.4.0.)

For any prompt e.g.
"<|user|>Hello <|end|><|assistant|>"
(tokens:)
1,32010,15043,29871,32007,32001

It outputs one token but then when trying to output the second token it errors out with:

OnnxRuntimeGenAIException: Non-zero status code returned while running DmlFusedNode_0_0 node. Name:'DmlFusedNode_0_0' Status Message: D:\a\_work\1\s\onnxruntime\core\framework\execution_frame.cc:173 onnxruntime::IExecutionFrame::GetOrCreateNodeOutputMLValue shape && tensor.Shape() == *shape was false. OrtValue shape verification failed. Current shape:{1,32,7,96} Requested shape:{1,32,2048,96}

Microsoft.ML.OnnxRuntimeGenAI.Result.VerifySuccess (System.IntPtr nativeResult) (at D:/a/_work/1/onnxruntime-genai/src/csharp/Result.cs:25)
Microsoft.ML.OnnxRuntimeGenAI.Generator.ComputeLogits () (at D:/a/_work/1/onnxruntime-genai/src/csharp/Generator.cs:25)
Main.Generate () (at Assets/Main.cs:202)

The relevant part seems to be: Current shape:{1,32,7,96} Requested shape:{1,32,2048,96}

It appears not to be padding the tokens to max_tokens, or something similar.
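To make the mismatch easier to read, here is a minimal sketch of a helper that extracts the two shapes from the error text and reports the first axis that differs. `ShapeMismatch` is a hypothetical name and is not part of onnxruntime-genai; it only relies on the "Current shape:{...} Requested shape:{...}" format shown in the message above. For this report the differing axis holds 7, which equals the 6 prompt tokens plus the 1 generated token; reading that axis as the sequence length, and 2048 as the configured maximum length, is an assumption.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of onnxruntime-genai): pulls the two shapes out
// of an "OrtValue shape verification failed" message and reports where they differ.
public static class ShapeMismatch
{
    // Matches "Current shape:{1,32,7,96}" or "Requested shape:{1,32,2048,96}".
    private static readonly Regex ShapePattern =
        new Regex(@"(Current|Requested) shape:\{([0-9,\s]*)\}");

    // Returns the dimensions that follow the given label, or null if the label is absent.
    public static long[] Parse(string message, string label)
    {
        foreach (Match m in ShapePattern.Matches(message))
        {
            if (m.Groups[1].Value == label)
            {
                return m.Groups[2].Value
                    .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                    .Select(s => long.Parse(s.Trim()))
                    .ToArray();
            }
        }
        return null;
    }

    // Describes the first differing axis, or returns null if either shape is missing.
    public static string Describe(string message)
    {
        long[] current = Parse(message, "Current");
        long[] requested = Parse(message, "Requested");
        if (current == null || requested == null) return null;
        if (current.Length != requested.Length)
            return $"rank differs: {current.Length} vs {requested.Length}";
        for (int i = 0; i < current.Length; i++)
        {
            if (current[i] != requested[i])
                return $"axis {i}: current {current[i]}, requested {requested[i]}";
        }
        return "shapes match";
    }
}
```

Passing the exception's `Message` string to `ShapeMismatch.Describe` from a `catch` block would yield "axis 2: current 7, requested 2048" for the error in this report.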

elephantpanda changed the title from "Bug in 0.5.2" to "Bug DMLFusedNode_0_0 on second token in 0.5.2 (DML)" on Dec 3, 2024
elephantpanda changed the title from "Bug DMLFusedNode_0_0 on second token in 0.5.2 (DML)" to "Bug DMLFusedNode_0_0 on second token in 0.5.2 (DML) (Wrong tensor shape)" on Dec 3, 2024

elephantpanda commented Dec 3, 2024

I have a workaround for this bug: commenting out this line in my code:

generatorParams.SetSearchOption("past_present_share_buffer", false);

Which I got from your code sample here.

I don't know whether that line is important or not. Perhaps it could fail more gracefully, with a helpful error message?

BTW, 0.5.2 seems to be about 2x slower than 0.4.0.

It also crashes if the initial input IDs are a sequence of 360 zeros.
