Thank you for the incredible work!
I wonder why the speed on Qwen2.5-VL is nearly the same as the original model. I tested it with PyTorch on data in the LLaMA-Factory dataset format. Could you please provide the inference script or more detailed instructions?