inference time #6
Hi! My explanation is that tensor decomposition methods require more mathematical operations: instead of one (highly optimized in PyTorch) matrix multiplication, we perform several smaller ones. I think it is possible to optimize our Tensor Train and Tucker code to make it faster, but it is not obvious how to do so efficiently.
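A minimal sketch of the effect being described (all shapes and ranks below are illustrative assumptions, not the repository's actual configuration): a dense layer is one large matmul, while a factorized layer becomes a chain of smaller ones, each with its own dispatch overhead and intermediate tensor.

```python
import torch

# Hypothetical shapes for illustration only.
batch, d_in, d_out, r1, r2 = 32, 1024, 1024, 64, 64
x = torch.randn(batch, d_in)

# Dense layer: ONE large, highly optimized matmul.
W = torch.randn(d_in, d_out)
y_dense = x @ W

# Factorized layer (a rank-(r1, r2) chain): SEVERAL smaller matmuls.
# Fewer parameters overall, but the extra kernel dispatches and the
# intermediate tensors add overhead that can outweigh the FLOP savings.
A = torch.randn(d_in, r1)
B = torch.randn(r1, r2)
C = torch.randn(r2, d_out)
y_factored = ((x @ A) @ B) @ C
```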
It could be implemented in one operation with einsum; however, PyTorch does not fully support broadcasting for einsum (it did work for me in NumPy, though). In any case, I assume that torch.einsum calls many matmul operations behind the scenes (as it does in TensorFlow), so it would not be much faster. I also thought about implementing it as a Numba kernel, but found that Numba does not support einsum either.
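For concreteness, here is a hypothetical sketch of the "one operation" idea: the whole factor chain from the previous example expressed as a single torch.einsum call. The subscripts and shapes are assumptions for illustration, and the backend may still dispatch several matmuls internally.

```python
import torch

x = torch.randn(32, 1024)      # (batch, d_in)
A = torch.randn(1024, 64)
B = torch.randn(64, 64)
C = torch.randn(64, 1024)

# One einsum expresses the whole chain; the backend still decides
# how to schedule the underlying contractions (often as matmuls).
y = torch.einsum('bi,ir,rs,so->bo', x, A, B, C)
```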
Thanks @saareliad. Can einsum accelerate the many matmul operations produced by the TT/Tucker decomposition? And how would you implement it with einsum in one operation?
Most of the einsum code runs in C++, so it should be faster, though I didn't check extensively. I compared memory consumption against a Python loop with tensordots (the tt-pytorch implementation), and einsum is better. I can't publish the full code yet because it is under active research. We changed the TT implementation quite a lot from the public GitHub repos and used 4-dimensional tensors as tt.cores, something like the sketch below. I hope that when the research is done it will be published as part of a paper or integrated into an existing library.
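A hypothetical sketch of what a single-einsum TT contraction over 4-dimensional cores might look like; this is not the commenter's actual code, and every shape, rank, and name here is an assumption for illustration.

```python
import torch

# Hypothetical 2-core TT layer; all shapes are assumptions.
batch = 32
m1, m2 = 32, 32          # input factorization:  d_in  = m1 * m2 = 1024
n1, n2 = 32, 32          # output factorization: d_out = n1 * n2 = 1024
r = 16                   # TT-rank between the two cores

x = torch.randn(batch, m1, m2)
core1 = torch.randn(1, m1, n1, r)   # 4-D TT-cores: (r_prev, m_k, n_k, r_next)
core2 = torch.randn(r, m2, n2, 1)

# Contract the input with both cores in a single einsum call,
# then flatten the factorized output modes back to d_out.
y = torch.einsum('bij,aipr,rjqc->bpq', x, core1, core2).reshape(batch, n1 * n2)
```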
Hi, I am puzzled about the inference time of the compressed model. Why is the compressed model more time-consuming? Shouldn't it be faster with fewer parameters (about half of the original)?
Thanks.
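One way to see the trade-off concretely is a rough wall-clock comparison (a minimal sketch with assumed shapes and ranks; real numbers depend heavily on hardware, rank, and batch size, and the factorized chain can come out slower even with far fewer parameters).

```python
import time
import torch

def bench(fn, n=100):
    # Simple wall-clock benchmark; a rough illustration, not a rigorous one.
    fn()                                  # warm-up
    t0 = time.perf_counter()
    for _ in range(n):
        fn()
    return (time.perf_counter() - t0) / n

x = torch.randn(32, 1024)
W = torch.randn(1024, 1024)                           # dense: ~1.05M params
A, B = torch.randn(1024, 64), torch.randn(64, 1024)   # factorized: ~0.13M

print('dense     :', bench(lambda: x @ W))
print('factorized:', bench(lambda: (x @ A) @ B))
```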