I read your paper with great interest, and I understand that you used a GRU to eliminate the need for MatMul operations. However, I couldn't quite grasp how it enables parallel computation. Could you please provide a more detailed explanation?
Thanks in advance.