Replies: 1 comment 1 reply
- FlashAttention 2 is already integrated.
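  The discussion does not say which library the integration refers to; as a minimal sketch, assuming the Hugging Face Transformers-style API, FlashAttention 2 can typically be enabled at model load time like this (the model name and dtype are illustrative, not taken from this thread):

  ```python
  # Minimal sketch, assuming the Hugging Face Transformers integration of FlashAttention 2.
  # The model id and dtype below are illustrative examples, not from this discussion.
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_id = "meta-llama/Llama-2-7b-hf"  # hypothetical example model

  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(
      model_id,
      torch_dtype=torch.float16,                # FlashAttention 2 requires fp16 or bf16
      attn_implementation="flash_attention_2",  # opt in to the FlashAttention 2 kernels
      device_map="auto",
  )

  inputs = tokenizer("Hello, world!", return_tensors="pt").to(model.device)
  outputs = model.generate(**inputs, max_new_tokens=20)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```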
1 reply
- They are new techniques for optimizing LLM inference performance.