Commit

Update index.html
Espere-1119-Song authored Dec 5, 2023
1 parent 9f09ec3 commit a1ca9a4
Showing 1 changed file with 11 additions and 2 deletions.
13 changes: 11 additions & 2 deletions index.html
@@ -105,8 +105,17 @@

<h3 style="text-align: center;">Abstract</h3>
<p style="text-align: justify; display: flex; justify-content: center; max-width: 800px; margin:auto; font-family:Computer Modern Roman; font-size: larger;">
Recently, integrating video foundation models and large language models to build a video understanding system has overcome the limitations of specific pre-defined vision tasks. Yet, existing systems can only handle videos with very few frames. For long videos, the computational complexity, memory cost, and long-term temporal connection remain challenges. Inspired by the Atkinson-Shiffrin memory model, we develop a memory mechanism comprising a rapidly updated short-term memory and a compact, and thus sustained, long-term memory. We employ tokens in Transformers as the carriers of memory. MovieChat achieves state-of-the-art performance in long video understanding.
</p>
Recently, integrating video foundation models and large language models to build a video understanding system can overcome the limitations of specific pre-defined vision
tasks. Yet, existing systems can only handle videos with very
few frames. For long videos, the computational complexity,
memory cost, and long-term temporal connection impose
additional challenges. Taking advantage of the Atkinson-Shiffrin memory model, with tokens in Transformers being
employed as the carriers of memory in combination with
our specially designed memory mechanism, we propose
MovieChat to overcome these challenges. MovieChat
achieves state-of-the-art performance in long video understanding, along with the released MovieChat-1K benchmark with 1K long videos and 14K manual annotations for
validation of the effectiveness of our method.
</p>

<div style="text-align: center;">
<img src="assets/wave.gif" style="width: 800px">
