Update index.html

rese1f · Dec 5, 2023 · a1ca9a4
1 parent 9f09ec3
commit a1ca9a4
Showing 1 changed file with 11 additions and 2 deletions.
diff --git a/index.html b/index.html
@@ -105,8 +105,17 @@
 
 	  <h3 style="text-align: center;">Abstract</h3>
 	  <p style="text-align: justify; display: flex; justify-content: center; max-width: 800px; margin:auto; font-family:Computer Modern Roman; font-size: larger;"> 
-		Recently, integrating video foundation models and large language models to build a video understanding system overcoming the limitations of specific pre-defined vision tasks. Yet, existing systems can only handle videos with very few frames. For long videos, the computation complexity, memory cost, and long-term temporal connection are the remaining challenges. Inspired by Atkinson-Shiffrin memory model, we develop an memory mechanism including a rapidly updated short-term memory and a compact thus sustained long-term memory. We employ tokens in Transformers as the carriers of memory. MovieChat achieves state-of-the-art performace in long video understanding.
-	  </p>
+		Integrating video foundation models with large language models to build a video understanding system can overcome the limitations of specific pre-defined vision
+tasks. Yet, existing systems can only handle videos with very
+few frames. For long videos, the computation complexity,
+memory cost, and long-term temporal connection impose
+additional challenges. Drawing on the Atkinson-Shiffrin memory model, we use tokens in Transformers
+as the carriers of memory, combine them with
+a specially designed memory mechanism, and propose
+MovieChat to overcome these challenges. MovieChat
+achieves state-of-the-art performance in long video understanding. We also release the MovieChat-1K benchmark, with 1K long videos and 14K manual annotations, to
+validate the effectiveness of our method.
+	</p>
 
 	  <div style="text-align: center;">
 		<img src="assets/wave.gif" style="width: 800px">