YouTube, one of the largest online video-sharing platforms today, gives content creators a place to share information and earn extra income. Anticipating whether a video will engage viewers is essential for helping creators improve content and quality before publishing. To facilitate this task, we build an annotated dataset of 23,738 videos collected from 72 YouTube channels in Vietnam, spanning four categories (comedy, travel-and-events, education, and science-and-technology) and published over 12 years. We evaluate a number of metrics for measuring video engagement and propose a novel measure that determines the engagement of a video: the Q score. Using the proposed measure, we annotate videos with three levels of engagement: Engage, Neutral, and Not Engage. From the labeled dataset, we construct a multimodal model to infer the degree of engagement from the content of a YouTube video, such as its title, audio, thumbnail, video, and tags. We believe our dataset and metric will be useful for engagement analytics as well as for studies on social media content.

| Column | Description |
|---|---|
| channel_id | Id of channel |
| channel_name | Name of channel |
| channel_category | Category of channel |
| channel_started | Year the channel started |
| channel_rank | Rank of channel in most-subscribed Vietnamese channels |
| channel_subscribers | Number of subscribers of channel |
| id | Id of video |
| title | Title of video |
| length_title | Length of the video title (tokens) |
| categories | Categories of video |
| description | Description of video |
| tags | Tags of video |
| num_tags | Number of tags of video |
| upload_date | Upload date of video |
| delta_upload_date | Time between collection date and upload date (days) |
| duration | Duration of video (minutes) |
| view_count | Number of views of video |
| like_count | Number of likes of video |
| comment_count | Number of comments of video |
| dislike_count | Number of dislikes of video |
| like_per_view | Number of likes per view |
| comment_per_view | Number of comments per view |
| dislike_per_view | Number of dislikes per view |
| engagement_rate_1 | Total comments and likes per view |
| engagement_rate_2 | Total comments, likes, and dislikes per view |
| q_score | Q score of video |
| label_1 | Engagement level based on engagement_rate_1 of video |
| label_2 | Engagement level based on q_score of video |
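
The per-view columns in the table above can be recomputed from the raw counts. The sketch below follows the column descriptions literally. It assumes the ratios are stored as plain fractions (not percentages) and that videos with zero views are rejected; neither point is stated here, so check against the actual values in the metadata file. The function name `engagement_metrics` is hypothetical.

```python
def engagement_metrics(view_count, like_count, comment_count, dislike_count):
    """Recompute the per-view ratio columns for one video from its raw counts."""
    if view_count <= 0:
        # Assumption: the dataset description does not say how zero-view
        # videos are handled, so this sketch simply refuses them.
        raise ValueError("view_count must be positive")
    return {
        "like_per_view": like_count / view_count,
        "comment_per_view": comment_count / view_count,
        "dislike_per_view": dislike_count / view_count,
        # engagement_rate_1: comments and likes per view
        "engagement_rate_1": (like_count + comment_count) / view_count,
        # engagement_rate_2: comments, likes, and dislikes per view
        "engagement_rate_2": (like_count + comment_count + dislike_count) / view_count,
    }


# Made-up counts, for illustration only.
metrics = engagement_metrics(
    view_count=1000, like_count=50, comment_count=10, dislike_count=5
)
```

The q_score column is not recomputed here because its formula is not given in this description.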

| Column | Missing-value ratio |
|---|---|
| channel_id | 0.000000 |
| channel_name | 0.000000 |
| channel_category | 0.000000 |
| channel_started | 0.000000 |
| channel_rank | 0.000000 |
| channel_subscribers | 0.000000 |
| id | 0.000000 |
| title | 0.000000 |
| length_title | 0.000000 |
| categories | 0.000000 |
| description | 0.017609 |
| tags | 0.000000 |
| num_tags | 0.000000 |
| upload_date | 0.000000 |
| delta_upload_date | 0.000000 |
| duration | 0.000000 |
| view_count | 0.000000 |
| like_count | 0.000000 |
| comment_count | 0.000000 |
| dislike_count | 0.000000 |
| like_per_view | 0.000000 |
| comment_per_view | 0.000000 |
| dislike_per_view | 0.000000 |
| engagement_rate_1 | 0.000000 |
| engagement_rate_2 | 0.000000 |
| q_score | 0.000000 |
| label_1 | 0.000000 |
| label_2 | 0.000000 |
- sample/audio_by_year: folder containing audio files, organized by year
- sample/thumbnails_by_year: folder containing thumbnails, organized by year
- sample/video_by_year: folder containing videos, organized by year
- sample/entube_final.parquet: file containing the metadata

The extracted features can be downloaded here.
- The input data consists of 3 files: entube_embedding_train.pt, entube_embedding_val.pt, entube_embedding_test.pt
- Each file contains a list in which each item is a dictionary with the following keys:
  - 'id': ID of the video on YouTube
  - 'embedding_title': tensor of features extracted from the title, with shape (768,)
  - 'embedding_tag': tensor of features extracted from the tags, with shape (768,)
  - 'embedding_thumbnail': tensor of features extracted from the thumbnail, with shape (2560,)
  - 'embedding_video': tensor of features extracted from the video, with shape (2304, 1, 2, 2)
  - 'embedding_audio': tensor of features extracted from the audio, with shape (62, 128)
  - 'engagement_rate_label': tensor holding label 1, which does not use the Q score
  - 'q_score_label': tensor holding label 2, which uses the Q score

The zipped data can be downloaded here.
- The input data consists of 3 folders: audio_short_zip, video_short_zip, thumbnails_zip
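
Before training on the embedding files, it can help to confirm that each record has the keys and tensor shapes listed for entube_embedding_train.pt, entube_embedding_val.pt, and entube_embedding_test.pt. The following is a minimal sketch of such a check. The names `EXPECTED_SHAPES`, `LABEL_KEYS`, and `validate_item` are hypothetical, and the function only relies on each embedding exposing a `.shape` attribute, as PyTorch tensors do.

```python
# Expected shapes, copied from the key descriptions of the embedding files.
EXPECTED_SHAPES = {
    "embedding_title": (768,),
    "embedding_tag": (768,),
    "embedding_thumbnail": (2560,),
    "embedding_video": (2304, 1, 2, 2),
    "embedding_audio": (62, 128),
}
LABEL_KEYS = ("engagement_rate_label", "q_score_label")


def validate_item(item):
    """Return a list of problems found in one record; an empty list means OK."""
    problems = []
    # The id and both labels must be present; their contents are not checked.
    for key in ("id", *LABEL_KEYS):
        if key not in item:
            problems.append(f"missing key: {key}")
    # Every embedding must be present and have the documented shape.
    for key, expected in EXPECTED_SHAPES.items():
        if key not in item:
            problems.append(f"missing key: {key}")
            continue
        actual = tuple(item[key].shape)
        if actual != expected:
            problems.append(f"{key}: expected shape {expected}, got {actual}")
    return problems
```

Assuming the files were written with `torch.save` (which the .pt extension suggests but this description does not confirm), the records could be obtained with `torch.load("entube_embedding_train.pt")` and each one passed to `validate_item`.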