Home
Welcome to the paperviz wiki!
Flow:
- Scrape a conference for the following fields and save them to a JSON file:
  - 'id'
  - 'link'
  - 'title'
  - 'authors'
  - 'abstract'
- Download the PDFs for the conference using the links in the JSON.
- Run image extraction on the downloaded PDFs to generate images for each paper in the conference.
- Generate compressed, resized versions of the images to use as thumbnails.
- Upload the extracted large images to Drive and set the 'img_large' value in the conference JSON to the 'file_id' of each uploaded image.
- Upload the small, resized images to the GitHub repo and update the 'img_small' value in the conference JSON.
- Generate 2D embeddings from each paper's abstract to create the visualizations, using the following methods:
  - Specter
  - SciBert
  - SentBert
- Push the updated conference JSON containing the 2D embeddings to the GitHub repo and add the conference to the conference list on the website.
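
To make the flow concrete, here is a minimal sketch of how one paper record might change as it moves through the steps. It is not the project's actual code. The five scraped fields and the 'img_large' and 'img_small' keys come from the flow above. Everything else is an assumption made for illustration: the helper names, the 'embeddings' key, the lowercase method names, storing a path in 'img_small', and treating the conference JSON as a list of paper records.

```python
import json
from pathlib import Path


def new_paper(paper_id, link, title, authors, abstract):
    """Build one scraped paper record with the five fields from the scrape step."""
    return {
        "id": paper_id,
        "link": link,
        "title": title,
        "authors": authors,
        "abstract": abstract,
    }


def attach_images(paper, drive_file_id, thumbnail_ref):
    """Record where the large image and the thumbnail ended up.

    'img_large' holds the file_id of the uploaded image, as the flow describes.
    What 'img_small' holds is not specified there; a path is assumed here.
    """
    paper["img_large"] = drive_file_id
    paper["img_small"] = thumbnail_ref
    return paper


def attach_embeddings(paper, coords_by_method):
    """Store one [x, y] pair per embedding method.

    The 'embeddings' key and its layout are hypothetical.
    """
    paper["embeddings"] = {
        method: [float(x), float(y)] for method, (x, y) in coords_by_method.items()
    }
    return paper


def to_json(papers):
    """Serialize the conference (assumed to be a list of paper records)."""
    return json.dumps(papers, indent=2, ensure_ascii=False)


def save_conference(papers, path):
    """Write the conference JSON to disk."""
    Path(path).write_text(to_json(papers), encoding="utf-8")
```

A record would be created once by the scraper, passed through `attach_images` after the uploads, then through `attach_embeddings` for each method, and finally written with `save_conference` before being pushed to the repo.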