Dayviews-Scraper

A scraper that downloads all the posts from a non-password-protected Dayviews (formerly known as Bilddagboken) account.

Dependencies

There's only one major dependency for this script: [PhantomJS](http://phantomjs.org/). Make sure to install it, otherwise nothing's going to run.

Installation

Installing the Dayviews-Scraper can be super easy or super hard, depending on your experience with the command line and basic sys-admin work. Here are the steps:

Install PhantomJS. On Windows, download the latest executable from phantomjs.org and add it to your PATH variable. On Mac, the easiest way is to run the following command from your terminal:

brew update && brew install phantomjs
Clone this repo.
Run the script as described under "Usage" below.

Usage

Run "phantomjs scrape.js http://dayviews.com/username/firstImageId/" from your terminal. Make sure that you're in the folder that contains scrape.js.
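
To make the expected shape of the entrypoint URL concrete, here is a minimal sketch that splits it into its two path segments. The helper name `parseEntryUrl` is hypothetical and is not part of scrape.js; it only illustrates the `http://dayviews.com/username/firstImageId/` pattern shown above.

```javascript
// Hypothetical helper, not part of scrape.js: splits an entrypoint URL of the
// form http://dayviews.com/<username>/<firstImageId>/ into its path segments.
function parseEntryUrl(url) {
  var pattern = /^https?:\/\/(?:www\.)?dayviews\.com\/([^\/?#]+)\/([^\/?#]+)\/?(?:[?#].*)?$/;
  var match = pattern.exec(url);
  if (!match) {
    return null; // the URL does not have the expected shape
  }
  return { username: match[1], firstImageId: match[2] };
}

// Example: check a URL before handing it to the scraper.
var entry = parseEntryUrl('http://dayviews.com/username/firstImageId/');
console.log(entry ? entry.username + ' / ' + entry.firstImageId : 'not an entrypoint URL');
```

A check like this could catch a URL that is missing the image id before PhantomJS starts loading pages.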

Arguments

First argument (required): URL to entrypoint.

Second argument (optional): Offset number
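
The two arguments could be read along these lines. This is only a sketch: the function name `parseArgs`, the default offset of 0, and the rule that the offset must be a non-negative integer are assumptions, not taken from scrape.js.

```javascript
// Sketch only: parseArgs, the default offset of 0 and the validation rule
// are assumptions, not the logic scrape.js actually uses.
function parseArgs(args) {
  // args[0] is the script name, as in PhantomJS's system.args.
  var url = args[1];
  if (!url) {
    throw new Error('Usage: phantomjs scrape.js <entrypoint URL> [offset]');
  }
  var offset = 0; // assumed default when no offset is given
  if (args.length > 2) {
    if (!/^\d+$/.test(args[2])) {
      throw new Error('Offset must be a non-negative integer, got: ' + args[2]);
    }
    offset = parseInt(args[2], 10);
  }
  return { url: url, offset: offset };
}
```

Inside a PhantomJS script, the array would come from `require('system').args`, for example `parseArgs(require('system').args)`.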

Known issues

The script gives a TypeError on some page loads when evaluating a third-party ad script.
The script sometimes mistakes a Facebook (or Casumo) JS URL for the actual image we want. No worries - the script will try again.
Special characters (Å, Ä, Ö) are not yet supported. Fixing this is a priority.
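
One possible way to guard against the JS-URL mix-up is to accept a URL only if its path ends in an image extension. This is a sketch and not how scrape.js currently works: the function name `looksLikeImageUrl` is hypothetical, and it assumes the wanted images are served with a `.jpg`, `.jpeg`, `.png` or `.gif` extension, which has not been verified against Dayviews.

```javascript
// Hypothetical check, not the logic scrape.js uses. Assumes the wanted images
// end in a common image extension, which is unverified for Dayviews.
function looksLikeImageUrl(url) {
  // Drop the query string and fragment before checking the extension.
  var path = String(url).split(/[?#]/)[0];
  return /\.(?:jpe?g|png|gif)$/i.test(path);
}

// Made-up URLs for illustration only.
var candidates = [
  'http://example.com/images/photo.jpg',
  'http://example.com/sdk/all.js'
];
candidates.forEach(function (candidate) {
  console.log(candidate + ' -> ' + (looksLikeImageUrl(candidate) ? 'image' : 'skip'));
});
```

A filter like this would reject script URLs up front instead of relying on the retry.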
