BEP Scrapper

Introduction

A script that scrapes job offers from BEP's offers page. It uses Selenium to load the page, BeautifulSoup to parse its contents, and an SQLite3 database to store the results. The accompanying RESTful API and front-end are built with Flask.
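
The storage step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `offers` table, its columns, and the `save_offers` and `count_offers` helpers are hypothetical names, with the columns borrowed from the search fields the API exposes.

```python
import sqlite3

# Hypothetical schema: the column names mirror some of the API's search
# fields and are NOT taken from the project's real database.
SCHEMA = """
CREATE TABLE IF NOT EXISTS offers (
    id INTEGER PRIMARY KEY,
    organization TEXT NOT NULL,
    career TEXT NOT NULL,
    district TEXT NOT NULL,
    expire TEXT NOT NULL,
    UNIQUE (organization, career, district, expire)
)
"""


def save_offers(conn, offers):
    """Insert parsed offers, skipping rows that are already stored."""
    conn.execute(SCHEMA)
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO offers (organization, career, district, expire) "
            "VALUES (:organization, :career, :district, :expire)",
            offers,
        )


def count_offers(conn):
    """Return the number of offers currently stored."""
    return conn.execute("SELECT COUNT(*) FROM offers").fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")  # in-memory DB, for illustration only
    save_offers(connection, [
        {"organization": "Org A", "career": "Technician",
         "district": "Lisboa", "expire": "2018-07-31"},
    ])
    print(count_offers(connection))
```

The `UNIQUE` constraint combined with `INSERT OR IGNORE` is one way to let the scraper run repeatedly without storing the same offer twice.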

Requirements

Python packages

Run the following command to install the required Python packages (installing inside a Python virtual environment is recommended):

pip3 install -r requirements.txt

Geckodriver

You also need the Geckodriver executable in your PATH or in the script's working directory to run the scraper.
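
To confirm that Geckodriver can be found before starting a run, a small check along these lines may help. It is only a sketch: `find_geckodriver` is a hypothetical helper that is not part of this repository, and it simply looks in the PATH directories first and then in the working directory.

```python
import os
import shutil


def find_geckodriver(working_dir=".", env_path=None):
    """Return the path to a geckodriver executable, or None if none is found.

    Hypothetical helper: searches the PATH directories first, then working_dir.
    """
    if env_path is None:
        env_path = os.environ.get("PATH", "")
    # Drop empty PATH entries so that only explicit directories are searched.
    search_dirs = [d for d in env_path.split(os.pathsep) if d]
    search_dirs.append(working_dir)
    return shutil.which("geckodriver", path=os.pathsep.join(search_dirs))


if __name__ == "__main__":
    location = find_geckodriver()
    if location is None:
        print("geckodriver not found in PATH or in the working directory")
    else:
        print("geckodriver found at", location)
```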

Executing

Running the scraper

python3 scrapper_selenium.py

Running the front-end and the REST API

python3 rest.py

Both the API and the front-end listen on port 5002. The front-end's search-by-category feature is incomplete and may cause some browsers to lag while a search is running.

Access the front-end

Open your browser and type the following in the address bar: http://127.0.0.1:5002/index.html

Available API methods

| Search type | API method |
| --- | --- |
| List all the offers in the DB | `/jobs` |
| Search offers by type | `/search/type/search_query` |
| Search offers by contract type | `/search/contract/search_query` |
| Search offers by career | `/search/career/search_query` |
| Search offers by category | `/search/category/search_query` |
| Search offers by district | `/search/district/search_query` |
| Search offers by organization | `/search/org/search_query` |
| Search offers by academic skills | `/search/skills/search_query` |
| Search offers by expiry date | `/search/expire/search_query` |
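
As an illustration of how these methods might be called from Python, the sketch below builds a search URL and fetches the result. The helper names `build_search_url` and `fetch_json` are hypothetical, and the code assumes that the API answers with JSON, which this README does not state.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:5002"

# Path segments taken from the list of API methods in this README.
SEARCH_KINDS = {"type", "contract", "career", "category",
                "district", "org", "skills", "expire"}


def build_search_url(kind, query, base_url=BASE_URL):
    """Build the URL for a search, percent-encoding the query text."""
    if kind not in SEARCH_KINDS:
        raise ValueError(f"unknown search kind: {kind!r}")
    return f"{base_url}/search/{kind}/{quote(query, safe='')}"


def fetch_json(url, timeout=10):
    """Fetch a URL and decode the body as JSON.

    Assumption: the endpoint returns JSON. Adjust if it does not.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Requires rest.py to be running locally.
    print(fetch_json(build_search_url("district", "Lisboa")))
```

Percent-encoding the query keeps values that contain spaces or slashes from breaking the URL path.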

