This repository provides a flexible, easy-to-understand, and easily modified foundation for scraping, cleaning, analyzing, and visualizing publicly available education data from Turkey.
The code was written in Python 3.8.5.
Building on this foundation, a range of studies can be carried out on high schools and universities in Turkey. We conducted a study on the efficiency of public high schools in Turkey, detailed in the following paper:
Analysis of Public High Schools Admitting Students by Examination in Turkey, Beyza Arslan, Anıl Şen.
This study received second prize in the Education Category from the Scientific and Technological Research Council of Turkey.
If you use this code or data in your research, please cite us using this BibTeX entry:
@article{arslan2021analysis,
title={Analysis of Public High Schools Admitting Students by Examination in Turkey},
author={Arslan, Beyza and {\c{S}}en, An{\i}l},
journal={Available at SSRN 3939070},
year={2021}
}
If you're interested in extending this work, or have an idea or a question:
- email us at asen16@ku.edu.tr or barslan16@ku.edu.tr, or submit an issue.
To get started, you'll need to have Python 3.8+ installed.
- Clone this repository to your local machine:
git clone https://github.com/asen16/high-schools-analysis.git
- Change to the directory that contains requirements.txt:
cd [Path]
- Install the dependencies from your shell:
pip install -r requirements.txt
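After installing, it can be useful to confirm that every package listed in requirements.txt is actually available. The following is a minimal sketch, not part of the repository: the helper names `requirement_name` and `missing_requirements` are hypothetical, and the name parsing is deliberately simple (it handles plain `name==version` style lines, not URLs or extras).

```python
import re
from importlib import metadata  # in the standard library since Python 3.8
from pathlib import Path


def requirement_name(line):
    """Return the bare package name from one requirements.txt line, or None."""
    line = line.split("#", 1)[0].strip()  # drop comments and whitespace
    if not line or line.startswith("-"):  # skip blanks and pip options such as -r
        return None
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
    return match.group(0) if match else None


def missing_requirements(lines):
    """Return the names of listed packages that are not installed."""
    missing = []
    for line in lines:
        name = requirement_name(line)
        if name is None:
            continue
        try:
            metadata.version(name)  # raises if the package is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    req_file = Path("requirements.txt")
    if req_file.is_file():
        # An empty list means everything listed is installed.
        print(missing_requirements(req_file.read_text().splitlines()))
```

Run it from the directory that contains requirements.txt. If it prints any names, re-running the pip command above should install them.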
Installation of Web Driver: The scraper needs a browser WebDriver to collect data. Follow the installation steps in Selenium's documentation.
Selenium:
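Before running the scraper, you may want to check that a driver executable is reachable on your PATH. The sketch below is an illustration only: it assumes the driver is either ChromeDriver or geckodriver (this README does not say which browser the scraper targets), and the function name `find_web_driver` is hypothetical.

```python
import shutil

# Assumed executable names: ChromeDriver (Chrome) and geckodriver (Firefox).
# Adjust this tuple to match the browser you actually installed a driver for.
DRIVER_NAMES = ("chromedriver", "geckodriver")


def find_web_driver(names=DRIVER_NAMES):
    """Return (name, path) for the first driver found on PATH, else None."""
    for name in names:
        path = shutil.which(name)  # None when the executable is not on PATH
        if path:
            return name, path
    return None


if __name__ == "__main__":
    found = find_web_driver()
    if found is None:
        print("No WebDriver found on PATH; see Selenium's documentation.")
    else:
        print("Found {} at {}".format(*found))
```

This only checks that the executable can be located. It does not verify that the driver version matches your installed browser.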
Follow the instructions in these files to work on the analysis or the data scraping part. A Google Colab version of the code will be uploaded in the coming days.
- analysis: Shows how the master data is filtered, analyzed, and visualized.
- data_scraping: Explains how the raw data is scraped from its source.
The code repository is organized into the following components:
- The datasets are located in the analysis/data folder.
- The graphs are located in the analysis/graphs folder.
- The tables are located in the analysis/table folder.
- The visualization and analysis programs are located in the analysis folder.
- The raw data are located in the data_scraping/DATA folder.
- The data scraping program is located in the data_scraping folder.
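The layout above can be expressed as a small path helper, which is handy when writing your own scripts against the data. This is a sketch under the assumption that the folder names are exactly as listed. `LAYOUT`, `resolve_layout`, and `list_files` are hypothetical names and do not exist in the repository.

```python
from pathlib import Path

# Folder names taken from the list above, relative to the repository root.
LAYOUT = {
    "datasets": "analysis/data",
    "graphs": "analysis/graphs",
    "tables": "analysis/table",
    "raw_data": "data_scraping/DATA",
}


def resolve_layout(root="."):
    """Map each component name to its path under the repository root."""
    root = Path(root)
    return {key: root / rel for key, rel in LAYOUT.items()}


def list_files(folder):
    """Return the sorted file names in a folder, or [] if it does not exist."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.iterdir() if p.is_file())


if __name__ == "__main__":
    # Run from the repository root to see what each folder contains.
    for key, path in resolve_layout().items():
        print(key, path, list_files(path))
```

Returning an empty list for a missing folder keeps the helper usable on a fresh clone where, for example, no raw data has been scraped yet.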
- Please let us know if you encounter any bugs by filing a GitHub issue.
- We appreciate all contributions. If you plan to contribute a new method, data, or anything else, please see our contribution guidelines.
For the complete release history, see CHANGELOG.md.
Analysis of Public High Schools Admitting Students by Examination in Turkey is released under the MIT License.