Image Source: Jorge Franganillo from Unsplash
This program is a fake news detector that uses machine learning to analyze news articles and classify them as real or fake. It does so by analyzing the results of several machine learning models after they process input data loaded from data tables. The models used in this program are Naive Bayes, Logistic Regression, Random Forest, and Support Vector Machine.
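As a rough illustration of how the four model families named above could be trained side by side, here is a minimal sketch. It is not the notebook's actual code: the tiny corpus is made up, the TF-IDF step and every parameter value are assumptions, and `LinearSVC` stands in for whichever Support Vector Machine variant the notebook uses.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Made-up example headlines for illustration only; 1 = real, 0 = fake.
texts = [
    "city council approves annual budget after public hearing",
    "central bank holds interest rates steady this quarter",
    "senate committee schedules vote on transportation bill",
    "shocking secret cure doctors do not want you to know",
    "celebrity clone spotted you will not believe what happened",
    "miracle pill melts fat overnight insiders reveal truth",
]
labels = [1, 1, 1, 0, 0, 0]

# One estimator per model family; the settings here are assumptions.
models = {
    "Naive Bayes": MultinomialNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Support Vector Machine": LinearSVC(),
}

predictions = {}
for name, model in models.items():
    # Each pipeline turns raw text into TF-IDF features, then fits the model.
    pipeline = make_pipeline(TfidfVectorizer(), model)
    pipeline.fit(texts, labels)
    predictions[name] = pipeline.predict(["council votes on new budget bill"])[0]
    print(f"{name}: predicted label {predictions[name]}")
```

With a real dataset you would hold out a test split and compare accuracy instead of looking at a single prediction.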
To support the models, TfidfVectorizer, KMeans, and RandomForestClassifier are used either to convert the text into a form the models can use or to highlight the features the models should treat as important. The program also uses a simple function to preprocess the plain text of the news articles, filtering out unnecessary characters so that the models accept the text during training and testing. The two relevant files are named "fake-news.csv" and "true-news.csv".
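The preprocessing and vectorizing steps described above might look something like the sketch below. The function names `clean_text` and `build_dataset`, the `text` column name, and the exact cleaning rules are all assumptions made for illustration; the notebook's own function may filter different characters.

```python
import re

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer


def clean_text(text):
    """Lowercase the text, drop URLs and non-letter characters, collapse spaces."""
    text = str(text).lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text)       # keep only letters and whitespace
    return re.sub(r"\s+", " ", text).strip()    # collapse repeated whitespace


def build_dataset(fake_df, true_df, text_column="text"):
    """Label fake rows 0 and true rows 1, combine them, and clean the text."""
    data = pd.concat(
        [fake_df.assign(label=0), true_df.assign(label=1)],
        ignore_index=True,
    )
    data["clean_text"] = data[text_column].map(clean_text)
    return data


# Stand-in rows; in practice you would likely load the two CSV files
# from the datasets/ folder with pd.read_csv instead (assumed usage).
fake_df = pd.DataFrame({"text": ["SHOCKING!!! Aliens built the pyramids, see http://example.com"]})
true_df = pd.DataFrame({"text": ["The city council approved the 2024 budget on Tuesday."]})

data = build_dataset(fake_df, true_df)

# Convert the cleaned text into a numeric TF-IDF matrix the models can use.
vectorizer = TfidfVectorizer(stop_words="english")
features = vectorizer.fit_transform(data["clean_text"])

print(data[["clean_text", "label"]])
print(features.shape)
```

Keeping the cleaning in one small function means the same rules can be applied to the training data and to any new article you want to classify.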
Tip
If you wish to see the results of the Jupyter Notebook without downloading the repository to run the code, click on the file called fake-news-detector.ipynb, which is located in the GitHub file tree above.
This section explains how to run the fake news detector yourself. Before you can run the code, you will need to install Python, Jupyter Notebook, and the required libraries on your system, and download the code from GitHub.
- Understanding of Basic Terminal Commands
- Installing Python, Jupyter Notebook, & Required Libraries
- Downloading the GitHub Repository
Tip
If you are new to coding, or do not know how to use your system's terminal, please refer to the following resources:
- To install Python & Jupyter Notebook, follow the directions on their websites for your operating system, which can be found here:
The code also requires the following main libraries. To install them globally, type the following commands into your terminal before running the code:
pip install pandas
pip install matplotlib
pip install scikit-learn
- To clone the repository from GitHub, use the cd command in your system's terminal to navigate to the location where you want this project to be located:
cd ./your-desired-path-to-folder-here/
Once there, run the following command:
git clone https://github.com/ila-w/fake-news-detection.git
Note
The path to the folder will vary depending on your system and where you saved the folder when cloning it from GitHub.
Users/
└── username/
    └── Desktop/
        └── fake-news-detection/
Important
Before going any further, verify that the "true-news.csv" and "fake-news.csv" files are inside the datasets/ folder or the code will be unable to run. You can check this by navigating to the folder using your system's terminal or folder navigation program (Finder or File Explorer).
fake-news-detection/
├── datasets/
│   ├── fake-news.csv
│   └── true-news.csv
└── fake-news-detector.ipynb
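If you would rather check for the dataset files from Python than from your file browser, a small helper like the one below could do it. The function name `missing_datasets` is hypothetical, and the sketch assumes you run it from inside the fake-news-detection folder.

```python
from pathlib import Path

# File names taken from the folder layout above.
EXPECTED_FILES = ("fake-news.csv", "true-news.csv")


def missing_datasets(project_dir="."):
    """Return the expected dataset files that are not in <project_dir>/datasets."""
    datasets_dir = Path(project_dir) / "datasets"
    return [name for name in EXPECTED_FILES if not (datasets_dir / name).is_file()]


# Assumes the current working directory is the cloned project folder.
missing = missing_datasets(".")
if missing:
    print("Missing from datasets/:", ", ".join(missing))
else:
    print("Both dataset files were found.")
```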
- In your system's terminal (Command Prompt, PowerShell, Terminal), navigate to the "fake-news-detection" folder:
cd ./path-to-folder-here/fake-news-detection/
- Open the program in Jupyter Notebook by running the command:
jupyter notebook
- Once Jupyter Notebook is running, click on "fake-news-detector.ipynb" to open the program file.
fake-news-detection/
├── fake-news-detector.ipynb
- At the top of the page you will see a button labeled "Run". Click the button and then select the "Run Selected Cell and All Below" option. Your program should now be running!
Note
If you run into any additional errors, check the console to identify the libraries your system is still missing. Once found, use pip install [library name here] to finish installing the required libraries.
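One way to find out which of the main libraries are missing before running the notebook is a short check like the following. The helper name `missing_packages` is hypothetical, and the list covers only the three libraries named in this README; the notebook may import others.

```python
import importlib.util

# pip name -> import name. Note that scikit-learn is imported as "sklearn".
REQUIRED = {
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for pip_name, import_name in required.items()
        if importlib.util.find_spec(import_name) is None
    ]


missing = missing_packages()
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All of the main required libraries are installed.")
```

Any name this prints can be passed directly to pip install.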
Kuntur, S., Wróblewska, A., Paprzycki, M., & Ganzha, M. (2024). "Fake News Detection: It's All in the Data!" Cornell University arXiv, pp. 1-12. https://www.doi.org/10.48550/arXiv.2407.02122.
- Github repo: https://github.com/fakenewsresearch/
- Kaggle dataset: https://www.kaggle.com/datasets/csmalarkodi/isot-fake-news-dataset
Lazer, D. M.J., Baum, M. A., Benkler, Y., Berinsky, A. J., Greenhill, K. M., Menczer, F., Metzger, M. J., Nyhan, B., Pennycook, G., Rothschild, D., Schudson, M., Sloman, S. A., Sunstein, C. R., Thorson, E. A., Watts, D. J., & Zittrain, J. L. (2018). "The science of fake news." Science, 359(6380), pp. 1094–1096. https://doi.org/10.1126/science.aao2998.
Hussain, F. G., Wasim, M., Hameed, S., Rehman, A., Asim, M. N., & Dengel, A. (2025). "Fake News Detection Landscape: Datasets, Data Modalities, AI Approaches, Their Challenges, and Future Perspectives." IEEE Access, 13, pp. 54757-54778. https://www.doi.org/10.1109/ACCESS.2025.3553909.
Ila Wallace
Contributions from Ismael Suarez, Maddie Myer, Christian Flores, and Brenden L'Heureux
