This project analyzes linguistic differences between Romanian-language articles written in Romania and those written in the Romanian diaspora (Germany, Spain, Italy, UK). It uses Natural Language Processing (NLP) techniques to explore how the regional context influences written Romanian.
- Compare vocabulary used in different regions.
- Identify regional influences on Romanian.
- Build NLP models to classify articles by region.
- Data: JSON files containing articles from Romania, Germany, Spain, Italy, and the UK.
- Analysis: Python scripts in Google Colab for text preprocessing, analysis, and visualization.
- Results: Reports and charts showing linguistic differences.
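
As a rough illustration of the vocabulary comparison goal, the sketch below counts words per region and lists the words that appear in only one region. It is a minimal example under stated assumptions, not the project's actual pipeline: the JSON schema (a list of objects with `region` and `text` keys), the function names, and the three toy sentences are all invented for illustration.

```python
import json
import re
from collections import Counter, defaultdict

# Hypothetical schema: each article is an object with "region" and "text" keys.
# The sentences are invented toy data, not taken from the project's corpus.
SAMPLE_JSON = """
[
  {"region": "Romania", "text": "Am fost la serviciu și apoi la piață."},
  {"region": "Spain", "text": "Am fost la trabajo și apoi la piață."},
  {"region": "Italy", "text": "Am fost la lavoro și apoi la piață."}
]
"""

# Runs of Unicode letters, so Romanian diacritics (ă, â, î, ș, ț) stay inside words.
TOKEN_RE = re.compile(r"[^\W\d_]+")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return TOKEN_RE.findall(text.lower())


def vocab_by_region(articles):
    """Return a mapping region -> Counter of token frequencies."""
    counts = defaultdict(Counter)
    for article in articles:
        counts[article["region"]].update(tokenize(article["text"]))
    return counts


def region_only_words(counts, region):
    """Return the words that occur in `region` and in no other region."""
    others = set()
    for name, counter in counts.items():
        if name != region:
            others.update(counter)
    return set(counts[region]) - others


articles = json.loads(SAMPLE_JSON)
counts = vocab_by_region(articles)
for region in counts:
    print(region, sorted(region_only_words(counts, region)))
```

On real data, words unique to one region will mostly be names and rare terms, so a frequency threshold or a relative-frequency comparison would likely be more informative than strict set difference.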