README: Romanian in Different Regions

Description

This project analyzes linguistic differences between Romanian-language articles written in Romania and those written in the Romanian diaspora (Germany, Spain, Italy, and the UK). It uses Natural Language Processing (NLP) techniques to explore how regional context influences written Romanian.

Goals

  1. Compare vocabulary used in different regions.
  2. Identify regional influences on Romanian.
  3. Build NLP models to classify articles by region.
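
As a rough illustration of the first goal, vocabulary comparison could be sketched with relative word frequencies. This is a minimal sketch, not the project's actual code: the function names (`tokenize`, `relative_frequencies`, `distinctive_words`) and the toy sentences are hypothetical, and it uses only the Python standard library.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return alphabetic tokens (diacritics are kept)."""
    # [^\W\d_]+ matches runs of Unicode letters only (no digits, no underscore).
    return re.findall(r"[^\W\d_]+", text.lower())


def relative_frequencies(texts):
    """Map each word to its share of all tokens in the given texts."""
    counts = Counter(tok for text in texts for tok in tokenize(text))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}


def distinctive_words(target_texts, reference_texts, top_n=5):
    """Return words that are relatively more frequent in target than in reference."""
    target = relative_frequencies(target_texts)
    reference = relative_frequencies(reference_texts)
    diffs = {w: f - reference.get(w, 0.0) for w, f in target.items()}
    # Largest positive difference first; ties broken alphabetically.
    ranked = sorted(diffs.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, d in ranked[:top_n] if d > 0]


# Toy, made-up sentences standing in for real articles (assumption).
spain = ["paella paella acasă"]
romania = ["acasă acasă mămăligă"]
print(distinctive_words(spain, romania))
```

A frequency difference is only one possible measure; on a real corpus, a smoothed ratio or a statistical test would likely be more robust to rare words.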

Project Structure

  • Data: JSON files containing articles from Germany, Spain, Italy, UK, and Romania.
  • Analysis: Python scripts in Google Colab for text preprocessing, analysis, and visualization.
  • Results: Reports and charts showing linguistic differences.
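
The data layout above could be loaded roughly as follows. The README does not describe the JSON schema, so this sketch assumes one file per region (for example `spain.json`) containing a list of objects with a `text` field. The file naming, the field name, and the `load_articles` helper are all hypothetical.

```python
import json
from pathlib import Path


def load_articles(data_dir):
    """Return {region: [article text, ...]} for every *.json file in data_dir.

    Assumes (hypothetically) that each file is named after its region and
    holds a JSON list of objects with a "text" key.
    """
    articles = {}
    for path in sorted(Path(data_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            records = json.load(f)
        # Skip records with a missing or empty "text" field.
        articles[path.stem] = [r["text"] for r in records if r.get("text")]
    return articles
```

Calling `load_articles("data")` would then give a dictionary keyed by region name, which can be passed on to preprocessing and analysis steps.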