Welcome to the Myanmar NLP Community! We are a collaborative organization focused on advancing Natural Language Processing (NLP) and other language technologies for the Myanmar language (Burmese) and the ethnic languages of Myanmar.
Our mission is to collect, curate, and foster contributions to a rich ecosystem of open-source projects, tools, and datasets that empower researchers, developers, and linguists working with Myanmar languages.
Centralize Resources: Create a single hub for discovering high-quality, open-source projects related to Myanmar language technology.
Encourage Collaboration: Provide a platform for developers and researchers to connect, share knowledge, and collaborate on shared challenges.
Build Foundational Tools: Support the development of essential resources, such as reliable corpora, pre-trained models, tokenizers, and phonetic transcription tools.
Promote Open Source: Advocate for and facilitate the creation and maintenance of open-source projects for public benefit.
📚 Projects and Contributions
We collect contributions across a wide spectrum of Myanmar language projects. If you have a project, dataset, or tool, we would love to feature it!
Areas of Interest Include:
Datasets & Corpora: Annotated texts, parallel corpora, spoken language datasets, etc.
Core NLP Tools: Tokenizers, segmenters, part-of-speech (POS) taggers, named entity recognition (NER), and parsers.
Machine Translation: Open-source translation models and aligned datasets for Myanmar ↔ other languages.
Speech Technology: Speech recognition (ASR) and text-to-speech (TTS) synthesis resources.
Typing & Script Conversion: Tools for handling Zawgyi/Unicode conversion, keyboard layouts, and font rendering.
Low-Resource Languages: Efforts to develop language technologies for ethnic languages within Myanmar.
Whether you're a seasoned NLP expert, a student, a linguist, or just someone passionate about the Myanmar language, there are many ways to contribute!
Do you maintain an existing project? We can link to it, feature it, or even host it here.
To submit: Open an Issue on this repository and select the 'Project Submission' template. Provide a brief description, a link to the repository, and the language technologies the project covers.
Have a great idea but need a home for it? We can help you start a new repository under the Community umbrella and connect you with potential collaborators.
To propose: Open an Issue and select the 'New Project Proposal' template.
Browse the repositories listed under our organization profile. You can contribute in any of these areas:
Code: Fixing bugs, adding features, or improving documentation.
Data: Annotating, cleaning, or expanding existing datasets.
Documentation: Improving guides and tutorials to make tools more accessible.