Welcome to the repository of the final project for the Algorithms for Speech and Natural Language Processing class.
This repository contains:
- Datasets: the various datasets used to train our models.
- Outputs: our project's code and outputs.
Our project is inspired by "It’s not just size that matters" by Schick and Schütze (2020), which introduces Pattern-Exploiting Training (PET). PET reframes classification tasks as language modeling problems: fine-tuned language models assign labels to unlabeled data, and those labels are then used to train a standard classifier. This makes PET effective with small training sets and enables few-shot learning. Our project explores PET, replicates its results on several datasets, and compares different masked language models.
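As an illustration of the labeling-then-training step described above, here is a minimal sketch assuming PyTorch: soft labels produced by fine-tuned pattern models are distilled into a standard classifier. The `features` and `soft_labels` tensors and the linear classifier are hypothetical stand-ins, not our actual pipeline.

```python
# Sketch of PET's distillation step: train a classifier on soft labels
# that the fine-tuned pattern models assigned to unlabeled data.
import torch
import torch.nn as nn

num_labels, feature_dim, num_unlabeled = 2, 768, 100

# Stand-ins: encoded unlabeled examples and the soft labels assigned
# to them by the ensemble of fine-tuned pattern models.
features = torch.randn(num_unlabeled, feature_dim)
soft_labels = torch.softmax(torch.randn(num_unlabeled, num_labels), dim=-1)

classifier = nn.Linear(feature_dim, num_labels)
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)

for _ in range(10):  # a few epochs over the soft-labeled data
    optimizer.zero_grad()
    log_probs = torch.log_softmax(classifier(features), dim=-1)
    # Cross-entropy against soft labels, as in knowledge distillation.
    loss = -(soft_labels * log_probs).sum(dim=-1).mean()
    loss.backward()
    optimizer.step()
```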
Language models like GPT-3 and GPT-4 excel at natural language processing but require substantial computational resources. PET offers a more efficient alternative by reformulating tasks as language modeling problems. A predefined pattern turns each input into a cloze-style sentence containing a mask token, and a verbalizer maps each label to a word in the model's vocabulary; the model's prediction at the mask position then determines the label. Because PET also leverages easily accessible unlabeled data, it works well with small training datasets and facilitates few-shot learning.
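To make the pattern/verbalizer idea concrete, here is a minimal sketch for sentiment classification, assuming the Hugging Face `transformers` library and `bert-base-uncased`. The pattern ("It was [MASK].") and the verbalizer words (great/terrible) are illustrative choices, not necessarily those used in the original paper or in our experiments.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Verbalizer: maps each task label to a single word in the vocabulary.
verbalizer = {"positive": "great", "negative": "terrible"}

def classify(review: str) -> str:
    # Pattern: wraps the input in a cloze sentence with one mask token.
    text = f"{review} It was {tokenizer.mask_token}."
    inputs = tokenizer(text, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    # Score each label by the logit of its verbalized word at the mask.
    scores = {
        label: logits[tokenizer.convert_tokens_to_ids(word)].item()
        for label, word in verbalizer.items()
    }
    return max(scores, key=scores.get)

print(classify("The plot was dull and the acting even worse."))  # negative
```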
Our project replicates Schick and Schütze's study and compares several masked language models. Sentiment classification tasks performed well, while some SuperGLUE tasks showed lower accuracies, and we encountered challenges such as labeling errors on specific tasks. We also created new tasks and explored masked language model (MLM) performance in a few-shot learning setting.
In conclusion, our project offers insights into the effectiveness of Pattern-Exploiting Training (PET) for natural language tasks. Despite hardware and time constraints, our findings contribute to the discussion on efficient language model training and its real-world applications.