This is my 3rd place solution to the Jigsaw Rate Severity of Toxic Comments Kaggle competition.
A detailed writeup is here.
A self-contained notebook of the final submission is here.
In this repository you can find some experiments I did during the competition.
All notebooks run on Kaggle/docker-python with these additional packages:
pip install detoxify
pip install transformers
pip install sentence_transformers
pip install iterative-stratification
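To confirm the environment is ready before opening a notebook, a quick import check along these lines may help. This is a minimal sketch, not part of the repository: `missing_packages` and `REQUIRED` are hypothetical names, and it assumes the iterative-stratification package is imported as `iterstrat`.

```python
# Minimal sketch: report which of the required packages cannot be located.
from importlib.util import find_spec

# Import names, not pip names (assumption: iterative-stratification
# installs a module called "iterstrat").
REQUIRED = ["detoxify", "transformers", "sentence_transformers", "iterstrat"]


def missing_packages(names):
    """Return the entries of `names` that Python cannot find on the import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Any name that is printed still needs the corresponding pip install from the list above.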
Kaggle datasets to download into the input directory:
kaggle competitions download -c jigsaw-toxic-severity-rating
kaggle datasets download -d rajkumarl/ruddit-jigsaw-dataset
kaggle datasets download -d julian3833/jigsaw-unintended-bias-in-toxicity-classification
kaggle datasets download -d julian3833/jigsaw-toxic-comment-classification-challenge
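The kaggle CLI saves each download as a zip archive, so the files need to be unpacked before the notebooks can read them. The sketch below is one possible way to do that, assuming the archives sit in a local folder named `input` and that one subfolder per archive is the layout the notebooks expect (similar to how Kaggle mounts datasets). The function name `extract_archives` is hypothetical.

```python
# Minimal sketch: unpack every downloaded archive into its own subfolder.
from pathlib import Path
import zipfile


def extract_archives(input_dir="input"):
    """Extract each .zip in input_dir into a subfolder named after the archive.

    Returns the list of folders that were written to.
    """
    input_path = Path(input_dir)
    if not input_path.is_dir():
        return []

    extracted = []
    for archive in sorted(input_path.glob("*.zip")):
        target = input_path / archive.stem  # e.g. input/ruddit-jigsaw-dataset
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(target)
    return extracted


if __name__ == "__main__":
    for folder in extract_archives():
        print("Extracted to", folder)
```

If the notebooks use different paths, adjust `input_dir` or the target folder names accordingly.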