wtl666wtl/WOT-Class

WOT-Class: Weakly Supervised Open-world Text Classification

This repo contains the data and the code for the paper.

Overview

Open-world Classification

Weakly supervised text classification methods reduce the amount of human supervision required. However, all of these methods assume that the human-provided known classes cover every class of interest, which is difficult to guarantee in a dynamic, ever-changing real world. In this work, we address a novel yet important problem, weakly supervised open-world text classification, where supervision is needed only for a few examples from a few known classes and the machine must handle both known and unknown classes at test time.

WOT-Class

We propose a novel framework, WOT-Class, which follows an iterative process of (a) clustering text into new classes, (b) mining and ranking indicative words for each class, and (c) merging redundant classes by using their overlapping indicative words as a bridge. Our extensive experiments on 7 popular text classification datasets show the strong performance of WOT-Class: averaged across all datasets, it achieves an absolute macro-F1 improvement of 23.33% over the current best method.
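
To make step (c) more concrete, the following is a minimal sketch of merging classes whose indicative words overlap. It is an illustration under stated assumptions, not the code in src/: the function name `merge_by_word_overlap`, the `min_shared` threshold, and the toy cluster names and word lists are all hypothetical, and the actual merging criterion used by WOT-Class is described in the paper.

```python
from itertools import combinations


def merge_by_word_overlap(indicative_words, min_shared=2):
    """Group classes whose indicative-word lists share at least `min_shared` words.

    `indicative_words` maps a class name to its list of indicative words.
    The threshold rule is an assumption made for this sketch only.
    """
    names = list(indicative_words)
    parent = {name: name for name in names}  # union-find forest

    def find(name):
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        shared = set(indicative_words[a]) & set(indicative_words[b])
        if len(shared) >= min_shared:
            parent[find(b)] = find(a)  # treat the two classes as redundant

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical clusters: the first two share "team" and "season".
toy = {
    "cluster_0": ["game", "team", "season", "coach"],
    "cluster_1": ["team", "season", "league", "player"],
    "cluster_2": ["stock", "market", "shares", "profit"],
}
merged = merge_by_word_overlap(toy)
print(merged)
```

With the toy input, the two sports-like clusters end up in one group and the finance-like cluster stays separate. Raising `min_shared` makes merging more conservative.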

Requirements

We used Python 3.7. You also need to install torch; the other requirements are listed in requirements.txt.

Data & Methods

The datasets can be accessed at data/.

The methods are implemented in src/.

To train and evaluate our method on a dataset (e.g., AGNews), run:

cd src
sh run.sh {dataset_name} {gpu_id}

You can also modify run.sh to control the hyperparameters we support. The performance on each dataset, and how it changes under different hyperparameters, is reported in our paper.
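
For readers unfamiliar with the macro-F1 metric mentioned in the overview, here is a small self-contained sketch of how it is computed from two label lists. The function name `macro_f1` and the example labels are hypothetical, and the sketch does not cover how the repository aligns newly discovered classes with ground-truth labels before scoring.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); denom is never 0 for labels drawn from the inputs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels: one "sports" document is misclassified as "business".
y_true = ["sports", "sports", "business", "business"]
y_pred = ["sports", "business", "business", "business"]
score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

Because every class contributes equally to the average, rare classes weigh as much as frequent ones, which matters when unknown classes are small.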

Citation

If you find this repo useful, please cite our paper:

@article{Wang2023WOTClassWS,
  title={WOT-Class: Weakly Supervised Open-world Text Classification},
  author={Tianle Wang and Zihan Wang and Weitang Liu and Jingbo Shang},
  journal={ArXiv},
  year={2023},
  volume={abs/2305.12401}
}
