ParaLS: Paraphraser-based Lexical Substitution

Lexical substitution (LS) aims to find appropriate substitutes for a target word in a sentence. Recent LS methods based on pretrained language models have made remarkable progress by generating potential substitutes for a target word from its surrounding context. However, these methods tend to overlook whether the sentence's meaning is preserved when generating substitutes. This study explores how to generate substitute candidates from a paraphraser, since the paraphrases it generates vary in word choice while preserving the sentence's meaning. Because commonly used decoding strategies cannot generate the substitutes directly, we propose two simple decoding strategies that focus on variations of the target word during decoding. Experimental results show that our methods outperform state-of-the-art LS methods based on pretrained language models on three benchmarks.
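To make the idea of "focusing on the variations of the target word" more concrete, here is a toy sketch. It assumes that a paraphraser, once it has reproduced the words before the target, yields a probability distribution over the next token, and that the most probable tokens other than the target are taken as substitute candidates. The function name `top_substitutes` and the probabilities are hypothetical and purely illustrative; this is not the decoding code from this repository.

```python
from typing import Dict, List, Tuple


def top_substitutes(
    next_token_probs: Dict[str, float], target: str, k: int = 3
) -> List[Tuple[str, float]]:
    """Rank candidate substitutes by probability, skipping the target word itself."""
    candidates = [
        (token, prob)
        for token, prob in next_token_probs.items()
        if token.lower() != target.lower()
    ]
    # Highest probability first.
    candidates.sort(key=lambda item: item[1], reverse=True)
    return candidates[:k]


# Hypothetical distribution at the target position for the sentence
# "The film was great", after the prefix "The film was" has been decoded.
toy_probs = {
    "great": 0.40,
    "excellent": 0.25,
    "wonderful": 0.15,
    "good": 0.12,
    "terrible": 0.08,
}

substitutes = top_substitutes(toy_probs, target="great", k=3)
print(substitutes)
```

In a real system the distribution would come from the paraphraser's decoder rather than a hand-written dictionary, and candidates would typically be filtered and re-ranked further.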

Requirements and Installation

Our code is based on Fairseq version 0.10.2
PyTorch version 1.7.1
Python version >= 3.7
For training new models, you'll also need an NVIDIA GPU and NCCL
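A quick way to compare an environment against these requirements is sketched below. It assumes the packages are installed under the distribution names `torch` and `fairseq`; the helper names are hypothetical, and the script only reports versions, it does not install anything.

```python
import sys


def installed_version(package: str):
    """Return the installed version string of a package, or None if unavailable."""
    try:
        from importlib import metadata  # available in Python 3.8+
    except ImportError:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def python_ok(min_version=(3, 7)) -> bool:
    """Check that the running interpreter meets the minimum Python version."""
    return sys.version_info >= min_version


# Distribution names below are assumptions about how the packages were installed.
report = {name: installed_version(name) for name in ("torch", "fairseq")}

print("Python >= 3.7:", python_ok())
for name, version in report.items():
    print(f"{name}: {version if version is not None else 'not found'}")
```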

Step 1: Download the pretrained paraphraser model

You need to download the paraphraser from here, and put it into the folder "checkpoints/para/transformer/"
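Before running the scripts, it can help to confirm that the downloaded files ended up in the expected folder. The sketch below assumes it is run from the repository root; the helper name `checkpoint_files` is hypothetical, and no particular checkpoint filename is assumed.

```python
from pathlib import Path


def checkpoint_files(root: str = "checkpoints/para/transformer"):
    """List the files found in the paraphraser checkpoint folder, if it exists."""
    folder = Path(root)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.iterdir() if p.is_file())


files = checkpoint_files()
if files:
    print("Found paraphraser files:", ", ".join(files))
else:
    print("No files found in checkpoints/para/transformer; download the paraphraser first.")
```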

Step 2: Run our code

(1) Run ParaLS on the lexical substitution dataset LS07

Run the script "run_LS_Paraphraser.multi.ls07.sh"

(2) Run ParaLS on the lexical substitution dataset LS14

Run the script "run_LS_Paraphraser.multi.ls14.sh"

Citation

Please cite as:

