LM-Property-Inheritance

This repository contains code for the paper "Characterizing the Role of Similarity in the Property Inferences of Language Models" by Juan Diego Rodriguez, Aaron Mueller, and Kanishka Misra.

Packages

Use Python >= 3.9. To install dependencies, pip install the following libraries:

transformers
semantic-memory
minicons
pyvene
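Assuming a standard pip setup, the dependencies above can be installed in one line:

```shell
pip install transformers semantic-memory minicons pyvene
```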

Data

We rely on the THINGS database released by Hebart et al., which can be found here. We used the "THINGSplus" collection as well as the "THINGS Similarity data". The downloaded files are as follows:

1. data/things/category53_longFormat.tsv
A TSV of category-word pairs (the initial set of superordinate categories).

2. data/things/spose_embedding_66d_sorted.txt
The embedding matrix of all unique concepts in the THINGS database.

3. data/things/unique_id.txt
Row names of the embedding matrix, for reference.
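As an illustration of how the last two files fit together, here is a minimal sketch that pairs each row of the SPOSE embedding matrix with its concept ID. The file layout is an assumption: one concept ID per line in unique_id.txt, and one whitespace-separated 66-dimensional row per line in spose_embedding_66d_sorted.txt.

```python
def load_spose_embeddings(embedding_path, id_path):
    """Pair SPOSE embedding rows with their concept IDs (assumed layout)."""
    with open(id_path) as f:
        # One concept identifier per line, in the same order as the matrix rows.
        ids = [line.strip() for line in f if line.strip()]
    with open(embedding_path) as f:
        # One whitespace-separated vector per line.
        rows = [[float(x) for x in line.split()] for line in f if line.strip()]
    assert len(ids) == len(rows), "row names must align with embedding rows"
    return dict(zip(ids, rows))
```

For example, `load_spose_embeddings("data/things/spose_embedding_66d_sorted.txt", "data/things/unique_id.txt")` would return a mapping from each concept to its 66-dimensional vector.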

We also generated some manipulated versions of the above data:

1. data/things/THINGS hypernyms - Sheet1.csv
A hand-annotated version of the category file, with wordnet senses and annotations to discard certain items.

2. data/things/things-lemmas-annotated
A hand-annotated version of our lexicon covering all the objects and categories, with entries denoting each entity's usage with articles ("a bat"), its number inflection ("bat" vs. "bats"), and the default surface form used in our stimuli (e.g., "birds" for "bird").

3. data/things/things-triples
A hand-annotated version of the object category pairs with wordnet senses.

4. data/things/things-triples-actual.csv
Taxonomic triples generated from the THINGS data, by running:

`python src/pilot.py --triples_path data/things/things-triples --lemma_path data/things/things-lemmas-annotated --save`

We used two different similarity measures: word sense and SPOSE. For the former, we used lmms-albert-xxl-v2, which can be found here. For the latter, we used the data/things/spose_embedding_66d_sorted.txt matrix.
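For intuition, the SPOSE measure amounts to comparing two concepts' embedding vectors. Below is a minimal cosine-similarity sketch; the exact similarity function applied to the SPOSE vectors is defined by the repository's scripts, so treat this only as an illustration.

```python
import math

def cosine_similarity(u, v):
    # Plain cosine similarity between two vectors; in this setup it would
    # be applied to rows of the 66-d SPOSE embedding matrix.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```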

To generate negative samples, run:

bash scripts/negative-sampling.sh

This will create several additional data files:

1. Stimuli Pairs

Stored in data/things/stimuli-pairs:

1. things-inheritance-sense_based_sim-pairs.csv
Stimuli pairs generated using the sense-based method for negative sampling, annotated with hypernymy relations, similarity bins, and raw similarity.

2. things-inheritance-SPOSE_prototype_sim-pairs.csv
Stimuli pairs generated using the SPOSE embeddings for negative sampling, annotated with hypernymy relations, similarity bins, and raw similarity.

2. Pairwise Similarity Data

Stored in data/things/similarity:

1. things-sense_based.csv
Pairwise similarities based on the Word Sense similarity method.

2. things-SPOSE_prototype.csv
Pairwise similarities based on the SPOSE embedding method.
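The stimuli pairs above carry similarity-bin annotations. Purely as an illustration of what mapping raw similarity scores to ordinal bins could look like, here is a hypothetical equal-width binning helper; the actual binning scheme is defined in scripts/negative-sampling.sh and may differ (e.g., quantile-based bins).

```python
def similarity_bin(score, edges=(0.25, 0.5, 0.75)):
    # Illustrative only: assigns a raw similarity score to one of
    # len(edges) + 1 ordinal bins using fixed equal-width edges.
    # The edges here are assumptions, not the paper's actual scheme.
    for i, edge in enumerate(edges):
        if score < edge:
            return i
    return len(edges)
```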

Behavioral Experiments

Given the above data, we are now ready to run behavioral experiments, which can be reproduced by running:

bash scripts/things-experiments.sh

For the specific arguments to the Python scripts and their descriptions, refer to src/behavioral_eval.py. This script saves results in data/things/results, organized as follows:

Taxonomic Sensitivity results:

1. things-sense_based-ns/*
2. things-SPOSE_prototype-ns/*

Property Sensitivity results:

1. things-sense_based-ns_multi-property/*
2. things-SPOSE_prototype-ns_multi-property/*

Mismatch Sensitivity results:

1. things-sense_based-ns_multi-property_prop-contrast/*
2. things-SPOSE_prototype-ns_multi-property_prop-contrast/*

Analysis

Plots and results for the behavioral experiments can be generated by running analysis/behavioral-plots.R. This script must be run interactively.

Boundless DAS Experiments

To train boundless DAS interventions (e.g., for Mistral 7B Instruct v0.2):

bash train_mistral_das-balanced.sh 

To evaluate the trained intervention on the test set:

bash eval_mistral_das-balanced.sh

Analysis

Plots and results for the DAS experiments can be generated by running analysis/das-results.R. This script must be run interactively.

Citation

If you use the code or data produced for this work, please cite us using the following BibTeX entry:

@article{rodriguez-et-al-2024-characterizing,
    title = "{Characterizing the Role of Similarity in the Property Inferences of Language Models}",
    author = "Rodriguez, Juan Diego and Mueller, Aaron and Misra, Kanishka",
    journal = {Computing Research Repository},
    volume = {arXiv:2410.22590},
    year = "2024",
    url = "https://arxiv.org/abs/2410.22590"
}

License

We release our materials under an MIT license.

About

Observing conceptual hierarchies in LMs
