refactoring code for better vectorization #107

Open: wants to merge 537 commits into base: master
537 commits
624aeb6
Generate torchtext data, fix bug in allowing null pages into
EntilZha Apr 26, 2018
121dc74
Fix S3 path
EntilZha Apr 26, 2018
6116a59
(wip) fixing evaluation pipeline
ihsgnef Apr 27, 2018
a14ba63
Merge branch 'master' of github.com:Pinafore/qb
ihsgnef Apr 27, 2018
3c10505
New dan/torchtext dataset code
EntilZha Apr 27, 2018
5be0934
Directory bug in script
EntilZha Apr 27, 2018
9d88554
Allocate more cpus for gpu jobs
EntilZha Apr 27, 2018
31323a6
fix breakage in end to end evaluation due to merge
ihsgnef Apr 27, 2018
43bf424
Merge branch 'master' into fs_dev
ihsgnef Apr 27, 2018
32d9d5d
Configs
EntilZha Apr 27, 2018
69ee0b0
Merge branch 'master' of github.com:Pinafore/qb
EntilZha Apr 27, 2018
73c88d9
Fix guesser path report issue
EntilZha Apr 27, 2018
eb870b1
Fix crash on perf report
EntilZha Apr 28, 2018
c1db519
Improve dan defaults
EntilZha May 1, 2018
7adda05
New rnn code
EntilZha May 1, 2018
dc273fc
Update dataset code sorting
EntilZha May 1, 2018
e98ad27
Fix sorting stuff
EntilZha May 1, 2018
1a46024
Simple print fix
EntilZha May 1, 2018
76ef23f
Fix rnn lengths
EntilZha May 1, 2018
e38f811
Fix rnn code, add configs
EntilZha May 1, 2018
767f110
Fix and finish initial rnn implementation
EntilZha May 1, 2018
d2b7dd5
More granular choice over downloads
EntilZha May 3, 2018
6aeab65
Add mapping code
EntilZha May 3, 2018
fce8ad3
More annotated code
EntilZha May 3, 2018
12406fd
Remove six dependency
EntilZha May 3, 2018
e1bee49
Annotated mapping files in yaml format
EntilZha May 4, 2018
1b76a60
Add result as requirement
EntilZha May 4, 2018
e1851bc
Add annotation use into dataset pipeline
EntilZha May 4, 2018
892a2ee
Add code to preocess protobowl logs for player counts
EntilZha May 4, 2018
d3a30a6
Remove unused notebooks
EntilZha May 4, 2018
b4cacf3
Update dataset folds
EntilZha May 4, 2018
6129aa2
Fix small, but major bug in fold assignment
EntilZha May 5, 2018
c73777d
Code to generate google sheets answer mapping helper
EntilZha May 6, 2018
20d1977
first round of mapping fixes
ezubaric May 7, 2018
9c5b9b3
Document WikiExtractor command for extracting wikipedia and link script
EntilZha May 7, 2018
a0dedd6
Add documentation for commands to compress for S3, modify path slightly
EntilZha May 7, 2018
fe9d698
Refactor code to prioritize exact matches from expansions
EntilZha May 7, 2018
ec3ba0e
Update answer mapping with priority handling and metrics
EntilZha May 8, 2018
c1097cd
Add mapping validation script
EntilZha May 9, 2018
9df2a1c
Delete bad mappings (#70)
EntilZha May 9, 2018
1fca7ca
spacing
EntilZha May 9, 2018
7614f56
Add disambiguation code
EntilZha May 9, 2018
44ecdd7
Add mappings (#71)
EntilZha May 10, 2018
86e8580
Improve validation script, add in safe disambiguation method
EntilZha May 10, 2018
fa05cd3
Filter out disambugation pages, create place for unmappable questions
EntilZha May 10, 2018
d91add5
Bad mapping
EntilZha May 10, 2018
b1f4bc1
readme on categories sql dump and integrate it
EntilZha May 10, 2018
ed9b36d
I
EntilZha May 10, 2018
8b60960
Reverse bad mapping
EntilZha May 10, 2018
8c901b4
Unicode redirects are causing quite a few issues, so lets not use them
EntilZha May 10, 2018
1f3ea92
More robust word checks
EntilZha May 10, 2018
1d88459
Fold dataset slightly differently to preserve dev/test integrity
EntilZha May 11, 2018
b976d81
Minor update to dataset.py
EntilZha May 17, 2018
cdecee4
non naqt stuff for trick me and pipeline minor fix
EntilZha May 17, 2018
d32a73d
Trick me paper code
EntilZha May 17, 2018
3c0ebab
Update figures to support comparing guessers
EntilZha May 17, 2018
dcf1bf4
Plotting code, supporting code for both journal and trick
EntilZha May 17, 2018
513bdfe
Fix script
EntilZha May 17, 2018
ffe63a2
Make plotting work on non X environments
EntilZha May 17, 2018
91f8bc5
Incorporate second round of train disagreement fixes (#72)
EntilZha May 18, 2018
bfd99f2
Fix disambig error
EntilZha May 18, 2018
d75e68e
Additional fix
EntilZha May 18, 2018
77b2d91
Update more mappings
EntilZha May 19, 2018
4cb5c0d
Don't make preprocessing depend on dataset
EntilZha May 19, 2018
3ce02ce
Revert
EntilZha May 19, 2018
0f318a0
Fix mappings based on validation script
EntilZha May 19, 2018
2622af4
Validation errors: s
EntilZha May 19, 2018
edf19ec
Fix validation: r
EntilZha May 19, 2018
02eacc4
Validation errors
EntilZha May 19, 2018
c9d18d5
Validation errors
EntilZha May 19, 2018
b83bf35
Validation error fixes
EntilZha May 19, 2018
fe5d532
Validation error fixing
EntilZha May 19, 2018
fd55e35
Validation error fixes
EntilZha May 19, 2018
e4993cd
Complete eliminating error mappings
EntilZha May 19, 2018
f88c696
Improve mappings
EntilZha May 20, 2018
8e06524
Publication plots
EntilZha May 20, 2018
268599f
Update format
EntilZha May 21, 2018
1f90540
Attempt to repair answer
EntilZha May 21, 2018
5b12031
Plots
EntilZha May 23, 2018
bce3eef
buzzer working again for PACE expo
ezubaric Jun 1, 2018
2761c65
aaaah
ezubaric Jun 1, 2018
bc8fde3
pace expo
ezubaric Jun 3, 2018
e9d02da
Test set mapping
EntilZha Jun 12, 2018
cd3ac08
Finish mapping stuff
EntilZha Jun 12, 2018
d5d6971
Delete old models
EntilZha Jun 16, 2018
dc66bb6
Fix fold guesser perf issue
EntilZha Jun 17, 2018
2caa21a
Better log location
EntilZha Jun 17, 2018
249423b
elmo implementation
EntilZha Jun 18, 2018
14659e4
Use cuda
EntilZha Jun 18, 2018
1681ad7
X also cuda
EntilZha Jun 18, 2018
68a752a
finish elmo guesser
EntilZha Jun 19, 2018
28fd77c
Fix slurm config
EntilZha Jun 20, 2018
8a4bd6e
Merge branch 'master' of github.com:Pinafore/qb
EntilZha Jun 20, 2018
f554204
Add rnn hyper paramers
EntilZha Jun 20, 2018
f6447a8
Fix elmo guesser
EntilZha Jun 20, 2018
93a85b1
Update hyper
EntilZha Jun 23, 2018
81630f9
Fix write bug
EntilZha Jun 24, 2018
02a9ac9
Update to use scavenger queue
EntilZha Jun 27, 2018
4724eb6
Changes for Trick Me If You Can Interface (#73)
Eric-Wallace-WebHost Jun 27, 2018
cd1c8d1
Fix up hyper parameter stuff
EntilZha Jul 4, 2018
5cc004d
Merge branch 'master' of github.com:Pinafore/qb
EntilZha Jul 4, 2018
ed86dfa
Fix bug in elmo, do gradients correctly, new defaults
EntilZha Jul 10, 2018
e71d534
Fix params
EntilZha Jul 10, 2018
063d96c
Last bad freeze references
EntilZha Jul 10, 2018
d615b0c
Fix cuda error
EntilZha Jul 20, 2018
30a93c4
Fix slurm issues
EntilZha Jul 20, 2018
adff1b2
Make guesser reports (normal) work with slurm
EntilZha Jul 20, 2018
e5dfa2a
Fix and speedup wikipedia loading
EntilZha Jul 20, 2018
e723a29
Full guesser enabled
EntilZha Jul 24, 2018
4ffada7
Add code to choose best guesser
EntilZha Jul 26, 2018
5f97c2d
Better plots, for future make expo runs more granular with char skip
EntilZha Jul 27, 2018
d69ebba
Separate humn and models plot with flag
EntilZha Jul 27, 2018
4247e23
Add anaconda style requirements
EntilZha Aug 29, 2018
679b88e
Category code
EntilZha Sep 10, 2018
5a7f530
Refactor some click code, add vital article fetch
EntilZha Sep 11, 2018
b5b602e
FInish vital article scraping
EntilZha Sep 11, 2018
0b5a7ab
add vital articles to dataset files
EntilZha Sep 11, 2018
55ad17e
Update figures notebook
EntilZha Sep 12, 2018
907d3cc
Change result library
EntilZha Sep 13, 2018
1f0f951
Update to different result library
EntilZha Sep 13, 2018
4df798d
Update for paper generation
EntilZha Sep 22, 2018
ba43ec2
Merge branch 'master' of github.com:Pinafore/qb
EntilZha Sep 22, 2018
a736cbe
Add google ngram data
EntilZha Sep 26, 2018
010b8be
Merge branch 'master' of github.com:Pinafore/qb
EntilZha Sep 26, 2018
5c22ceb
Instructions for how to run our system
EntilZha Oct 4, 2018
6a54b4b
jmlr changes
EntilZha Oct 15, 2018
5622711
Merge branch 'master' of github.com:Pinafore/qb
EntilZha Oct 15, 2018
215d214
Trick pipeline
EntilZha Oct 26, 2018
7e4cb5f
Merge branch 'master' of github.com:Pinafore/qb
EntilZha Oct 26, 2018
dcdd1bc
Code to convert trickme data
EntilZha Oct 26, 2018
7b0b911
Fix mappings
EntilZha Oct 26, 2018
c12095d
update buzzer stuff
ihsgnef Nov 2, 2018
c3df807
Pin
EntilZha Nov 6, 2018
89195e8
update buzzer eval
ihsgnef Nov 8, 2018
f8e95dc
update buzzer eval
ihsgnef Nov 8, 2018
da05c79
updating buzzer eval
ihsgnef Nov 9, 2018
31b90ec
minor
ihsgnef Nov 9, 2018
68ec00e
Remove dash/plotly to make resolution easier
EntilZha Nov 10, 2018
bc5f3c3
Fixed: Model updating parameters on validation (#74)
sweta20 Nov 19, 2018
35c6e73
Update trickme ingestion code
EntilZha Nov 20, 2018
d37c2de
Merge branch 'master' of github.com:Pinafore/qb
EntilZha Nov 20, 2018
bea02f8
buzzer python version (#75)
henryzhao5852 Nov 20, 2018
ce20955
Make changes for trick
EntilZha Nov 30, 2018
2d8b388
Merge branch 'master' of github.com:Pinafore/qb
EntilZha Nov 30, 2018
e8f1040
forgot to update buzzer
ihsgnef Nov 30, 2018
f30a1cf
Merge branch 'master' of github.com:Pinafore/qb
ihsgnef Nov 30, 2018
05e4854
Trick stuff and random cleanup
EntilZha Nov 30, 2018
88f7aff
Merge branch 'master' of github.com:Pinafore/qb
EntilZha Nov 30, 2018
3560ce1
Some logging
EntilZha Nov 30, 2018
4b230bf
Progress
EntilZha Nov 30, 2018
fb4a0ea
Plots close, but rerun code
EntilZha Dec 1, 2018
05869f3
Options!
EntilZha Dec 1, 2018
5691d77
Catch more
EntilZha Dec 1, 2018
433ce8a
Fix issues and add split cli
EntilZha Dec 1, 2018
104ac5b
plotnine to env
EntilZha Dec 1, 2018
65caa8b
Flags to exclude humans
EntilZha Dec 1, 2018
f39490f
Humans
EntilZha Dec 1, 2018
40c11fe
Missing req
EntilZha Dec 3, 2018
662b521
Merge branch 'master' of github.com:Pinafore/qb
EntilZha Dec 3, 2018
0e7d7e6
Use wiki titles to check answers
EntilZha Dec 3, 2018
79b5151
Fix logic error
EntilZha Dec 3, 2018
363a0f2
Google sheets accept
EntilZha Dec 8, 2018
537cbdb
Update
EntilZha Dec 8, 2018
4409c73
Fix dumb anaconda issue
EntilZha Dec 8, 2018
e251a6b
Add round handling
EntilZha Dec 12, 2018
c8d93ba
waiting for Chen's new files
ezubaric Dec 13, 2018
6c90976
working on toy data
ezubaric Dec 14, 2018
b1b756b
reading in files fixed
ezubaric Dec 14, 2018
9d62f4b
reading in files fixed
ezubaric Dec 14, 2018
a1f7603
computer buzzing after neg and at end of sentence not working correctly
ezubaric Dec 14, 2018
0c39d51
human early bug
ezubaric Dec 14, 2018
2bc6042
score reporting
ezubaric Dec 14, 2018
8284657
Update map generation for paper
EntilZha Dec 23, 2018
d921c1c
Merge branch 'master' of github.com:Pinafore/qb
EntilZha Dec 23, 2018
7fccb34
Util for qanta format to qb-api format
EntilZha Jan 16, 2019
eb68f8f
Merge branch 'master' of github.com:Pinafore/qb
EntilZha Jan 16, 2019
22974b7
Handle new plot data
EntilZha Jan 16, 2019
049d1ca
figs
EntilZha Jan 30, 2019
f7f4624
Merge branch 'master' of github.com:Pinafore/qb
EntilZha Jan 30, 2019
4973192
Update figure
EntilZha Jan 30, 2019
8c80337
Moved plot to paper
EntilZha Jan 30, 2019
cc759fb
Separate round 1 and 2, labels
EntilZha Feb 1, 2019
0174db7
Add for future
EntilZha Feb 1, 2019
cc13a81
ug
EntilZha Feb 1, 2019
7bb0db8
Manual colors
EntilZha Feb 1, 2019
b271997
shift
EntilZha Feb 1, 2019
3c9fa0f
Fix duplicate scales
EntilZha Feb 1, 2019
ecf017d
aaag
EntilZha Feb 1, 2019
9c39742
Update rnn.py (#76)
Eric-Wallace Mar 17, 2019
0e57e48
Partially fix
EntilZha Mar 17, 2019
cc86554
Add missing functions
EntilZha Mar 17, 2019
2969a8c
Updates for acf dataset for hs grading
EntilZha Mar 21, 2019
d19990d
DOn't track
EntilZha Mar 21, 2019
bebeb5d
No vs code files
EntilZha Mar 21, 2019
721cd5d
ignore
EntilZha Apr 1, 2019
bf46062
First fixes
EntilZha Apr 2, 2019
203e40d
fix...squash bugs...
EntilZha Apr 2, 2019
75d44da
Final figure changes
EntilZha Apr 2, 2019
088d2e1
Move some figures to script
EntilZha Apr 2, 2019
9454d68
Moving
EntilZha Apr 4, 2019
cda0d62
Changes
EntilZha Apr 15, 2019
3504912
Better
EntilZha Apr 15, 2019
773a195
es interface api (#77)
Eric-Wallace Apr 18, 2019
0334c42
Improve
EntilZha Apr 19, 2019
50b2d74
Merge branch 'master' of github.com:Pinafore/qb
EntilZha Apr 19, 2019
52fbd8d
Figure tweak
EntilZha Apr 29, 2019
5cff5de
trick verification script
EntilZha Apr 30, 2019
3968c2f
Merging code
EntilZha May 7, 2019
4c3196b
Update paths to not use "reserved" expo filename
EntilZha May 7, 2019
caaac0b
experiment updates
EntilZha May 8, 2019
a44e1ac
space
EntilZha May 14, 2019
8f0dd41
Final tacl figure code
EntilZha May 16, 2019
cb18858
updates
ihsgnef Mar 25, 2020
7414285
get buzzer running again
ihsgnef Mar 29, 2020
29f973c
Moved to pinafore-papers
EntilZha Apr 13, 2020
0461585
Delete old files
EntilZha Apr 13, 2020
d94964d
Code for diversity analysis
EntilZha Apr 13, 2020
fa681ff
Don't need the slack notifier anymore
EntilZha Jul 28, 2020
989737a
Update envs
EntilZha Jul 28, 2020
4cff576
CLI for other downloads
EntilZha Jul 28, 2020
fecdf5d
docs
EntilZha Jul 28, 2020
043dc9e
changes
EntilZha Jul 30, 2020
1507eb4
Fix poetry config, delete unused file, add proto
EntilZha Jul 30, 2020
32e79d2
Black code format
EntilZha Jul 30, 2020
19146f2
Fix MRO error by updating dependency
EntilZha Sep 15, 2020
5370ecc
add unicode fixing to question preprocessing (#80)
Maosef Sep 21, 2020
36b62a5
Add prompts (#81)
Maosef Sep 29, 2020
24a0b3d
Add code to generate example
EntilZha Jan 18, 2021
056ba54
Add empirical distribution plot
EntilZha Jan 22, 2021
64b1f2e
Non-working line size
EntilZha Jan 22, 2021
d18e715
Line thickness fix
EntilZha Jan 22, 2021
5301674
Update README for QantaDataset task troubleshooting (#82)
nhatsmrt Mar 3, 2021
2c08401
Update abstract.py (#94)
GeneralPoxter Apr 21, 2021
2efb349
Update torch indexing and use Tensor.cpu() (#91)
GeneralPoxter Apr 21, 2021
a560c96
Update README.md / setup instructions and process (#97)
GeneralPoxter Apr 27, 2021
dd473ef
Update issue templates
EntilZha May 5, 2021
32b5a12
add the required pyspark and pyfunctional dependencies to poetry
ihsgnef May 12, 2021
1f65d2d
Update pyproject.toml (#100)
GeneralPoxter Jun 3, 2021
0c584e4
remove unused quiz_db_session env variable
Nonameentered Jul 22, 2021
b2d2614
update scikit to fix broken install, poetry lock update
Nonameentered Jul 24, 2021
f2d9a6d
update quizdb to match old dataset format
Nonameentered Jul 24, 2021
12da005
update category assignments
Nonameentered Jul 24, 2021
d85238d
add classifier
Nonameentered Jul 29, 2021
b1ad86a
add jenv and egg-info to gitignore
Nonameentered Jul 29, 2021
b3365e2
restore old ds version for now
Nonameentered Aug 6, 2021
9420e57
restore old quizdb dump date for now
Nonameentered Aug 10, 2021
36fc430
Create bad_data.md
ezubaric Oct 24, 2021
c6c4280
Initial dynabench translation file
EntilZha Nov 23, 2021
a5afd0c
Add duplicate checking code
EntilZha Jan 26, 2022
586f13d
Use .join to create string
maldil Jun 27, 2022
9 changes: 9 additions & 0 deletions .github/ISSUE_TEMPLATE/bad_data.md
@@ -0,0 +1,9 @@

What's the Qanta ID of the data?
===========

Can you find other questions with the same issue?
===========

What would be the correct version of the data?
===========
32 changes: 32 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,32 @@
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''

---

**Describe the bug**
A clear and concise description of what the bug is.

**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

**Expected behavior**
A clear and concise description of what you expected to happen.

**Screenshots**
If applicable, add screenshots to help explain your problem.

**Desktop (please complete the following information):**
- OS: [e.g. iOS]
- Browser [e.g. chrome, safari]
- Version [e.g. 22]

**Additional context**
Add any other context about the problem here.
20 changes: 20 additions & 0 deletions .github/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,20 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''

---

**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

**Describe the solution you'd like**
A clear and concise description of what you want to happen.

**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.

**Additional context**
Add any other context or screenshots about the feature request here.
26 changes: 25 additions & 1 deletion .gitignore
@@ -3,6 +3,9 @@
.idea
data/external
data/internal/naqt.db
data/internal/naqt-unused.db
.data/
.vector_cache/
output/
metastore_db/
*.o
@@ -18,6 +21,8 @@ terraform/ssh-keys/
eip.tf
readable.txt
aws_gpu_override.tf
aws_p2_1_override.tf
aws_p3_2_override.tf
aws_small_override.tf
aws_x1_32_override.tf
aws_x1_16_override.tf
@@ -30,6 +35,8 @@ security_groups.tf
cynch.json
naqt_db.tf
qanta.hcl
qanta-tmp.yaml
qanta.yaml
qb.egg-info/
qanta_web/.vscode
qanta_web/*.sqlite3
@@ -42,5 +49,22 @@ build/
dist/
tagme.ipynb
tb-logs/
*.ipynb
.terraform
.ignore
tags
.mypy_cache/
tmp*
luigi/*.pid
*.pid
elasticsearch.yml
*.ipynb
.vscode
pytype_output
output.tar.gz

# jenv
.java-version


# Distribution / packaging
*.egg-info/