If you attempt to run any of the example code from the WikiConv documentation, e.g., wikiconv_russian_2004 = Corpus(filename=download("wikiconv-russian-2004")), the code immediately crashes with a 404 error. From my debugging, the cause is a simple typo in util.py: the _get_wikiconv_year_info function builds the download URL with the string "corpus_zipped", whereas the actual URL on the zissou server uses "corpus-zipped" (dash instead of underscore).
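To make the one-character difference concrete, here is a minimal sketch of the fix applied to a URL string. It is not ConvoKit code: the helper name fix_wikiconv_url and the example.invalid host are hypothetical placeholders, and the real URL layout inside _get_wikiconv_year_info may differ. Only the two segment strings come from the report above.

```python
# Illustrative sketch only; this is not the ConvoKit implementation.
BROKEN_SEGMENT = "corpus_zipped"  # string currently used in util.py (per this report)
FIXED_SEGMENT = "corpus-zipped"   # path segment that exists on the server (per this report)


def fix_wikiconv_url(url: str) -> str:
    """Swap the underscore path segment for the dashed one (hypothetical helper)."""
    return url.replace(f"/{BROKEN_SEGMENT}/", f"/{FIXED_SEGMENT}/")


if __name__ == "__main__":
    # Placeholder host and path, not the real server address.
    broken = "https://example.invalid/wikiconv-corpus/corpus_zipped/2004/"
    print(fix_wikiconv_url(broken))
```

The actual fix would presumably be the same substitution made directly in the string literal inside _get_wikiconv_year_info, rather than a post-hoc rewrite like this.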
Steps to reproduce
- Run download("wikiconv-russian-2004") (or any other wikiconv corpus)
- Observe that this immediately dies with a 404 error
Additional information
This was tested on the latest ConvoKit release (4.1.1) running in a Python 3.11.15 conda environment on a Linux server. The typo is also still present as of the most recent commits on the ConvoKit GitHub.