Fail to reproduce the work #32
Comments
May I ask which allennlp version this project uses? I tried 2.2.0 and 0.9.0, but both lead to errors.
I tried the pinned version (the one specified in environment.yml), and that also failed with the error shared above. Please provide a working environment.yml.
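When reporting a version mismatch like this, it helps to post exactly what is installed in the active environment. A minimal sketch using only the standard library follows. The package list is an assumption (allennlp and transformers are named in this thread; torch is added because allennlp depends on it), and the helper name `installed_versions` is hypothetical.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not present in the active environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed package list; adjust to whatever environment.yml pins.
    report = installed_versions(["allennlp", "transformers", "torch"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output next to the traceback makes it easier to compare against the versions pinned in environment.yml.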
I think there might be an issue with the publicly available datasets. I get: ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden
@gmarcial44 are you using the latest-allennlp branch? If so, I was able to get around this issue by replacing the dataset definitions with the following:
NER_DATASETS = {
    "ncbi": {
        "data_dir": "/home/suching/scibert/data/ner/NCBI-disease/",
    },
    "sciie": {
        "data_dir": "/home/suching/scibert/data/ner/sciie/"
    },
    "jnlpba": {
        "data_dir": "/home/suching/scibert/data/ner/JNLPBA/"
    },
    "bc5cdr": {
        "data_dir": "/home/suching/scibert/data/ner/bc5cdr/"
    }
}
CLASSIFICATION_DATASETS = {
    "chemprot": {
        "data_dir": "https://s3-us-west-2.amazonaws.com/allennlp/dont_stop_pretraining/data/chemprot/",
        "dataset_size": 4169
    },
    "rct-20k": {
        "data_dir": "https://s3-us-west-2.amazonaws.com/allennlp/dont_stop_pretraining/data/rct-20k/",
        "dataset_size": 180040
    },
    "rct-sample": {
        "data_dir": "https://s3-us-west-2.amazonaws.com/allennlp/dont_stop_pretraining/data/rct-sample/",
        "dataset_size": 500
    },
    "citation_intent": {
        "data_dir": "https://s3-us-west-2.amazonaws.com/allennlp/dont_stop_pretraining/data/citation_intent/",
        "dataset_size": 1688
    },
    "sciie": {
        "data_dir": "https://s3-us-west-2.amazonaws.com/allennlp/dont_stop_pretraining/data/sciie/",
        "dataset_size": 3219
    },
    "ag": {
        "data_dir": "https://s3-us-west-2.amazonaws.com/allennlp/dont_stop_pretraining/data/ag/",
        "dataset_size": 115000
    },
    "hyperpartisan_news": {
        "data_dir": "https://s3-us-west-2.amazonaws.com/allennlp/dont_stop_pretraining/data/hyperpartisan_news/",
        "dataset_size": 500
    },
    "imdb": {
        "data_dir": "https://s3-us-west-2.amazonaws.com/allennlp/dont_stop_pretraining/data/imdb/",
        "dataset_size": 20000
    },
    "amazon": {
        "data_dir": "https://s3-us-west-2.amazonaws.com/allennlp/dont_stop_pretraining/data/amazon/",
        "dataset_size": 115251
    }
}
DATASETS = {"NER": NER_DATASETS, "CLASSIFICATION": CLASSIFICATION_DATASETS}
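For anyone adapting the mapping above, here is a rough sketch of how a `data_dir` entry could be turned into per-split file locations. The split names (`train`, `dev`, `test`), the `.jsonl` suffix, and the helper `split_paths` are assumptions for illustration, not something confirmed in this thread. Only one dataset entry is copied so the example runs on its own.

```python
# Trimmed copy of the mapping above, just enough to run the example.
CLASSIFICATION_DATASETS = {
    "chemprot": {
        "data_dir": "https://s3-us-west-2.amazonaws.com/allennlp/dont_stop_pretraining/data/chemprot/",
        "dataset_size": 4169,
    },
}


def split_paths(dataset, splits=("train", "dev", "test"), suffix=".jsonl"):
    """Build one location per split from the dataset's data_dir.

    Split names and suffix are assumed; check them against the actual files.
    Works for both URLs and local directories because it only joins strings.
    """
    data_dir = CLASSIFICATION_DATASETS[dataset]["data_dir"]
    if not data_dir.endswith("/"):
        data_dir += "/"
    return {split: data_dir + split + suffix for split in splits}


if __name__ == "__main__":
    for split, location in split_paths("chemprot").items():
        print(split, location)
```

Printing the resolved locations before training makes it quick to see which URL or path the 403 is coming from.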
Could you please check the implementation steps you provided in the README file?
I followed your instructions but found it very hard to reproduce this work. Some errors came up, such as a version inconsistency between allennlp and transformers, which then led to an error like:
subprocess.CalledProcessError: Command 'allennlp train training_config/classifier.jsonnet --include-package dont_stop_pretraining -s model_logs\citation_intent_base' returned non-zero exit status 1.
Or did I get some steps wrong in my implementation? It is really confusing and frustrating.
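A `CalledProcessError` only reports the exit status, so the real cause is usually further up in the command's own output. A minimal sketch for surfacing it: run the command directly and keep its stderr. The helper name `run_and_report` is hypothetical, and the demo uses a deliberately failing Python one-liner as a stand-in for the `allennlp train` command.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd (a list of arguments) and return (exit status, captured stderr)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Show the last lines of stderr, where the underlying traceback usually is.
        tail = "\n".join(result.stderr.splitlines()[-20:])
        print(f"command failed with exit status {result.returncode}:\n{tail}")
    return result.returncode, result.stderr


if __name__ == "__main__":
    # Stand-in for the failing command; to debug the real run, pass e.g.
    # ["allennlp", "train", "training_config/classifier.jsonnet",
    #  "--include-package", "dont_stop_pretraining", "-s", "<serialization dir>"]
    run_and_report([sys.executable, "-c", "raise RuntimeError('example failure')"])
```

Posting the captured stderr rather than only the `CalledProcessError` line gives maintainers something to act on.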