Feedback #1
base: feedback
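as-is: lr and epoch run with their default values. to-be: add code so that lr, epoch, etc. can be customized #5
feat: add functionality code to argument.py
[FEAT] Simplify baseline code #4
Change transformers.set_seed -> utils_qa.set_seed; unify the seed by making it a constant and relying on the default argument value of the set_seed function [FEAT] Simplify baseline code #4
evaluation_strategy -> eval_strategy [FEAT] Simplify baseline code #4
Feat/Simplify baseline code
- Handle the seed as a training argument instead of managing it as a constant - (the seed will be managed through the arguments of the run script) - Add a deterministic argument to the set_seed function [FEAT] Simplify baseline code #4
- Apply a different return_token_type_ids per model #4
Feat/Simplify baseline code - wrap-up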
Feat/dense eval
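- Add an option to integration_pipeline that switches automatically from train to test by changing the retriever_file name - Fix typos
Exp/rerank Fix rerank-related errors and add transfer-learning arguments
Exp/analysis retrieval Analyze Retrieval and Reranker performance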
Feat/dense tuning
[feat] Implement GPT-based ensemble code
merge from Origin/develop to main
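The commits above mention a project-local set_seed with a deterministic argument. The following is only a rough sketch of what such a helper could look like; the signature, the default seed of 42, and the optional handling of numpy and torch are assumptions, not the repository's actual utils_qa.set_seed.

```python
import random


def set_seed(seed: int = 42, deterministic: bool = False) -> None:
    """Seed the common random number generators (hypothetical helper)."""
    random.seed(seed)

    # numpy and torch are treated as optional here so the sketch runs anywhere.
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass

    try:
        import torch
    except ImportError:
        return

    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
    if deterministic:
        # Trade some speed for reproducible cuDNN kernels.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False


if __name__ == "__main__":
    set_seed(7)
    first = random.random()
    set_seed(7)
    print(first == random.random())
```

Passing the seed in from the run script's arguments, as the later commit describes, would then just mean calling set_seed(args.seed) once at startup, where args.seed is whatever name the script uses.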
It would be good to add .vscode as well.
How about starting from the Python gitignore template that GitHub provides?
https://github.com/github/gitignore/blob/main/Python.gitignore
    --model_name_or_path klue/bert-base \
    --config_name None \
    --tokenizer_name None \
    \
The lone \ in the middle looks unnecessary.
    @@ -0,0 +1,70 @@
    # How to run
Dependency installation is missing. Please also record the Python version, CUDA version, and so on.
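One way to collect the versions requested here is a small script whose output can be pasted into the README. This is a minimal sketch: the function name describe_environment is hypothetical, and torch is treated as optional so the script still runs where it is not installed.

```python
import platform
import sys


def describe_environment() -> dict:
    """Collect interpreter and (if available) torch/CUDA versions."""
    info = {
        "python": platform.python_version(),
        "platform": sys.platform,
        "torch": None,
        "cuda": None,
    }
    try:
        import torch

        info["torch"] = torch.__version__
        # torch.version.cuda is None for CPU-only builds.
        info["cuda"] = torch.version.cuda
    except ImportError:
        pass
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```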
    ## Run with the integrated pipeline
    ```bash
    # integrated train-eval-inference pipeline
    python integration_pipline.py ./config/integration.json
Suggested change:
    - python integration_pipline.py ./config/integration.json
    + python integration_pipeline.py ./config/integration.json
    @@ -0,0 +1,37 @@
    {
      "model_name_or_path": "klue/roberta-small",
These look somewhat different from the values recorded in the actual report.
    if manual_mode:
        return
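It would be good to emit a light log message indicating that the code is running in manual_mode.
- The code uses print throughout; using a logger is recommended instead.
- https://github.com/Delgan/loguru makes this easy to set up.
- It helps to separate messages into debug and info levels.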
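A minimal sketch of this suggestion follows. It uses the standard library logging module rather than loguru so that it runs without extra dependencies; the function name build_index, the logger name, and the messages are hypothetical.

```python
import logging

logger = logging.getLogger("retrieval")  # hypothetical logger name


def build_index(manual_mode: bool = False) -> bool:
    """Hypothetical stand-in for the function containing the manual_mode check."""
    if manual_mode:
        # info: something a user running the pipeline should see.
        logger.info("manual_mode is enabled; skipping automatic setup")
        return False
    # debug: detail that is only useful while developing.
    logger.debug("manual_mode is disabled; running automatic setup")
    return True


if __name__ == "__main__":
    logging.basicConfig(
        level=logging.DEBUG,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    build_index(manual_mode=True)
```

With loguru the call sites would look the same (logger.info, logger.debug) after `from loguru import logger`, without the basicConfig step.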
    # set up the elasticsearch server
    def connect(self):
        es = Elasticsearch(
            "http://localhost:9200", timeout=30, max_retries=10, retry_on_timeout=True
Please store "http://localhost:9200" in an attribute such as self.host when the object is first initialized.
    print("Index already exists.")
    return False

    with open(self.setting_path, "r") as f:
Suggested change:
    - with open(self.setting_path, "r") as f:
    + with open(self.setting_path, "r", encoding="utf-8") as f:
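Without an explicit encoding, open() falls back to the platform's default, which is not always UTF-8 (for example on some Windows setups), so a settings file containing Korean text may fail to load. The sketch below assumes the settings file is JSON, which the diff does not confirm; load_settings and the sample key are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_settings(path: str) -> dict:
    """Read a settings file with an explicit encoding (file format is assumed JSON)."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "setting.json"
        # Write non-ASCII content as real UTF-8 rather than \u escapes.
        path.write_text(
            json.dumps({"description": "한국어 분석기"}, ensure_ascii=False),
            encoding="utf-8",
        )
        print(load_settings(str(path)))
```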
    ngram_range=(1, 2),
    max_features=50000,
Please pull these out into arguments of __init__.
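A sketch of what this could look like is shown here. It assumes the two values are passed to scikit-learn's TfidfVectorizer, which the diff hunk does not show; the class name SparseRetriever and the method build_vectorizer are hypothetical.

```python
from typing import Tuple


class SparseRetriever:  # hypothetical class name
    def __init__(
        self,
        ngram_range: Tuple[int, int] = (1, 2),
        max_features: int = 50000,
    ) -> None:
        # Former hardcoded values become defaults that callers can override.
        self.ngram_range = ngram_range
        self.max_features = max_features

    def build_vectorizer(self):
        # Imported lazily so the class can be constructed without scikit-learn.
        from sklearn.feature_extraction.text import TfidfVectorizer

        return TfidfVectorizer(
            ngram_range=self.ngram_range,
            max_features=self.max_features,
        )


if __name__ == "__main__":
    retriever = SparseRetriever(max_features=10000)
    print(retriever.ngram_range, retriever.max_features)
```

Keeping the old values as defaults means existing call sites behave the same, while experiments can vary them from a config file or the command line.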
    # Data manipulation and analysis
    pandas
    numpy==1.24.1

    # Machine Learning libraries
    scikit-learn
    torch==1.13
    datasets==2.15.0
    transformers==4.25.1
    faiss-gpu

    # Progress and logging utilities
    tqdm
    wandb

    # Information retrieval libraries
    rank_bm25
    elasticsearch
Please be sure to pin the version of every library.
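A quick way to find entries that still need pinning is to scan the requirements file for lines without an exact `==` version. This is a minimal sketch: find_unpinned is a hypothetical helper, and it deliberately ignores more complex requirement syntax such as extras, markers, or URLs.

```python
from typing import List


def find_unpinned(requirements_text: str) -> List[str]:
    """Return requirement lines that lack an exact '==' version pin."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    sample = """
    # Data manipulation and analysis
    pandas
    numpy==1.24.1

    # Progress and logging utilities
    tqdm
    wandb
    """
    print(find_unpinned(sample))
```

Run against the requirements shown in the diff, this would flag pandas, scikit-learn, faiss-gpu, tqdm, wandb, rank_bm25, and elasticsearch.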
👋! GitHub Classroom created this pull request as a place for your teacher to leave feedback on your work. It will update automatically. Don’t close or merge this pull request, unless you’re instructed to do so by your teacher.
In this pull request, your teacher can leave comments and feedback on your code. Click the Subscribe button to be notified if that happens.
Click the Files changed or Commits tab to see all of the changes pushed to the default branch since the assignment started. Your teacher can see this too.
Notes for teachers
Use this PR to leave feedback. Here are some tips:
For more information about this pull request, read “Leaving assignment feedback in GitHub”.
Subscribed: @cukminseo @Sujinkim-625 @yeseoLee @hsmin9809 @koreannn @gayeon7877