Reproducing the results with LLaMA and TriviaQA (Figure 8) #12
Hi,
Thank you for the excellent paper and for providing the code! I have been trying to reproduce the results from Figure 8 of the paper using LLaMA-7B and LLaMA-13B and the TriviaQA dataset I downloaded using the command in the README.
However, I get the following values:
7B:
0 docs: 50.8, 1 doc: 54.1, 2 docs: 55.9, 3 docs: 56.4
13B:
0 docs: 57.8, 1 doc: 58.8, 2 docs: 59.8, 3 docs: 60.4
Can you please provide some insights/information that explains this discrepancy?
(The numbers for 1-3 documents are similar, but there is a ~3% gap for 0 documents.)
Screenshot.2024-07-23.at.12.29.37.PM.png: https://github.com/user-attachments/assets/f6750b23-43ef-41cc-bf75-d5927fae8dc2
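For context, here is a minimal sketch (not the repository's actual evaluation script) of the kind of measurement Figure 8 reports: answer each TriviaQA question with the top-k retrieved passages prepended to the prompt (k = 0 is the closed-book setting) and score exact match. The model path, prompt format, greedy decoding with 16 new tokens, and the "question"/"answers"/"retrieved" field names are all illustrative assumptions, not the repo's interface.

```python
# Illustrative sketch of a 0-to-k retrieved-documents exact-match evaluation.
import re
import string
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "/path/to/llama-7b"  # hypothetical local LLaMA checkpoint (HF format)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, torch_dtype=torch.float16, device_map="auto"
)

def normalize(text):
    # Standard open-domain-QA answer normalization: lowercase, drop punctuation and articles.
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def answer(question, passages, k):
    # Prepend the top-k retrieved passages; k = 0 gives the closed-book setting.
    context = "\n\n".join(passages[:k])
    prompt = (context + "\n\n" if context else "") + f"Question: {question}\nAnswer:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=16, do_sample=False)
    completion = tokenizer.decode(
        out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    return completion.strip().split("\n")[0]

def exact_match(examples, k):
    # examples: list of dicts with "question", "answers" (gold aliases), "retrieved" (passage texts)
    hits = 0
    for ex in examples:
        pred = normalize(answer(ex["question"], ex["retrieved"], k))
        hits += any(pred == normalize(gold) for gold in ex["answers"])
    return 100.0 * hits / len(examples)
```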
Hey Yasaman,
Thanks for reaching out!
I'm not sure what causes this discrepancy - two directions to look at:
1. Are results for NQ the same as reported?
2. Did you use LLaMA 2? The results in the paper are for LLaMA 1 (one way to check which generation a checkpoint is appears in the sketch below).
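On point 2, a quick way to check which LLaMA generation a local checkpoint corresponds to is to inspect its Hugging Face config; this sketch assumes the weights have been converted to HF format, and the path is a placeholder. LLaMA 1 was trained with a 2048-token context window and LLaMA 2 with 4096, so max_position_embeddings is a rough (not definitive) tell.

```python
# Rough check of a local checkpoint's generation via its HF config.
from transformers import AutoConfig

MODEL_PATH = "/path/to/llama-7b"  # hypothetical local checkpoint directory

cfg = AutoConfig.from_pretrained(MODEL_PATH)
# LLaMA 1 configs typically report max_position_embeddings = 2048,
# LLaMA 2 configs report 4096.
print(cfg.model_type, cfg.max_position_embeddings)
```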
Thank you for your response!
LLaMA1-7B: [results screenshot] LLaMA1-13B: [results screenshot]
Hi again, I just wanted to follow up on this and check if there have been any updates about this discrepancy!