Hi, thank you for sharing this interesting and insightful work. I am reaching out to inquire about the perplexity scores for the LLAMA2-7b model as reported in Table 2. I have followed the evaluation script detailed in your repository to assess the model using the settings provided:
The results I obtained were as follows:
Additionally, with a stride of 4, I observed:
I was wondering if you could provide some insight into the discrepancy between my results and the reported scores. Is there an aspect of the evaluation setup that I might have overlooked, or are such differences within an expected range under certain conditions (e.g., GPU device)?
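For concreteness, here is a minimal sketch of the sliding-window perplexity loop I am running, following the standard Hugging Face `transformers` pattern; the checkpoint id, context length, and per-token weighting are my own assumptions and may not match your exact script. `stride` is the number of tokens the window advances per step (4 in my second run above):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint id; substitute the exact model under evaluation.
MODEL_ID = "meta-llama/Llama-2-7b-hf"
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16).to(DEVICE)
model.eval()


def sliding_window_ppl(text: str, max_length: int = 4096, stride: int = 4) -> float:
    """Sliding-window perplexity in the style of the Hugging Face guide.

    Smaller strides give each scored token more left context but require
    more forward passes. The per-window token count is a slight
    approximation (the label shift drops one position in the first window).
    """
    encodings = tokenizer(text, return_tensors="pt")
    seq_len = encodings.input_ids.size(1)

    nll_sum, n_tokens = 0.0, 0
    prev_end = 0
    for begin in range(0, seq_len, stride):
        end = min(begin + max_length, seq_len)
        trg_len = end - prev_end  # tokens scored for the first time in this window
        input_ids = encodings.input_ids[:, begin:end].to(DEVICE)
        target_ids = input_ids.clone()
        target_ids[:, :-trg_len] = -100  # ignore context tokens already scored

        with torch.no_grad():
            loss = model(input_ids, labels=target_ids).loss  # mean NLL over scored tokens

        nll_sum += loss.item() * trg_len
        n_tokens += trg_len
        prev_end = end
        if end == seq_len:
            break

    return float(torch.exp(torch.tensor(nll_sum / n_tokens)))
```

If your script aggregates differently (e.g., averaging per-window losses rather than weighting by scored tokens, or masking the windows another way), that alone could plausibly shift the resulting perplexity.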