How to Train Data-Efficient LLMs

This week's paper is How to Train Data-Efficient LLMs. It uses the zero-shot reasoning capabilities of instruction-tuned LLMs to directly assess the quality of individual training examples, and it uses density sampling to select a diverse training corpus.
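To make the two ideas concrete, here is a minimal sketch, not the paper's implementation. It assumes each example already has a quality score (imagined here as coming from prompting an instruction-tuned LLM, which is not shown) and a small embedding vector. The function names, the Gaussian kernel, the bandwidth, and the inverse-density weighting are all illustrative assumptions.

```python
import math
import random


def select_by_quality(scores, k):
    """Return the indices of the k highest-scoring examples.

    `scores` is assumed to hold one quality score per example,
    e.g. produced by an LLM judging each example (hypothetical).
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def kernel_density(points, bandwidth=1.0):
    """Gaussian kernel density score for each point (O(n^2), fine for a sketch)."""
    scores = []
    for p in points:
        total = 0.0
        for q in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        scores.append(total / len(points))
    return scores


def inverse_density_sample(points, k, bandwidth=1.0, seed=0):
    """Pick k distinct indices, favoring points in sparse regions."""
    density = kernel_density(points, bandwidth)
    # Each point contributes exp(0) = 1 to its own density, so density > 0.
    weights = [1.0 / d for d in density]
    rng = random.Random(seed)
    # Weighted sampling without replacement (Efraimidis-Spirakis keys):
    # draw u ~ U(0, 1), use u ** (1 / weight), keep the k largest keys.
    keys = [rng.random() ** (1.0 / w) for w in weights]
    ranked = sorted(range(len(points)), key=lambda i: keys[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    # Three near-duplicate embeddings and one outlier.
    embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
    print(kernel_density(embeddings))
    print(inverse_density_sample(embeddings, k=2))
    print(select_by_quality([0.1, 0.9, 0.5], k=2))
```

The quality filter keeps whatever the scorer rates highest, while the density sampler gives under-represented regions of embedding space a better chance of being picked, which is one simple way to favor diversity.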