2024
Language Models Scale Reliably With Over-Training and on Downstream Tasks
S. Y. Gadre, G. Smyrnis, Vaishaal Shankar, Suchin Gururangan, M. Wortsman, R. Shao, J. Mercat, A. Fang, Jeffrey Li, S. Keh, R. Xin, Marianna Nezhurina, I. Vasiljevic, Jenia Jitsev, L. Soldaini, Alexandros G. Dimakis, G. Ilharco, P. W. Koh, S. Song, T. Kollar, Y. Carmon, A. Dave, R. Heckel, Niklas Muennighoff, Ludwig Schmidt