2024

Language Models Scale Reliably With Over-Training and on Downstream Tasks

S. Y. Gadre, G. Smyrnis, Vaishaal Shankar, Suchin Gururangan, M. Wortsman, R. Shao, J. Mercat, A. Fang, Jeffrey Li, S. Keh, R. Xin, Marianna Nezhurina, I. Vasiljevic, Jenia Jitsev, L. Soldaini, Alexandros G. Dimakis, G. Ilharco, P. W. Koh, S. Song, T. Kollar, Y. Carmon, A. Dave, R. Heckel, Niklas Muennighoff, Ludwig Schmidt
