2025
BeyondWeb: Lessons From Scaling Synthetic Data for Trillion-Scale Pretraining
P. Maini, V. Dorna, Parth Doshi, A. Carranza, F. Pan, J. Urbanek, P. Burstein, A. Fang, A. Deng, A. Abbas, B. Larsen, C. Blakeney, C. Bannur, C. Baek, D. Teh, D. Schwab, H. Mongstad, H. Yin, J. Wills, K. Mentzer, L. Merrick, R. Monti, R. Adiga, S. Joshi, S. Das, Zhengtao Wang, B. Gaza, A. Morcos, M. Leavitt