2026

Mamba-3: Improved Sequence Modeling Using State Space Principles

Aakash Lahoti, Kevin Y. Li, Berlin Chen, Caitlin Wang, Aviv Bick, J. Zico Kolter, Tri Dao, Albert Gu


AI summary

This paper introduces Mamba-3, an improved state space model (SSM) that combines exponential-trapezoidal discretization, a complex-valued state update, and a multi-input, multi-output (MIMO) formulation. It reports significant gains in retrieval, state tracking, and language modeling while improving inference efficiency.

Main Contributions

  • Introduces Exponential-Trapezoidal Discretization, a novel technique for discretizing time-varying, selective SSMs, which formalizes previous heuristic discretizations and provides a more expressive generalization.
  • Proposes a Complex-valued State Space Model for Mamba-3, enabling richer state tracking and overcoming limitations of real-valued linear models on synthetic tasks.
  • Develops a Multi-Input, Multi-Output (MIMO) SSM formulation to improve FLOP efficiency during decoding, increasing arithmetic intensity without compromising decoding speed.
  • Achieves gains in average downstream language modeling accuracy at the 1.5B scale: 0.6 percentage points over the next best model (Gated DeltaNet), rising to 1.8 points with the MIMO variant.
  • Demonstrates improved hardware utilization and comparable perplexity to Mamba-2 with half its state size, advancing the performance-efficiency frontier.
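As a rough illustration of the discretization idea in the first contribution, the sketch below compares two ways of discretizing a scalar linear SSM h'(t) = a·h(t) + b·x(t). Both use the exact decay factor exp(a·dt); they differ only in how the input integral is approximated: with a single right-endpoint sample, or with a trapezoidal average of both endpoints. This is a textbook construction under simplifying assumptions (constant `a`, `b`, and `dt`, no input selectivity), not the paper's exact recurrence, which is not stated on this page. The function name `simulate` and the rule labels are hypothetical.

```python
import numpy as np

def simulate(a, b, x, dt, rule):
    """Discretize h' = a*h + b*x with exact decay and a chosen quadrature rule.

    rule="endpoint":  input integral ~ dt * b * x[k]
    rule="trapezoid": input integral ~ dt/2 * b * (alpha * x[k-1] + x[k])
    (Illustrative only; constant a, b, dt are simplifying assumptions.)
    """
    alpha = np.exp(a * dt)          # exact state decay over one step
    h = np.zeros(len(x))
    for k in range(1, len(x)):
        if rule == "endpoint":
            u = dt * b * x[k]
        elif rule == "trapezoid":
            u = 0.5 * dt * b * (alpha * x[k - 1] + x[k])
        else:
            raise ValueError(f"unknown rule: {rule}")
        h[k] = alpha * h[k - 1] + u
    return h

# Test problem with a known closed form: h' = -h + sin(t), h(0) = 0,
# whose solution is h(t) = (sin t - cos t + exp(-t)) / 2.
a, b, dt = -1.0, 1.0, 0.1
t = np.arange(0, 51) * dt
x = np.sin(t)
exact = 0.5 * (np.sin(t) - np.cos(t) + np.exp(-t))

err_endpoint = np.max(np.abs(simulate(a, b, x, dt, "endpoint") - exact))
err_trapezoid = np.max(np.abs(simulate(a, b, x, dt, "trapezoid") - exact))
print("endpoint error: ", err_endpoint)
print("trapezoid error:", err_trapezoid)
```

On this smooth test input the trapezoidal variant tracks the closed-form solution more closely, because it uses the previous input as well as the current one. The only change to the recurrence is one extra term per step.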

Abstract

Scaling inference-time compute has emerged as an important driver of LLM performance, making inference efficiency a central focus of model design alongside model quality. While current Transformer models deliver strong model quality, their quadratic compute and linear memory make inference expensive. This has spurred the development of sub-quadratic models with reduced (linear) compute and constant memory requirements. However, many recent linear models trade off model quality and capability for algorithmic efficiency, failing on tasks such as state tracking. Moreover, their theoretically linear inference remains hardware-inefficient in practice. Guided by an inference-first perspective, we introduce three core methodological improvements inspired by the state space model (SSM) viewpoint of linear models. We combine: (1) a more expressive recurrence derived from SSM discretization, (2) a complex-valued state update rule that enables richer state tracking, and (3) a multi-input, multi-output (MIMO) formulation for better model performance without increasing decode latency. Together with architectural refinements, our Mamba-3 model achieves significant gains across retrieval, state-tracking, and downstream language modeling tasks. At the 1.5B scale, Mamba-3 improves average downstream accuracy by 0.6 percentage points compared to the next best model (Gated DeltaNet), with Mamba-3's MIMO variant further improving accuracy by another 1.2 points for a total 1.8 point gain. Across state-size experiments, Mamba-3 achieves comparable perplexity to Mamba-2 despite using half of its predecessor's state size. Our evaluations demonstrate Mamba-3's ability to advance the performance-efficiency frontier.
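To give some intuition for point (2), the toy sketch below shows how a complex-valued linear state update can track a simple discrete state: the parity of a bit stream. Each 1-bit rotates the state by π on the unit circle, so the state ends at +1 for even parity and at -1 for odd parity. Parity is chosen here only as a familiar state-tracking example. The update rule and the name `parity_by_rotation` are assumptions made for illustration, not the Mamba-3 update itself.

```python
import numpy as np

def parity_by_rotation(bits):
    """Track parity with a scalar complex linear recurrence.

    h_t = exp(i * pi * bit_t) * h_{t-1}, starting from h_0 = 1.
    A 1-bit multiplies the state by -1 (a rotation by pi); a 0-bit leaves it
    unchanged. (Hypothetical toy rule, not the paper's formulation.)
    """
    h = 1.0 + 0.0j
    for bit in bits:
        h = np.exp(1j * np.pi * bit) * h
    # Read out: real part is +1 for even parity, -1 for odd parity.
    return int(round((1.0 - h.real) / 2.0))

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=64)
print(parity_by_rotation(bits), int(bits.sum() % 2))  # the two values agree
```

The recurrence stays linear in the state; only the transition coefficient is complex and input-dependent. A transition restricted to non-negative real values can only shrink or preserve the state, so it cannot flip sign in this way.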




Tags

ICLR2026
