| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| Humanity's Last Exam | L Phan, A Gatti, Z Han, N Li, J Hu, H Zhang, CBC Zhang, M Shaaban, ... | arXiv preprint arXiv:2501.14249, 2025 | 304 | 2025 |
| NVILA: Efficient frontier visual language models | Z Liu, L Zhu, B Shi, Z Zhang, Y Lou, S Yang, H Xi, S Cao, Y Gu, D Li, X Li, ... | Proceedings of the Computer Vision and Pattern Recognition Conference, 4122-4134, 2024 | 147* | 2024 |
| Training transformers with 4-bit integers | H Xi, C Li, J Chen, J Zhu | Advances in Neural Information Processing Systems 36, 49146-49168, 2023 | 86 | 2023 |
| SpargeAttn: Accurate sparse attention accelerating any model inference | J Zhang, C Xiang, H Huang, J Wei, H Xi, J Zhu, J Chen | arXiv preprint arXiv:2502.18137, 2025 | 85* | 2025 |
| Sparse VideoGen: Accelerating video diffusion transformers with spatial-temporal sparsity | H Xi, S Yang, Y Zhao, C Xu, M Li, X Li, Y Lin, H Cai, J Zhang, D Li, J Chen, ... | arXiv preprint arXiv:2502.01776, 2025 | 80 | 2025 |
| Jetfire: Efficient and accurate transformer pretraining with INT8 data flow and per-block quantization | H Xi, Y Chen, K Zhao, KJ Teh, J Chen, J Zhu | arXiv preprint arXiv:2403.12422, 2024 | 34 | 2024 |
| Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation | S Yang*, H Xi*, Y Zhao, M Li, J Zhang, H Cai, Y Lin, X Li, C Xu, K Peng, ... | arXiv preprint arXiv:2505.18875, 2025 | 21* | 2025 |
| Radial Attention: Sparse Attention with Energy Decay for Long Video Generation | X Li, M Li, T Cai, H Xi, S Yang, Y Lin, L Zhang, S Yang, J Hu, K Peng, ... | arXiv preprint arXiv:2506.19852, 2025 | 19* | 2025 |
| COAT: Compressing optimizer states and activation for memory-efficient FP8 training | H Xi, H Cai, L Zhu, Y Lu, K Keutzer, J Chen, S Han | arXiv preprint arXiv:2410.19313, 2024 | 17 | 2024 |
| Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search | Y Gu, Q Hu, S Yang, H Xi, J Chen, S Han, H Cai | arXiv preprint arXiv:2508.15884, 2025 | 12 | 2025 |
| Oscillation-reduced MXFP4 training for vision transformers | Y Chen, H Xi, J Zhu, J Chen | arXiv preprint arXiv:2502.20853, 2025 | 9 | 2025 |
| QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache | R Tiwari*, H Xi*, A Tomar, C Hooper, S Kim, M Horton, M Najibi, ... | arXiv preprint arXiv:2502.10424, 2025 | 7* | 2025 |
| T-Rex: Text-assisted retrosynthesis prediction | Y Liu, H Xu, T Fang, H Xi, Z Liu, S Zhang, H Poon, S Wang | arXiv preprint arXiv:2401.14637, 2024 | 6 | 2024 |
| DC-VideoGen: Efficient video generation with deep compression video autoencoder | J Chen, W He, Y Gu, Y Zhao, J Yu, J Chen, D Zou, Y Lin, Z Zhang, M Li, ... | arXiv preprint arXiv:2509.25182, 2025 | 2 | 2025 |
| StreamDiffusionV2: A Streaming System for Dynamic and Interactive Video Generation | T Feng, Z Li, S Yang, H Xi, M Li, X Li, L Zhang, K Yang, K Peng, S Han, ... | arXiv preprint arXiv:2511.07399, 2025 | 1 | 2025 |
| DC-Gen: Post-training diffusion acceleration with deeply compressed latent space | W He, Y Gu, J Chen, D Zou, Y Lin, Z Zhang, H Xi, M Li, L Zhu, J Yu, ... | arXiv preprint arXiv:2509.25180, 2025 | 1 | 2025 |
| SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention | J Zhang, H Wang, K Jiang, S Yang, K Zheng, H Xi, Z Wang, H Zhu, ... | arXiv preprint arXiv:2509.24006, 2025 | 1 | 2025 |
| Arbitrage: Efficient Reasoning via Advantage-Aware Speculation | M Maheswaran, R Tiwari, Y Hu, K Dilmen, C Hooper, H Xi, N Lee, ... | arXiv preprint arXiv:2512.05033, 2025 | | 2025 |
| XQuant: Breaking the Memory Wall for LLM Inference with KV Cache Rematerialization | A Tomar, C Hooper, M Lee, H Xi, R Tiwari, W Kang, L Manolache, ... | arXiv preprint arXiv:2508.10395, 2025 | | 2025 |
| Efficient Attention Methods: Hardware-efficient, Sparse, Compact, and Linear Attention | J Zhang, R Su, C Liu, J Wei, Z Wang, H Wang, P Zhang, H Jiang, ... | | | |