| QERA: an Analytical Framework for Quantization Error Reconstruction. C Zhang, JTH Wong, C Xiao, GA Constantinides, Y Zhao. ICLR, 2025 | 6 | 2025 |
| A3: an Analytical Low-Rank Approximation Framework for Attention. JTH Wong, C Zhang, X Cao, P Gimenes, GA Constantinides, W Luk, ... arXiv preprint arXiv:2505.12942, 2025 | 3 | 2025 |
| Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference. H Wu, C Xiao, J Nie, X Guo, B Lou, JTH Wong, Z Mo, C Zhang, P Forys, ... arXiv preprint arXiv:2509.09505, 2025 | 1 | 2025 |
| On the Existence and Behaviour of Secondary Attention Sinks. JTH Wong, C Zhang, L Mahon, W Luk, A Isopoussu, Y Zhao. arXiv preprint arXiv:2512.22213, 2025 | | 2025 |
| ARIES: Autonomous Reasoning with LLMs on Interactive Thought Graph Environments. P Gimenes, Z Cao, J Wong, Y Zhao. arXiv preprint arXiv:2502.21208, 2025 | | 2025 |