| AUGEM: automatically generate high performance dense linear algebra kernels on x86 CPUs Q Wang, X Zhang, Y Zhang, Q Yi Proceedings of the international conference on high performance computing …, 2013 | 313 | 2013 |
| POET: Parameterized optimizations for empirical tuning Q Yi, K Seymour, H You, R Vuduc, D Quinlan 2007 IEEE International Parallel and Distributed Processing Symposium, 1-8, 2007 | 168 | 2007 |
| Transforming loops to recursion for multi-level memory hierarchies Q Yi, V Adve, K Kennedy Proceedings of the ACM SIGPLAN 2000 conference on Programming language …, 2000 | 113 | 2000 |
| High Performance Fortran compilation techniques for parallelizing scientific codes V Adve, G Jin, J Mellor-Crummey, Q Yi SC'98: Proceedings of the 1998 ACM/IEEE Conference on Supercomputing, 11-11, 1998 | 111 | 1998 |
| POET: a scripting language for applying parameterized source-to-source program transformations Q Yi Software: Practice and Experience 42 (6), 675-706, 2012 | 80 | 2012 |
| Transforming complex loop nests for locality Q Yi, K Kennedy, V Adve The Journal of Supercomputing 27 (3), 219-264, 2004 | 76 | 2004 |
| Understanding stencil code performance on multicore architectures SMF Rahman, Q Yi, A Qasem Proceedings of the 8th ACM International Conference on Computing Frontiers, 1-10, 2011 | 67 | 2011 |
| Improving memory hierarchy performance through combined loop interchange and multi-level fusion Q Yi, K Kennedy The International Journal of High Performance Computing Applications 18 (2 …, 2004 | 62 | 2004 |
| Automated empirical tuning of scientific codes for performance and power consumption SF Rahman, J Guo, Q Yi Proceedings of the 6th International Conference on High Performance and …, 2011 | 57 | 2011 |
| Advanced optimization strategies in the Rice dHPF compiler J Mellor-Crummey, V Adve, B Broom, D Chavarría-Miranda, R Fowler, ... Concurrency and Computation: Practice and Experience 14 (8-9), 741-767, 2002 | 48 | 2002 |
| A highly parallel reuse distance analysis algorithm on GPUs H Cui, Q Yi, J Xue, L Wang, Y Yang, X Feng 2012 IEEE 26th International Parallel and Distributed Processing Symposium …, 2012 | 36 | 2012 |
| Semantic-driven parallelization of loops operating on user-defined containers D Quinlan, M Schordan, Q Yi, BR de Supinski International Workshop on Languages and Compilers for Parallel Computing …, 2003 | 30 | 2003 |
| Exploring the optimization space of dense linear algebra kernels Q Yi, A Qasem International Workshop on Languages and Compilers for Parallel Computing …, 2008 | 29 | 2008 |
| Effective use of non-blocking data structures in a deduplication application SD Feldman, A Bhat, P LaBorde, Q Yi, D Dechev Proceedings of the 2013 companion publication for conference on Systems …, 2013 | 28 | 2013 |
| Applying loop optimizations to object-oriented abstractions through general classification of array semantics Q Yi, D Quinlan International Workshop on Languages and Compilers for Parallel Computing …, 2004 | 28 | 2004 |
| Classification and utilization of abstractions for optimization D Quinlan, M Schordan, Q Yi, A Saebjornsen International Symposium On Leveraging Applications of Formal Methods …, 2004 | 26 | 2004 |
| Automatic detection of information leakage vulnerabilities in browser extensions R Zhao, C Yue, Q Yi Proceedings of the 24th International Conference on World Wide Web, 1384-1394, 2015 | 25 | 2015 |
| Automatic blocking of QR and LU factorizations for locality Q Yi, K Kennedy, H You, K Seymour, J Dongarra Proceedings of the 2004 workshop on Memory system performance, 12-22, 2004 | 25 | 2004 |
| Studying the impact of application-level optimizations on the power consumption of multi-core architectures SMF Rahman, J Guo, A Bhat, C Garcia, MH Sujon, Q Yi, C Liao, ... Proceedings of the 9th conference on Computing Frontiers, 123-132, 2012 | 24 | 2012 |
| Vectorization past dependent branches through speculation MH Sujon, RC Whaley, Q Yi Proceedings of the 22nd International Conference on Parallel Architectures …, 2013 | 22 | 2013 |