Xiaodong Yu

Assistant Professor

Charles V. Schaefer, Jr. School of Engineering and Science

Department of Computer Science

Gateway Center N411

(201) 216-5649

[email protected]

Education

PhD (2019) Virginia Tech (Computer Science)

Research

Parallel and Distributed Computing and Systems, Next-Generation AI Hardware, High-Performance MLSys (supporting LLM, GNN, and more), Communication and Privacy in Federated Learning

General Information

Xiaodong Yu has been an Assistant Professor in the Department of Computer Science at Stevens Institute of Technology since 2023, where he leads the Advanced Parallel and distributEd Computing and Systems (APECS) lab. Before joining Stevens, he was an Assistant Computer Scientist in the Mathematics and Computer Science (MCS) Division at Argonne National Laboratory from 2019 to 2023. He also held an appointment as a Scientist-at-Large with the Consortium for Advanced Science and Engineering (CASE) at the University of Chicago. He earned his Ph.D. in Computer Science from Virginia Tech in 2019. His research interests span parallel and distributed algorithms, systems, and architectures. His work has resulted in over 50 peer-reviewed publications in top-tier HPC venues such as HPDC, ICS, and SC.
Dr. Yu is the PI of an NSF CRII award and an Argonne Laboratory Directed Research and Development (LDRD) project. He also served as the technical lead on several U.S. Department of Energy (DOE) projects. He has supervised more than 10 Ph.D. and undergraduate research interns at Argonne and currently advises five Ph.D. students at Stevens. He has been actively serving on the organizing and technical program committees of leading conferences, including ICS, SC, and IPDPS, and is a review board member for IEEE Transactions on Parallel and Distributed Systems (TPDS).

Experience

Stevens Institute of Technology, Hoboken, NJ
Assistant Professor 2023 - Current

Argonne National Laboratory, Lemont, IL
Guest Faculty 2024 - Current
Assistant Computer Scientist 2019 - 2023

The University of Chicago Consortium for Advanced Science and Engineering, Chicago, IL
Scientist-at-Large 2022 - 2023

AMD, Austin, TX
Software Engineer (Intern) Summer 2017

Institutional Service

Research Computing Services Committee Member
CS Tenure-Track Faculty Search Committee Member

Professional Service

“High-Performance Computing for AI: Architecture, Systems, and Algorithms”, a special issue of Electronics (ISSN 2079-9292) Lead Guest Editor
Supercomputing Asia and International Conference on High Performance Computing in Asia Pacific Region (SCA/HPCAsia), 2026 Technical Program Committee Member
The Fifth International Workshop on Big Data Reduction (IWBDR-5) in conjunction with 2025 IEEE International Conference on Big Data (IEEE BigData) Program Co-Chair
IEEE Transactions on Parallel and Distributed Systems (TPDS) Review Board Member
The 11th International Workshop on Data Analysis and Reduction for Big Scientific Data (DRBSD-11) in conjunction with ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2025 Technical Program Committee Member
The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2025 Technical Program Committee Member
The Journal of Supercomputing - Springer Nature Reviewer
Journal of Parallel and Distributed Computing (JPDC) - Elsevier Reviewer
The 39th ACM International Conference on Supercomputing (ICS), 2025 Technical Program Committee Member
The 26th IEEE International Conference on High Performance Computing and Communications (HPCC), 2024 Technical Program Committee Member
The Fourth International Workshop on Big Data Reduction (IWBDR-4) in conjunction with 2023 IEEE International Conference on Big Data (IEEE BigData) Technical Program Committee Member
Future Generation Computer Systems (FGCS) - Elsevier Reviewer
The First Workshop on Software and Hardware Co-Design of Deep Learning Systems in Accelerators (SHDA 2023) in conjunction with ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC) Technical Program Committee Member
IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) Finance Chair

Appointments

Assistant Professor, Department of Computer Science, Stevens Institute of Technology

Professional Societies

ASEE – American Society for Engineering Education Member
ACM – Association for Computing Machinery Member
IEEE – Institute of Electrical and Electronics Engineers Member

Grants, Contracts and Funds

Sole PI: NSF, CRII: OAC: A Compressor-Assisted Collective Communication Framework for GPU-Based Large-Scale Deep Learning 2024 – 2026
Site PI: DOE/ANL, Scalable and Resilient Modeling for Federated-Learning-Based Complex Workflows 2024 – 2026
Sole PI: Stevens SIAI, Efficient Communication Payload Reductions for Federated Learning 2024
Lead PI: ANL LDRD, Scalability Study of AI-based Surrogate for Ptychographic Image Reconstruction on Graphcore 2022

Selected Publications

Conference Proceeding

Pan, Y.; Lin, H.; Ran, Y.; Chen, J.; Yu, X.; Zhao, W.; Zhang, D.; Xu, Z. (2025). ALinFiK: Learning to Approximate Linearized Future Influence Kernel for Scalable Third-Party LLM Data Valuation. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2025 - Volume 1: Long Papers, Albuquerque, New Mexico, USA, April 29 - May 4, 2025 (pp. 11756--11771). Association for Computational Linguistics.
https://doi.org/10.18653/v1/2025.naacl-long.589.
Sun, B.; Liu, W.; Pauloski, J. G.; Tian, J.; Jia, J.; Wang, D.; Zhang, B.; Zheng, M.; Di, S.; Jin, S.; Zhang, Z.; Yu, X.; Iskra, K. A.; Beckman, P.; Tan, G.; Tao, D. (2025). COMPSO: Optimizing Gradient Compression for Distributed Training with Second-Order Optimizers. Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, PPoPP 2025, Las Vegas, NV, USA, March 1-5, 2025 (pp. 212--224). ACM.
https://doi.org/10.1145/3710848.3710852.
Guo, W.; Long, J.; Zeng, Y.; Liu, Z.; Yang, X.; Ran, Y.; Gardner, J. R.; Bastani, O.; De Sa, C.; Yu, X.; Chen, B.; Xu, Z. (2025). Zeroth-Order Fine-Tuning of LLMs with Transferable Static Sparsity. The Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28, 2025. OpenReview.net.
https://openreview.net/forum?id=myYzr50xBh.
Yuan, L.; Ahmad, A.; Yan, D.; Han, J.; Adhikari, S.; Yu, X.; Zhou, Y. (2024). G²-AIMD: A Memory-Efficient Subgraph-Centric Framework for Efficient Subgraph Finding on GPUs. 2024 IEEE 40th International Conference on Data Engineering (ICDE) (pp. 3164--3177). IEEE.
Huang, J.; Di, S.; Yu, X.; Zhai, Y.; Liu, J.; Jian, Z.; Liang, X.; Zhao, K.; Lu, X.; Chen, Z.; Cappello, F.; Guo, Y.; Thakur, R. (2024). hZCCL: Accelerating Collective Communication with Co-Designed Homomorphic Compression. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2024, Atlanta, GA, USA, November 17-22, 2024 (pp. 104). IEEE.
https://dl.acm.org/doi/10.1109/SC41406.2024.00110.
Shah, M.; Yu, X.; Di, S.; Becchi, M.; Cappello, F. (2024). A Portable, Fast, DCT-based Compressor for AI Accelerators. Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2024, Pisa, Italy, June 3-7, 2024 (pp. 109--121). ACM.
https://doi.org/10.1145/3625549.3658662.
Xie, Z.; Emani, M.; Yu, X.; Tao, D.; He, X.; Su, P.; Zhou, K.; Vishwanath, V.; Bagchi, S.; Zhang, Y. (2024). Centimani: Enabling Fast AI Accelerator Selection for DNN Training with a Novel Performance Predictor. Proceedings of the 2024 USENIX Annual Technical Conference, USENIX ATC 2024, Santa Clara, CA, USA, July 10-12, 2024 (pp. 1203--1221). USENIX Association.
https://www.usenix.org/conference/atc24/presentation/xie.
Song, S.; Huang, Y.; Jiang, P.; Yu, X.; Zheng, W.; Di, S.; Cao, Q.; Feng, Y.; Xie, Z.; Cappello, F. (2024). CereSZ: Enabling and Scaling Error-bounded Lossy Compression on Cerebras CS-2. Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2024, Pisa, Italy, June 3-7, 2024 (pp. 309--321). ACM.
https://doi.org/10.1145/3625549.3658691.
Huang, J.; Di, S.; Yu, X.; Zhai, Y.; Liu, J.; Huang, Y.; Raffenetti, K.; Zhou, H.; Zhao, K.; Lu, X.; Chen, Z.; Cappello, F.; Guo, Y.; Thakur, R. (2024). gZCCL: Compression-Accelerated Collective Communication Framework for GPU Clusters. Proceedings of the 38th ACM International Conference on Supercomputing, ICS 2024, Kyoto, Japan, June 4-7, 2024 (pp. 437--448). ACM.
https://doi.org/10.1145/3650200.3656636.
Huang, J.; Di, S.; Yu, X.; Zhai, Y.; Zhang, Z.; Liu, J.; Lu, X.; Raffenetti, K.; Zhou, H.; Zhao, K.; Chen, Z.; Cappello, F.; Guo, Y.; Thakur, R. (2024). An Optimized Error-controlled MPI Collective Framework Integrated with Lossy Compression. IEEE International Parallel and Distributed Processing Symposium, IPDPS 2024, San Francisco, CA, USA, May 27-31, 2024 (pp. 752--764). IEEE.
https://doi.org/10.1109/IPDPS57955.2024.00072.
Zhang, C.; Sun, B.; Yu, X.; Xie, Z.; Zheng, W.; Iskra, K. A.; Beckman, P.; Tao, D. (2023). Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, SC-W 2023, Denver, CO, USA, November 12-17, 2023 (pp. 1757--1766). ACM.
https://doi.org/10.1145/3624062.3624257.
Huang, Y.; Di, S.; Yu, X.; Li, G.; Cappello, F. (2023). cuSZp: An Ultra-fast GPU Error-bounded Lossy Compression Framework with Optimized End-to-End Performance. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2023, Denver, CO, USA, November 12-17, 2023 (pp. 43:1--43:13). ACM.
https://doi.org/10.1145/3581784.3607048.
Zhang, B.; Tian, J.; Di, S.; Yu, X.; Feng, Y.; Liang, X.; Tao, D.; Cappello, F. (2023). FZ-GPU: A Fast and High-Ratio Lossy Compressor for Scientific Computing Applications on GPUs. Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2023, Orlando, FL, USA, June 16-23, 2023 (pp. 129--142). ACM.
https://doi.org/10.1145/3588195.3592994.
Shah, M.; Yu, X.; Di, S.; Lykov, D.; Alexeev, Y.; Becchi, M.; Cappello, F. (2023). GPU-Accelerated Error-Bounded Compression Framework for Quantum Circuit Simulations. IEEE International Parallel and Distributed Processing Symposium, IPDPS 2023, St. Petersburg, FL, USA, May 15-19, 2023 (pp. 757--767). IEEE.
https://doi.org/10.1109/IPDPS54959.2023.00081.
Zhang, B.; Tian, J.; Di, S.; Yu, X.; Swany, M.; Tao, D.; Cappello, F. (2023). GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs. Proceedings of the 37th International Conference on Supercomputing, ICS 2023, Orlando, FL, USA, June 21-23, 2023 (pp. 348--359). ACM.
https://doi.org/10.1145/3577193.3593706.
Zhang, C.; Smith, S.; Sun, B.; Tian, J.; Soifer, J.; Yu, X.; Song, S. L.; He, Y.; Tao, D. (2023). HEAT: A Highly Efficient and Affordable Training System for Collaborative Filtering Based Recommendation on CPUs. Proceedings of the 37th International Conference on Supercomputing, ICS 2023, Orlando, FL, USA, June 21-23, 2023 (pp. 324--335). ACM.
https://doi.org/10.1145/3577193.3593717.
Shah, M.; Yu, X.; Di, S.; Becchi, M.; Cappello, F. (2023). Lightweight Huffman Coding for Efficient GPU Compression. Proceedings of the 37th International Conference on Supercomputing, ICS 2023, Orlando, FL, USA, June 21-23, 2023 (pp. 99--110). ACM.
https://doi.org/10.1145/3577193.3593736.
Rivera, C.; Di, S.; Tian, J.; Yu, X.; Tao, D.; Cappello, F. (2022). Optimizing Huffman Decoding for Error-Bounded Lossy Compression on GPUs. 2022 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2022, Lyon, France, May 30 - June 3, 2022 (pp. 717--727). IEEE.
https://doi.org/10.1109/IPDPS53621.2022.00075.
Yu, X.; Di, S.; Zhao, K.; Tian, J.; Tao, D.; Liang, X.; Cappello, F. (2022). Ultrafast Error-bounded Lossy Compression for Scientific Datasets. HPDC '22: The 31st International Symposium on High-Performance Parallel and Distributed Computing, Minneapolis, MN, USA, 27 June 2022 - 1 July 2022 (pp. 159--171). ACM.
https://doi.org/10.1145/3502181.3531473.
Yu, X.; Di, S.; Gok, A. M.; Tao, D.; Cappello, F. (2021). cuZ-Checker: A GPU-Based Ultra-Fast Assessment System for Lossy Compressions. IEEE International Conference on Cluster Computing, CLUSTER 2021, Portland, OR, USA, September 7-10, 2021 (pp. 307--319). IEEE.
https://doi.org/10.1109/Cluster48925.2021.00065.
Bicer, T.; Yu, X.; Ching, D. J.; Chard, R.; Cherukara, M. J.; Nicolae, B.; Kettimuthu, R.; Foster, I. T. (2021). High-Performance Ptychographic Reconstruction with Federated Facilities. Driving Scientific and Engineering Discoveries Through the Integration of Experiment, Big Data, and Modeling and Simulation - 21st Smoky Mountains Computational Sciences and Engineering, SMC 2021, Virtual Event, October 18-20, 2021, Revised Selected Papers (vol. 1512, pp. 173--189). Springer.
https://doi.org/10.1007/978-3-030-96498-6_10.
Tian, J.; Di, S.; Yu, X.; Rivera, C.; Zhao, K.; Jin, S.; Feng, Y.; Liang, X.; Tao, D.; Cappello, F. (2021). Optimizing Error-Bounded Lossy Compression for Scientific Data on GPUs. IEEE International Conference on Cluster Computing, CLUSTER 2021, Portland, OR, USA, September 7-10, 2021 (pp. 283--293). IEEE.
https://doi.org/10.1109/Cluster48925.2021.00047.
Yu, X.; Bicer, T.; Kettimuthu, R.; Foster, I. T. (2021). Topology-aware optimizations for multi-GPU ptychographic image reconstruction. ICS '21: 2021 International Conference on Supercomputing, Virtual Event, USA, June 14-17, 2021 (pp. 354--366). ACM.
https://doi.org/10.1145/3447818.3460380.
Yu, X.; Wei, F.; Ou, X.; Becchi, M.; Bicer, T.; Yao, D. D. (2020). GPU-Based Static Data-Flow Analysis for Fast and Scalable Android App Vetting. 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), New Orleans, LA, USA, May 18-22, 2020 (pp. 274--284). IEEE.
https://doi.org/10.1109/IPDPS47924.2020.00037.
Yu, X.; Xiao, Y.; Cameron, K. W.; Yao, D. D. (2019). Comparative Measurement of Cache Configurations' Impacts on Cache Timing Side-Channel Attacks. 12th USENIX Workshop on Cyber Security Experimentation and Test, CSET 2019, Santa Clara, CA, USA, August 12, 2019. USENIX Association.
https://www.usenix.org/conference/cset19/presentation/yu.
Yu, X.; Wang, H.; Feng, W.; Gong, H.; Cao, G. (2017). An Enhanced Image Reconstruction Tool for Computed Tomography on CPUs. Proceedings of the Computing Frontiers Conference, CF'17, Siena, Italy, May 15-17, 2017 (pp. 97--106). ACM.
https://doi.org/10.1145/3075564.3078889.
Nourian, M.; Wang, X.; Yu, X.; Feng, W.; Becchi, M. (2017). Demystifying automata processing: GPUs, FPGAs or Micron's AP?. Proceedings of the International Conference on Supercomputing, ICS 2017, Chicago, IL, USA, June 14-16, 2017 (pp. 1:1--1:11). ACM.
https://doi.org/10.1145/3079079.3079100.
Yu, X.; Hou, K.; Wang, H.; Feng, W. (2017). Robotomata: A framework for approximate pattern matching of big data on an automata processor. 2017 IEEE International Conference on Big Data (IEEE BigData 2017), Boston, MA, USA, December 11-14, 2017 (pp. 283--292). IEEE Computer Society.
https://doi.org/10.1109/BigData.2017.8257936.
Yu, X.; Wang, H.; Feng, W.; Gong, H.; Cao, G. (2016). cuART: Fine-Grained Algebraic Reconstruction Technique for Computed Tomography Images on GPUs. IEEE/ACM 16th International Symposium on Cluster, Cloud and Grid Computing, CCGrid 2016, Cartagena, Colombia, May 16-19, 2016 (pp. 165--168). IEEE Computer Society.
https://doi.org/10.1109/CCGrid.2016.96.
Yu, X.; Feng, W.; Yao, D. D.; Becchi, M. (2016). O3FA: A Scalable Finite Automata-based Pattern-Matching Engine for Out-of-Order Deep Packet Inspection. Proceedings of the 2016 Symposium on Architectures for Networking and Communications Systems, ANCS 2016, Santa Clara, CA, USA, March 17-18, 2016 (pp. 1--11). ACM.
https://doi.org/10.1145/2881025.2881034.

Journal Article

Di, S.; Liu, J.; Zhao, K.; Liang, X.; Underwood, R.; Zhang, Z.; Shah, M.; Huang, Y.; Huang, J.; Yu, X.; Ren, C.; Guo, H.; Wilkins, G.; Tao, D.; Tian, J.; Jin, S.; Jian, Z.; Wang, D.; Rahman, M. H.; Zhang, B.; Song, S.; Calhoun, J.; Li, G.; Yoshii, K.; Alharthi, K. A.; Cappello, F. (2025). A Survey on Error-Bounded Lossy Compression for Scientific Datasets. ACM Comput. Surv. (11 ed., vol. 57, pp. 287:1--287:38).
https://doi.org/10.1145/3733104.
Zhang, C.; Ding, X.; Sun, B.; Yu, X.; Zheng, W.; Xie, Z.; Tao, D. (2024). GFormer: Accelerating Large Language Models with Optimized Transformers on Gaudi Processors. CoRR (vol. abs/2412.19829).
https://doi.org/10.48550/arXiv.2412.19829.
Yu, X.; Nikitin, V.; Ching, D. J.; Aslan, S.; Gursoy, D.; Bicer, T. (2022). Scalable and accurate multi-GPU-based image reconstruction of large-scale ptychography data. Scientific Reports (1 ed., vol. 12, pp. 5334). Nature Publishing Group UK London.
Sun, B.; Yu, X.; Zhang, C.; Tian, J.; Jin, S.; Iskra, K.; Zhou, T.; Bicer, T.; Beckman, P.; Tao, D. (2022). SOLAR: A Highly Optimized Data Loading Framework for Distributed Training of CNN-based Scientific Surrogates. CoRR (vol. abs/2211.00224).
https://doi.org/10.48550/arXiv.2211.00224.
Yu, X.; Wang, H.; Feng, W.; Gong, H.; Cao, G. (2019). GPU-Based Iterative Medical CT Image Reconstructions. J. Signal Process. Syst. (3-4 ed., vol. 91, pp. 321--338).
https://doi.org/10.1007/s11265-018-1352-0.
Yu, X.; Lin, B.; Becchi, M. (2014). Revisiting State Blow-Up: Automatically Building Augmented-FA While Preserving Functional Equivalence. IEEE J. Sel. Areas Commun. (10 ed., vol. 32, pp. 1822--1833).
https://doi.org/10.1109/JSAC.2014.2358840.

Courses

CS 382: Computer Architecture and Organization
CS/CpE 550: Computer Organization and Programming
CS 810: Special Topics in CS: Modern Parallel and Distributed Computing on Cluster and Super-Computers
