Research Seminar: LLM360: Towards Fully Transparent Open-Source LLMs

Department of Electrical and Computer Engineering

Location: Burchard, Room 714

Speaker: Dr. Hongyi Wang, GenBio.ai

ABSTRACT

The recent surge in "open-weight" Large Language Models (LLMs), such as LLaMA, Falcon, and Mistral, provides diverse options for AI practitioners and researchers. However, most of these models are released with only partial artifacts, such as final model weights or inference code, and their technical reports increasingly limit their scope to high-level design choices and surface statistics. These choices hinder progress in the field by reducing transparency into LLM training and forcing teams to rediscover many details of the training process. In this talk, I will introduce LLM360, an initiative to fully open-source LLMs by making all training code and data, model checkpoints, and intermediate results available to the community. The goal of LLM360 is to support open and collaborative AI research by making the end-to-end LLM training process transparent and reproducible for everyone. I will present our fully open and transparent LLM pre-training dataset, TxT360, as well as our LLMs pre-trained from scratch, which achieve leading benchmark performance.

BIOGRAPHY

Dr. Hongyi Wang is currently the Head of Infrastructure at GenBio.ai, a startup dedicated to using foundation models to solve fundamental life science problems. Prior to that, he spent two years as a postdoctoral fellow at CMU, working with Prof. Eric Xing. He will join the CS department at Rutgers University in Fall 2025 as a tenure-track Assistant Professor. His research focuses on large-scale machine-learning algorithms and systems. He obtained his Ph.D. from the Department of Computer Sciences at the University of Wisconsin-Madison. Dr. Wang received the Rising Stars Award from the Conference on Parsimony and Learning in 2024, was runner-up for the NAACL 2024 Best Demo Award, and received the Baidu Best Paper Award at the SpicyFL workshop at NeurIPS 2020.