Tian Han Granted $600K NSF CAREER Award to Explore Generative AI
Stevens assistant professor of computer science is developing more powerful models of machine learning
As big data and artificial intelligence (AI) become more ingrained in our society, one of the biggest challenges becomes ensuring we can get relevant, useful results from all that information. How can we teach machines to understand and work with complex data like images, text, audio and videos — and use that understanding to create accurate, reliable content that makes a difference in the world?
Enter generative AI, which leverages advanced learning models that use all that data to generate amazing content such as articulate text and realistic images and videos. The path to perfection in this area is still strewn with obstacles, but through a five-year, $599,729 grant from the National Science Foundation (NSF) CAREER Award program, Tian Han, assistant professor of computer science at Stevens, aims to overcome three of the most significant hurdles in those existing models.
“One of the biggest challenges is handling complex, high-dimensional data such as images, text and videos,” Han explained. “Even a small 100 × 100-pixel color image of a face has rich structural information, like the overall facial structure and details such as eyes, mouth, skin and hair. Language is organized in layers, from topics to paragraphs, sentences and words. Video data could be events or actions or poses. Existing generative AI models can be inefficient and ineffective, overlooking key structural information within that data.”
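A quick back-of-the-envelope calculation shows why even a "small" image counts as high-dimensional. The sketch below follows the 100 × 100-pixel example Han gives; the three-channel color assumption is standard for RGB images but is our illustration, not a detail from the project:

```python
# Illustrative only: raw dimensionality of a small color image.
# A standard RGB image has 3 channels (red, green, blue) per pixel.
height, width, channels = 100, 100, 3

# Each image is a single point in a space with this many dimensions.
image_dims = height * width * channels
print(image_dims)  # 30000
```

Thirty thousand dimensions for one tiny face photo — and a model must also capture the structure within those dimensions (eyes, mouth, skin, hair), which is exactly the information Han says existing models tend to overlook.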
Han further explained that users can’t easily control the output many generative AI models deliver, leading to biased or even inappropriate results that raise ethical and safety concerns. And when people train those models, they typically only consider and use one type of data, such as text data for a text generative model. It’s a limiting approach that restricts what the models can do.
With his CAREER grant, Han is developing more versatile models that recognize context clues, are easier to control, and are multimedia-friendly.
“First, we will work to extract useful information from high-dimensional data such as images or text to learn structured knowledge for smart generative AI models,” he said. “Then we will create controllable models that allow users to adjust both the models and the results, such as tweaking the intensity of a smile or adding a hat in a photo. Finally, we will design general, flexible frameworks that can simultaneously process data from different domains, such as words and images, to deliver consistent content such as image-captioning and even lip-reading. It’s all about helping AI learn more effectively to generate more useful results.”
The grant also includes funding to train undergraduate, graduate and Ph.D. students. Research topics and findings will further strengthen the AI curriculum at Stevens, helping the university continue to lead the future of generative AI.
Better AI, better data management, better results — better lives
Han has long been interested in how machines understand data, and he’s fascinated by the unknowns of how machines perceive the world and how they can learn to think and behave like humans to enrich our quality of life in a variety of ways.
“Generative AI has profound potential, and the success of this project could enable transformative capabilities such as image generation and manipulation, natural language processing, drug discovery and system security,” he said. “Additionally, it can gather information from different sources to fill in the gaps when data is scarce, which has exciting potential to find life-saving solutions in areas like medical diagnoses. It could even give robots a clearer structure to understand tasks and environments so they could plan and act autonomously.”