Applying Large Language Models to Financial Decision-Making
Team of Stevens Ph.D. students is part of a worldwide collaboration exploring the use of AI in helping financial managers be more strategic
From customer service chatbots to posts on social media, the rise of large language models (LLMs) has transformed countless aspects of daily life. Now, with the help of research by a group of Stevens Institute of Technology doctoral students, these powerful systems are being tested in the deeply human-centered domain of financial decision-making.
What was once the exclusive purview of seasoned analysts and portfolio managers is now augmented by the lightning-fast processing and pattern recognition capabilities of these AI technologies.
Partnering with an interdisciplinary research team from across the globe, Ph.D. candidates Yangyang Yu (data science), Haohang Li (data science) and Yupeng Cao (electronic and computer engineering) have focused on the development of LLM agents and their financial applications. At Stevens, the team was advised by Dr. Jordan Suchow, an assistant professor in the School of Business, who helped oversee the project and provided valuable resources and suggestions.
“We are a team of researchers that has a very mixed background,” Yu said. “Haohang and I were engineering students, but because the School of Business offers a data science program, we have some connection. This is very collaborative research that tries to build a bridge from computer science to traditional financial engineering applications. Together, we have been able to fuse these ideas.”
The group’s research has resulted in three distinct projects so far.
Project No. 1: FinMem
The first project, FinMem, involved designing an LLM agent with a memory system that mimics human cognition. The agent reads market information and attempts to pull out key insights while ignoring unimportant data through a memory feedback loop.
According to the paper, “This framework enables the agent to self-evolve its professional knowledge, react agilely to new investment cues and continuously refine trading decisions in the volatile financial environment.”
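To make the idea of a memory feedback loop concrete, here is a minimal sketch in Python. It is not FinMem's actual design: the class name ToyMemory, the importance scores, the exponential decay rule and the forgetting threshold are all assumptions chosen for illustration. It shows one simple way a store could keep important items, let unimportant ones fade, and adjust importance based on feedback.

```python
from dataclasses import dataclass


@dataclass
class MemoryItem:
    text: str
    importance: float  # hypothetical score in [0, 1]
    age_days: int = 0


class ToyMemory:
    """Illustrative memory store; an assumption, not FinMem's real implementation."""

    def __init__(self, decay: float = 0.9, forget_below: float = 0.05):
        self.decay = decay                # per-day multiplier applied to importance
        self.forget_below = forget_below  # items scoring below this are dropped
        self.items: list[MemoryItem] = []

    def add(self, text: str, importance: float) -> None:
        self.items.append(MemoryItem(text, importance))

    def score(self, item: MemoryItem) -> float:
        # Older items count for less: importance * decay^age.
        return item.importance * self.decay ** item.age_days

    def step_day(self) -> None:
        # Age every item by one day, then forget the ones that have faded.
        for item in self.items:
            item.age_days += 1
        self.items = [i for i in self.items if self.score(i) >= self.forget_below]

    def top(self, k: int) -> list[MemoryItem]:
        # The k highest-scoring items, best first.
        return sorted(self.items, key=self.score, reverse=True)[:k]

    def feedback(self, text: str, reward: float) -> None:
        # Raise or lower the importance of a matching item, clamped to [0, 1].
        for item in self.items:
            if item.text == text:
                item.importance = min(1.0, max(0.0, item.importance + reward))
```

In this sketch, an item that contributed to a good decision would receive a positive reward through feedback() and so survive longer, while low-importance items drop out after a few calls to step_day().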
The project was accepted for presentation at the International Conference on Learning Representations LLM Agent Workshop and is currently under review by the Institute of Electrical and Electronics Engineers (IEEE) Transactions on Big Data.
Project No. 2: FinCon
Building on FinMem, the group’s second project, FinCon, implements a “multi-agent framework.” The various machine agents aggregate information from sources such as news, industry reports and earnings calls, and deliver it to a central agent, which then makes trading decisions. The central agent also gives feedback to the other agents to improve their performance. According to Li, “We found this framework to be more data-efficient than traditional reinforcement learning agents in trading tasks, and it demonstrated state-of-the-art performance compared to other LLM-based trading agent frameworks.”
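The structure described above, several source-specific agents reporting to a central decision-maker that sends feedback back, can be sketched in a few lines of Python. This is a toy under stated assumptions, not FinCon's implementation: the Analyst and Manager classes, the keyword-based signals (standing in for LLM calls), the weighting scheme and the 0.25 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Analyst:
    name: str
    read: Callable[[str], float]  # maps raw text to a signal in [-1, 1]
    weight: float = 1.0


def keyword_signal(positive: str, negative: str) -> Callable[[str], float]:
    """Stand-in for an LLM call: +1 if the positive phrase appears, -1 if the negative one does."""
    def read(text: str) -> float:
        lowered = text.lower()
        return float(positive in lowered) - float(negative in lowered)
    return read


class Manager:
    """Hypothetical central agent that combines analyst signals into one action."""

    def __init__(self, analysts: list[Analyst], threshold: float = 0.25):
        self.analysts = analysts
        self.threshold = threshold

    def decide(self, documents: dict[str, str]) -> tuple[str, dict[str, float]]:
        # Each analyst reads only the document filed under its own name.
        signals = {a.name: a.read(documents.get(a.name, "")) for a in self.analysts}
        total_weight = sum(a.weight for a in self.analysts)
        view = sum(a.weight * signals[a.name] for a in self.analysts) / total_weight
        if view > self.threshold:
            action = "buy"
        elif view < -self.threshold:
            action = "sell"
        else:
            action = "hold"
        return action, signals

    def give_feedback(self, signals: dict[str, float], realized_return: float, lr: float = 0.5) -> None:
        # Analysts whose signal matched the sign of the realized return gain weight;
        # those who pointed the wrong way lose weight; neutral signals are unchanged.
        for a in self.analysts:
            agreement = signals[a.name] * realized_return
            if agreement > 0:
                a.weight *= 1 + lr
            elif agreement < 0:
                a.weight *= 1 - lr
```

With one analyst reading news and another reading earnings calls, conflicting signals initially cancel out to a "hold." After give_feedback() rewards whichever analyst was right, the same inputs can tip the manager toward "buy" or "sell."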
The work has been accepted for presentation at the NeurIPS 2024 main conference.
“One of the reasons this was accepted is that we are the first to use language agents to conduct quite complex financial decision-making tasks, such as a portfolio management,” Li said. “When we started, we didn’t know which kind of collaborative agent structure actually applied to this case because no one has done this kind of work, or at least very little had been published. We worked together to do a lot of explanation. We had multiple trials. Over the course of seven months, there were some failures, but eventually, we figured it out.”
A risk-control component in the system helps improve the quality of financial decisions. This component occasionally initiates a self-evaluation process to update the system’s underlying investment beliefs. These updated beliefs are then used to reinforce the future behavior of the AI agent, and the relevant beliefs are shared with the appropriate parts of the system that need that knowledge. This feature notably enhances the system’s performance while also reducing unnecessary communication between different components of the system.
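As a rough illustration of that mechanism, the Python sketch below triggers a belief revision when a risk limit is breached and forwards only the changed beliefs to the agents that care about them. The article does not say what triggers the self-evaluation, so the drawdown limit used here is an assumption, as are the function names, the topic-based subscriptions and the revise callback (which stands in for the system's own self-evaluation step).

```python
from typing import Callable

Beliefs = dict[str, str]  # topic -> current investment belief (hypothetical format)


def max_drawdown(equity: list[float]) -> float:
    """Largest peak-to-trough loss, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def route_beliefs(changed: Beliefs, subscriptions: dict[str, set[str]]) -> dict[str, Beliefs]:
    """Send each agent only the changed beliefs on topics it subscribes to."""
    messages: dict[str, Beliefs] = {}
    for agent, topics in subscriptions.items():
        relevant = {t: b for t, b in changed.items() if t in topics}
        if relevant:  # agents with nothing relevant receive no message at all
            messages[agent] = relevant
    return messages


def risk_check(
    equity: list[float],
    beliefs: Beliefs,
    revise: Callable[[Beliefs], Beliefs],
    subscriptions: dict[str, set[str]],
    limit: float = 0.10,
) -> tuple[Beliefs, dict[str, Beliefs]]:
    """Revise beliefs only when drawdown exceeds the limit; return beliefs and outgoing messages."""
    if max_drawdown(equity) <= limit:
        return beliefs, {}
    updated = revise(beliefs)
    changed = {t: b for t, b in updated.items() if beliefs.get(t) != b}
    return updated, route_beliefs(changed, subscriptions)
```

Because only changed beliefs are routed, and only to subscribed agents, a revision that affects one topic produces a single message rather than a broadcast to every component.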
Project No. 3: FinBen
The third project to come out of the research is FinBen, the first extensive open-source evaluation benchmark for language-based financial tasks. It includes 36 datasets covering 24 financial tasks in seven key areas: information extraction (IE), textual analysis, question answering (QA), text generation, risk management, forecasting and decision-making.
The results of testing popular AI models like GPT-4, ChatGPT and Gemini revealed that while these systems were good at straightforward tasks like extracting specific information from financial documents and analyzing written content, they still had issues with more complex tasks like making market predictions and generating comprehensive reports. In a recent AI competition, several teams developed solutions with FinBen that outperformed GPT-4, showing the promising potential of AI in finance.
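At its simplest, a benchmark of this kind pairs each task with example data and a scoring metric, then runs every model through the same loop. The Python sketch below shows that shape only. It is not FinBen's code or data: the two toy tasks, their examples, the exact-match metric and the evaluate function are all hypothetical.

```python
from typing import Callable

Model = Callable[[str], str]  # any function that maps a prompt to an answer


def exact_match(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions equal to the reference, ignoring case and outer whitespace."""
    hits = sum(p.strip().lower() == r.strip().lower() for p, r in zip(predictions, references))
    return hits / len(references)


# Made-up tasks for illustration; a real benchmark would load curated datasets.
TASKS = {
    "sentiment": {
        "examples": [
            ("Shares surged after earnings.", "positive"),
            ("The firm missed revenue targets.", "negative"),
        ],
        "metric": exact_match,
    },
    "entity_extraction": {
        "examples": [("Acme Corp acquired Beta LLC. Who is the acquirer?", "Acme Corp")],
        "metric": exact_match,
    },
}


def evaluate(model: Model, tasks: dict) -> dict[str, float]:
    """Run one model over every task and return a score per task."""
    report = {}
    for name, task in tasks.items():
        prompts, references = zip(*task["examples"])
        predictions = [model(p) for p in prompts]
        report[name] = task["metric"](predictions, list(references))
    return report
```

Passing several models through evaluate() with the same TASKS table yields per-task scores that can be compared side by side, which is the kind of comparison a shared benchmark is meant to enable.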
“The FinBen project has developed a massive benchmark system,” Li said. “One of the important things for developing an artificial intelligence system is that it’s not only about the model. The second issue is how you can measure it so people understand its performance. How does the model behave, and how does it compare with others in this environment? The benchmark system we’re creating is the work of a huge research team from Stevens, Yale and the University of Manchester. I would say the most important contribution is that we provide a comprehensive benchmark for all people to facilitate the development of a large language model in the finance area.”
Like FinCon, the FinBen project has been accepted for presentation at the NeurIPS 2024 main conference.
The Next Steps
The research team plans to keep building on this work and develop an end-to-end framework that can handle a broader range of financial decision-making functions beyond the current focus on trading and optimization. Future plans include addressing tasks like credit ratings and insurance underwriting.
There has already been strong interest from industry in applying the team’s single-agent trading system to live investment strategies. Firms have been in contact to explore using the system for their own trading activities. Beyond trading, the team’s expertise is also being sought out for broader contributions to the financial AI and mathematical finance community, including invitations to give talks and collaborate on book chapters. These partnerships will allow the team to leverage significant computational resources while also ensuring its work aligns with real-world industry needs and standards.
“Before I was invited to join the team, we were doing this research and creating solutions individually,” Cao said. “It’s important to share specific knowledge in one area with the different domains. I'm a natural language processing guy, and language models and chat are very popular in society, but I had never thought about how to combine the large language model with cognitive science to make something happen in the financial domain. I never imagined that before, so it’s triggered ideas about future research studies. How can I better collaborate with different domains and combine the different knowledge to make things happen?”