Zhiyang Deng, Ph.D. Candidate in Financial Engineering
Bio
Zhiyang Deng is a Ph.D. candidate in financial engineering at Stevens Institute of Technology. His research interests include Risk-averse Stochastic Control, Mean Field Games, and Deep Reinforcement Learning; his work also encompasses FinTech, with a particular focus on large language models (LLMs) for financial applications.
Skillset
He is an expert in mean field games, stochastic control and reinforcement learning, and quantitative finance, with advanced proficiency in Python, C++, and SQL. Zhiyang is experienced in developing quantitative models for financial markets, with a focus on optimization techniques.
Dissertation Summary
Risk-Sensitive Stochastic Differential Games and Learning Algorithms
Introduction
In the realm of stochastic optimization, the stochastic optimal control problem [Pha09] has emerged as a cornerstone of research, with wide-ranging applications in diverse fields such as engineering, economics, finance, operations research, and robotics. In many cases, the objective is to study a dynamic system driven by a stochastic differential equation (SDE) that can be controlled, and to determine the control strategy that optimizes some performance criterion (usually, either maximizing a reward functional or minimizing a cost functional).

The classical stochastic optimal control problem above involves only a single decision maker, called the agent. Then, "game theory is to deal with strategic interactions among multiple agents, who jointly try to optimize their own objective functional" [BZ18]. In most nontrivial cases, each agent cannot simply determine his or her own control strategy independently of the decisions of the other agents. This coupling between the agents' controls gives rise to a non-cooperative (or competitive) game. If, on the other hand, we assume that a central governor makes all agents collaborate so that controls are selected collectively to achieve a social optimum, we obtain cooperative game theory. As an extension of classical stochastic optimal control, my dissertation focuses on stochastic differential games.
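For concreteness, a representative finite-horizon formulation (a generic sketch in the spirit of [Pha09]; the notation below is illustrative rather than taken from the dissertation) controls a diffusion and maximizes an expected reward:

\[
dX_t = b(X_t, \alpha_t)\,dt + \sigma(X_t, \alpha_t)\,dW_t,
\qquad
V(x) = \sup_{\alpha \in \mathcal{A}} \; \mathbb{E}\!\left[ \int_0^T f(X_t, \alpha_t)\,dt + h(X_T) \,\middle|\, X_0 = x \right],
\]

where \(\alpha\) is the control process, \(W\) a Brownian motion, \(f\) the running reward, \(h\) the terminal reward, and \(\mathcal{A}\) the set of admissible controls. In a stochastic differential game, each agent controls its own state equation of this form, but its reward functional also depends on the states and controls of the other agents.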
Essay 1: Risk-Averse Mean Field Games with g-Expectations
My first essay investigates a risk-averse variant of continuous-time mean field games (MFGs) [LL07, HMC06, HCM07], where agents' risk preferences are modeled using time-consistent dynamic risk measures induced by g-expectations [CHMP02, Jia08]. By incorporating agents' risk aversion into the MFG framework, we provide insights into how risk affects agents' strategic decisions and the resulting Nash equilibria. In addition, we establish a theoretical connection between risk-averse MFGs and distributionally robust optimization [RM19], demonstrating their significance in formulating robust control strategies that can withstand environmental ambiguities. This study offers a comprehensive understanding of the interplay between risk aversion and strategic decision-making in MFGs, highlighting its importance for developing resilient controls in uncertain environments.
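As a brief sketch of the underlying machinery (a standard construction from the g-expectation literature, not a result of this essay): for a terminal position \(\xi\), the conditional g-expectation is defined as \(\mathcal{E}_g[\xi \mid \mathcal{F}_t] := Y_t\), where the pair \((Y, Z)\) solves the backward SDE

\[
-\,dY_t = g(t, Y_t, Z_t)\,dt - Z_t\,dW_t, \qquad Y_T = \xi.
\]

Under suitable conditions on the driver \(g\) (e.g. convexity and monotonicity in its arguments), the induced dynamic evaluation is time-consistent and yields a convex risk measure [CHMP02, Jia08]; this is the class of risk preferences assigned to the agents in the essay.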
Essay 2: McKean-Vlasov Type Principal-Agent Problem and Its Application to Contract Theory
My second essay explores a McKean-Vlasov type principal-agent problem with moral hazard, extending the model in the spirit of [CPT18]. In our extension, the principal's objective is formulated through a functional involving the joint distribution of the output process and the agent's optimal value process. By incorporating dynamic programming methods from mean-field type control problems [LP14], we demonstrate that the principal's problem can be solved via a coupled system of Hamilton-Jacobi-Bellman and Fokker-Planck equations. We also show that our approach can effectively address risk-averse principal-agent problems under distortion risk measures, such as conditional value at risk (CVaR), as well as entropic risk measures.
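To fix ideas on one of the risk measures mentioned above (a standard definition, not a contribution of the essay): for a loss \(X\) at confidence level \(\alpha \in (0,1)\), CVaR admits the variational representation

\[
\mathrm{CVaR}_\alpha(X) = \inf_{z \in \mathbb{R}} \left\{ z + \frac{1}{1-\alpha}\, \mathbb{E}\big[(X - z)^{+}\big] \right\},
\]

which is the form typically exploited when such a risk measure enters a dynamic optimization problem.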
Essay 3: Mean Field Multi-Agent Reinforcement Learning with Dynamic Risk Measures
My third essay introduces risk-sensitive mean field MARL, an approach designed to address the scalability limitations of traditional multi-agent reinforcement learning (MARL) methods [ZYB21]. As the number of agents increases, learning becomes computationally challenging due to the curse of dimensionality and the complexity of agent interactions. Our approach approximates interactions through a mean-field framework, in which individual agents learn optimal policies by interacting with the average effect of the population or of neighboring agents. We incorporate dynamic risk measures into the learning process [Rus10], allowing agents to balance expected rewards against risk. Our method employs a time-consistent dynamic programming principle to evaluate policies, using dynamic risk measures to assess random processes. We develop policy gradient update rules, along with an actor-critic algorithm that leverages neural networks to optimize risk-sensitive policies.
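As a toy illustration of the time-consistent evaluation step, the following minimal numpy sketch runs the nested risk-averse recursion of [Rus10] on a small tabular MDP. The mean-semideviation risk mapping, the two-state MDP, and all parameters below are illustrative choices, not the algorithm developed in the essay:

import numpy as np

# Toy illustration: risk-averse policy evaluation on a small tabular MDP
# via the nested (time-consistent) recursion of [Rus10],
#     V_t(s) = c(s) + rho_s( V_{t+1} ),
# where rho_s is a one-step mean-semideviation risk mapping applied to the
# distribution of successor-state values from state s.

def mean_semideviation(values, probs, kappa=0.5):
    """One-step coherent risk mapping: mean plus upper semideviation."""
    mean = float(np.dot(probs, values))
    semidev = float(np.dot(probs, np.maximum(values - mean, 0.0)))
    return mean + kappa * semidev

def risk_averse_evaluation(P, cost, horizon, kappa=0.5):
    """Backward recursion for the risk-averse value of a fixed policy.

    P[s, s'] : transition probabilities under the fixed policy
    cost[s]  : one-step cost incurred in state s
    """
    n_states = P.shape[0]
    V = np.zeros(n_states)  # terminal value V_T = 0
    for _ in range(horizon):
        V = np.array([cost[s] + mean_semideviation(V, P[s], kappa)
                      for s in range(n_states)])
    return V

# Two-state example: state 1 has a wider spread of successor values,
# so it is penalized more heavily as kappa grows.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
cost = np.array([1.0, 2.0])
print(risk_averse_evaluation(P, cost, horizon=10, kappa=0.5))

In the essay itself, this risk-averse evaluation step is combined with policy-gradient update rules and an actor-critic architecture, and the mean-field approximation replaces explicit interaction with every other agent.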
References
[BZ18] T. Başar and G. Zaccour. Handbook of Dynamic Game Theory. Springer International Publishing, 2018.
[CHMP02] François Coquet, Ying Hu, Jean Mémin, and Shige Peng. Filtration-consistent nonlinear expectations and related g-expectations. Probability Theory and Related Fields, 123(1):1–27, 2002.
[CPT18] Jakša Cvitanić, Dylan Possamaï, and Nizar Touzi. Dynamic programming approach to principal–agent problems. Finance and Stochastics, 22:1–37, 2018.
[HCM07] Minyi Huang, Peter Caines, and Roland Malhamé. Large-population cost-coupled LQG problems with nonuniform agents: Individual-mass behavior and decentralized ε-Nash equilibria. IEEE Transactions on Automatic Control, 52:1560–1571, 2007.
[HMC06] Minyi Huang, Roland Malhamé, and Peter Caines. Large population stochastic dynamic games: Closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle. Commun. Inf. Syst., 6, 2006.
[Jia08] Long Jiang. Convexity, translation invariance and subadditivity for g-expectations and related risk measures. 2008.
[LL07] Jean-Michel Lasry and Pierre-Louis Lions. Mean field games. Japanese Journal of Mathematics, 2:229–260, 2007.
[LP14] Mathieu Laurière and Olivier Pironneau. Dynamic programming for mean-field type control. Comptes Rendus Mathematique, 352(9):707–713, 2014.
[Pha09] Huyên Pham. Continuous-time Stochastic Control and Optimization with Financial Applications. Springer, 2009.
[RM19] Hamed Rahimian and Sanjay Mehrotra. Distributionally robust optimization: A review. arXiv preprint arXiv:1908.05659, 2019.
[Rus10] Andrzej Ruszczyński. Risk-averse dynamic programming for Markov decision processes. Mathematical Programming, 125:235–261, 2010.
[ZYB21] Kaiqing Zhang, Zhuoran Yang, and Tamer Başar. Multi-agent reinforcement learning: A selective overview of theories and algorithms. Handbook of Reinforcement Learning and Control, pages 321–384, 2021.
Academic Advisor
Zhenyu Cui