Publications: David Mguni

Yan X, Song Y, Cui X, Christianos F, Zhang H, Wang J, Mguni D (2025). A Bilevel Reinforcement Learning Framework with Language Prior Knowledge. Machine Learning and Knowledge Discovery in Databases. Research Track, vol. 16018, Springer Nature.

https://qmro.qmul.ac.uk/xmlui/handle/123456789/125129

Schäfer L, Slumbers O, McAleer S, Du Y, Albrecht SV, Mguni D (2025). Ensemble Value Functions for Efficient Exploration in Multi-Agent Reinforcement Learning. Autonomous agents and multiagent systems. Conference: Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 1, pp. 1849-1857.

10.65109/rioa5189

https://qmro.qmul.ac.uk/xmlui/handle/123456789/125125

Jafferjee T, Ziomek J, Yang T, Dai Z, Wang J, Taylor ME, Shao K, Wang J et al. (2025). Taming Multi-Agent Reinforcement Learning with Estimator Variance Reduction. Autonomous agents and multiagent systems. Conference: Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 1, pp. 1042-1050.

10.65109/igaq8141

https://qmro.qmul.ac.uk/xmlui/handle/123456789/125124

Dinh LC, Mguni D, Tran-Thanh L, Yang Y (2024). A Summary of Online Markov Decision Processes with Non-oblivious Strategic Adversary. Conference: International Conference on Autonomous Agents and Multiagent Systems.

https://qmro.qmul.ac.uk/xmlui/handle/123456789/100865

Li H, Huang W, Mguni D, Shao K (2023). A survey on algorithms for Nash equilibria in finite normal-form games. Computer Science Review.

https://qmro.qmul.ac.uk/xmlui/handle/123456789/100169

Feng X, Luo Y, Wang Z, Yang M, Du Y (2023). ChessGPT: Bridging Policy Learning and Language Modeling. Conference: Conference on Neural Information Processing Systems.

https://qmro.qmul.ac.uk/xmlui/handle/123456789/100866

Mguni D, Jafferjee T, Wang J, Perez-Nieves N, Song W, Taylor M, Yang T, Zhu J (2023). Learning to Shape Rewards using a Game of Two Partners. Conference: Association for the Advancement of Artificial Intelligence.

https://qmro.qmul.ac.uk/xmlui/handle/123456789/100867

Slumbers O, Mguni D, Blumberg S, McAleer S, Wang J (2023). A game-theoretic framework for managing risk in multi-agent systems. Conference: International Conference on Machine Learning.

https://qmro.qmul.ac.uk/xmlui/handle/123456789/100869

Mguni D, Chen H, Jafferjee T, Wang J, Yue L, McAleer S (2023). MANSA: Learning Fast and Slow in Multi-Agent Systems. Conference: International Conference on Machine Learning.

https://qmro.qmul.ac.uk/xmlui/handle/123456789/100868

Mguni D, Sootla A, Ziomek J, Slumbers O, Dai Z, Shao K (2023). Timing is Everything: Learning to Act Selectively with Costly Actions and Budgetary Constraints. Conference: International Conference on Learning Representations.

https://qmro.qmul.ac.uk/xmlui/handle/123456789/100870

Dinh LC, Mguni D, Tran-Thanh L (2023). Online Markov Decision Processes with Non-oblivious Strategic Adversary. Autonomous Agents and Multi-Agent Systems.

https://qmro.qmul.ac.uk/xmlui/handle/123456789/100170

Mguni D, Deng X, Li N, Mguni D (2022). On the complexity of computing Markov perfect equilibrium in general-sum stochastic games. National Science Review.

https://qmro.qmul.ac.uk/xmlui/handle/123456789/100167

Dinh LC, McAleer S, Tian Z, Slumbers O, Mguni D, Wang J (2022). Online double oracle. Transactions on Machine Learning Research.

https://qmro.qmul.ac.uk/xmlui/handle/123456789/100864

Dai Z, Zhou T, Shao K, Mguni D, Wang B (2022). Socially-Attentive Policy Optimization in Multi-Agent Self-Driving System. Conference: Conference on Robot Learning.

https://qmro.qmul.ac.uk/xmlui/handle/123456789/100878

Mguni D, Chen Y, Deng X, Wang J (2022). On the Convergence of Fictitious Play: A Decomposition Approach. Conference: International Joint Conference on Artificial Intelligence.

https://qmro.qmul.ac.uk/xmlui/handle/123456789/100841

Mguni D, Sootla A, Cowen-Rivers A, Jafferjee T, Wang Z (2022). SAUTE RL: Almost Surely Safe Reinforcement Learning Using State Augmentation. Conference: International Conference on Machine Learning.

https://qmro.qmul.ac.uk/xmlui/handle/123456789/100872

Mguni D, Jafferjee T, Wang J, Perez-Nieves N, Tong F, Li Y, Zhu J (2022). LIGS: Learnable Intrinsic-Reward Generation Selection for Multi-Agent Learning. Conference: International Conference on Learning Representations.

https://qmro.qmul.ac.uk/xmlui/handle/123456789/100871

Mguni D, Perez Nieves N, Wang J. Apparatus and method for automated reward shaping. No. 18365818.

Mguni D, Sun Y, Chen H, Yang W, Darabi A, Orimoloye L, Yang Y. Learning Robust Multi-Agent Policies via Selective Adversarial Fault Induction. Conference: Proceedings of the 43rd International Conference on Machine Learning, Seoul, South Korea. PMLR 306, 2026.
