Ziyan Wang

Ph.D. Candidate · Cooperative AI Lab · King's College London

Email: ziyan.wang[at]kcl[dot]ac[dot]uk

Research overview

I am a fourth-year Ph.D. candidate at the Cooperative AI Lab, King's College London, supervised by Dr Yali Du and Prof. Sanjay Modgil. My work studies how learning agents can coordinate, communicate, and act safely in complex environments.

RL and MARL. I develop learning methods for agents that coordinate under feedback, constraints, and strategic interaction, including M3HF, MACCA, Fisher Decorator, policy learning from tutorial books, GRD, and MACPO.

Multi-LLM agents. I study how language-model agents communicate, reason, remember, and generalize in social or open-ended environments, including Werewolf, ChessGPT, Concordia, BazaarBench, instruction relabeling, and confidence-competence alignment.

Safe and aligned autonomy. I build agents that can follow human intent and respect safety constraints, from SMALL and safe RL with free-form natural-language constraints to Saute RL, safe MARL benchmarks, and current work on multi-agent LLM monitoring.

I am currently an Oxford IDAI Fellow working with Dr Adel Bibi and Prof. Philip Torr, and a research intern in the Future AI Group at Microsoft Research Cambridge, working with Dr Kirill P. Kalinin. I have also visited Carnegie Mellon University with Prof. Fei Fang and worked with Microsoft Research's AI Frontier Group in Redmond.

News

May 2026 Started a research internship at Microsoft Research Cambridge, focusing on multi-agent LLM communication and coordination.
Apr 2026 Memento is now online, exploring how LLMs can manage their own context.
Feb 2026 Started the Oxford IDAI Fellowship at the University of Oxford, working with Dr Adel Bibi and Prof. Philip Torr.
Nov 2025 SMALL has been accepted to AAAI AIA 2026! See you in Singapore!
Sep 2025 Started a research internship at Microsoft in Redmond, focusing on LLM reasoning.
Sep 2025 One paper has been accepted to NeurIPS 2025!

Experience & Visits

Research Internship, Future AI Group

Microsoft Research Cambridge, Cambridge, UK · May 2026 - present

Working with Dr Kirill P. Kalinin on multi-agent LLM communication, coordination, and collaborative agent behavior.

Oxford IDAI Fellowship

University of Oxford, Oxford, UK · Feb. 2026 - present

Working with Dr Adel Bibi and Prof. Philip Torr on real-time multi-agent LLM anomaly detection and monitoring.

Research Internship, AI Frontier Group

Microsoft Research, Redmond, US · Sep. 2025 - Dec. 2025

Worked with Vaishnavi Shrivastava and Prof. Dimitris Papailiopoulos on LLM pre-training and reasoning.

Visiting Ph.D. Student

Carnegie Mellon University, Pittsburgh, US · Feb. 2025 - Jun. 2025

Visited Prof. Fei Fang's group, working on multi-agent learning and AI for social impact.

Selected Publications

* equal contribution, corresponding author

  1. Safe Multi-agent Reinforcement Learning with Natural Language Constraints
    AAAI'26

    Ziyan Wang, Meng Fang, Tristan Tomilin, Fei Fang and Yali Du
    Alignment Track of the 40th Annual AAAI Conference on Artificial Intelligence (AAAI) 2026
  2. M3HF: Multi-agent Reinforcement Learning from Multi-phase Human Feedback of Mixed Quality
    ICML'25

    Ziyan Wang, Zhicheng Zhang, Fei Fang and Yali Du
    Forty-Second International Conference on Machine Learning (ICML) 2025
  3. MACCA: Offline Multi-agent Reinforcement Learning with Causal Credit Assignment
    TMLR

    Ziyan Wang, Yali Du, Yudi Zhang, Meng Fang and Biwei Huang
    Transactions on Machine Learning Research (TMLR) 2025
  4. Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting
    NeurIPS'24 · Oral

    Xiong-Hui Chen*, Ziyan Wang*, Yali Du, Shengyi Jiang, Meng Fang, Yang Yu and Jun Wang
    The Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS) 2024
  5. Learning to Discuss Strategically: A Case Study on One Night Ultimate Werewolf
    NeurIPS'24

    Xuanfa Jin*, Ziyan Wang*, Yali Du, Meng Fang, Haifeng Zhang and Jun Wang
    The Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS) 2024
  6. ChessGPT: Bridging Policy Learning and Language Modeling
    NeurIPS'23

    Xidong Feng, Yicheng Luo, Ziyan Wang, Hongrui Tang, Mengyue Yang, Kun Shao, David Mguni, Yali Du and Jun Wang
    The Thirty-Seventh Annual Conference on Neural Information Processing Systems (NeurIPS) 2023

Honors & Teaching

  • Honors: Oxford IDAI Fellowship, NeurIPS 2024 Scholar Award, NeurIPS 2024 Oral Presentation, King’s PhD Scholarship, Hungarian State Scholarship
  • Teaching: Oxford Machine Learning Summer School, Oxford MLx Fundamentals Summer School, and Optimisation Methods at King’s College London

Professional Services

  • Conference reviewer for ICML 2023/24/25/26, NeurIPS 2023/24/25/26, ICLR 2024/25/26, AISTATS 2025/26, and AAMAS 2025/26
  • Journal reviewer for IEEE Robotics and Automation Letters, IEEE Transactions on Knowledge and Data Engineering, and IEEE Transactions on Artificial Intelligence