Dissertation
Communication and Generalization in Multi-Agent Learning
Abstract
Multi-agent learning aims to enable artificial intelligence (AI) agents to learn from interactions with other agents in a shared environment. However, as AI is increasingly integrated into real-world systems, significant challenges arise in interacting and communicating robustly with a variety of other agents, particularly in complex settings such as autonomous driving, where humans and AI agents coexist. This dissertation investigates how agents can be trained to communicate effectively with, and generalize to, diverse partners (including humans) in simulations of real-world scenarios.
To address this challenge, the dissertation explores three key dimensions: (1) learning communication-supporting representations that facilitate coordination, (2) developing multi-agent policies that generalize to new teammates or opponents, and (3) learning to collaborate with human-like agents or to use human language. It makes novel contributions along each dimension.
First, the dissertation presents Coopernaut, a framework that learns compact, transmittable representations from local observations to support communication among autonomous vehicles under bandwidth constraints. It also introduces LLM+Debrief, which enables embodied agents to coordinate in driving scenarios by generating and interpreting natural language messages, paving the way for human-compatible agent communication.
Second, it introduces MACTA, a training framework that combines reinforcement learning with game theory to produce robust policies capable of generalizing to unseen and adaptive opponents. It also introduces L-BRDiv, a teammate generation strategy that promotes behavioral diversity during training, improving generalization and performance in ad hoc teamwork settings.
Third, the dissertation investigates mixed-autonomy traffic coordination through decentralized training in environments with both human-proxy and AI agents. Empirical results demonstrate that even a small number of trained autonomous vehicles can collaborate effectively to influence human behavior and improve overall traffic efficiency without requiring centralized control.
Collectively, these contributions advance multi-agent AI by unifying communication, generalization, and human–AI collaboration. Evaluated in both toy domains and realistic simulated environments, with a primary focus on autonomous driving and hardware security, the work demonstrates how agents can adapt to novel partners and communicate effectively in human-interpretable ways.
Structure
Background
Motivation, contributions, and the formal foundations used throughout the dissertation: Markov decision processes (MDPs), partially observable stochastic games, agent populations, and learning objectives.
- Ch. 1 Introduction
- Ch. 2 Background and Notation
Learning to Communicate
From shared latent representations across networked vehicles to explicit natural-language messages between agents driven by large language models.
Learning to Generalize
Training pipelines and theoretical tools that produce agents that remain robust when deployed with unseen co-players, whether adversarial opponents or cooperative teammates.
Learning with Human Proxies
Centralized, modular-transfer, and distributed multi-agent driving policies that coordinate with rule-based human-proxy traffic to dissolve stop-and-go congestion on mixed-autonomy highways.
Related and Future Work
Connections to the broader literature, open problems, and a roadmap toward super-human Pokémon AI, open-ended ad hoc teamwork, multi-agent strategic reasoning for LLMs, and multi-agent collaboration safety.
- Ch. 8 Related Work
- Ch. 9 Future Work
- Ch. 10 Conclusion