Socio-Structural AI · Multi-Agent Coordination

Reliable cooperation emerges from invisible structures.

We scale decentralized teams to 50 agents by designing the constraints, incentives, and information topologies that guide interaction. This page highlights the architecture, analysis, and episodic outcomes behind our ICRA 2026 submission on Graph Attention MARL (GAT-MA).

Graph-restricted policies · Topology-aware safety · Prosocial norms · Vision-language grounding
Graph Attention MARL architecture overview

Research thread

Reliability in decentralized fleets comes from shaping the communication graph and structural constraints rather than micromanaging agents.

We evaluate GAT-MA against MAPPO across 3–50 agent teams, probing attention topology, scaling behavior, and safety envelopes.

Paper snapshot

ICRA 2026 submission: Understanding Graph Attention for Learning Multi-Agent Coordination.

Three directions: safe continual coordination, grounded spatial communication with VLMs, and emergent prosocial norms for heterogeneous populations.

Read the paper (PDF)

Cite as:
Wei-Han Tu, Ya-Chien Chang, and Sicun Gao. (2026). "Understanding Graph Attention for Learning Multi-Agent Coordination." IEEE International Conference on Robotics and Automation 2026 [In Submission].

Design lens

Inspired by urban design: we engineer the “roads” of information flow so large teams stay coherent under partial observability and changing dynamics.

Architecture and topology

GAT-MA policy head with attention graph
Graph attention head imposes a sparse, efficient information topology that scales beyond 30 agents.
Communication network comparison between MAPPO and GAT-MA
Communication patterns: GAT-MA learns focused communication corridors, which enables faster consensus, whereas MAPPO's communication stays diffuse.
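To make the idea of a sparse information topology concrete, here is a minimal sketch of attention restricted to a fixed neighbor set. It is an illustration under stated assumptions, not the GAT-MA implementation: the k-nearest-neighbor rule, the raw score matrix, and the names `knn_mask` and `masked_attention` are all hypothetical.

```python
import math


def knn_mask(positions, k):
    """Return, for each agent, the indices of its k nearest other agents.

    Assumption: neighborhoods are chosen by Euclidean distance. The source
    does not state how the attention graph is restricted.
    """
    neighbors = []
    for i, (xi, yi) in enumerate(positions):
        dists = [
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(positions)
            if j != i
        ]
        dists.sort()
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def masked_attention(scores, neighbors):
    """Softmax each agent's raw scores over its allowed neighbors only.

    Entries outside the neighbor set stay exactly zero, which is what
    makes the resulting attention graph sparse.
    """
    n = len(scores)
    weights = [[0.0] * n for _ in range(n)]
    for i, allowed in enumerate(neighbors):
        if not allowed:
            continue
        m = max(scores[i][j] for j in allowed)  # subtract max for stability
        exps = {j: math.exp(scores[i][j] - m) for j in allowed}
        total = sum(exps.values())
        for j, e in exps.items():
            weights[i][j] = e / total
    return weights


# Hypothetical toy data: four agents on a plane with made-up raw scores.
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
scores = [
    [0.0, 2.0, 1.0, 3.0],
    [2.0, 0.0, 0.5, 1.0],
    [1.0, 0.5, 0.0, 0.2],
    [3.0, 1.0, 0.2, 0.0],
]
neighbors = knn_mask(positions, k=2)
weights = masked_attention(scores, neighbors)
```

In this toy case agent 0 gives agent 3 the highest raw score, yet agent 3 receives zero weight because it lies outside agent 0's neighborhood. The topology, not the score alone, decides who can be heard.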

Scaling behavior

Topology evolution across 5 to 50 agents
Topology evolution from 5 to 50 agents: hubs emerge to stabilize global intent while preserving locality.
Coordination comparison at 50 agents
50-agent coordination: GAT-MA maintains structured flow under congestion where baselines fragment.
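One way to check for hubs in a learned attention matrix is to sum the attention each agent receives and flag agents that receive far more than average. The sketch below assumes that definition; the source does not say how hubs are identified, and `incoming_attention`, `find_hubs`, and the `ratio` threshold are hypothetical.

```python
def incoming_attention(weights):
    """Total attention each agent receives from all other agents."""
    n = len(weights)
    return [sum(weights[i][j] for i in range(n) if i != j) for j in range(n)]


def find_hubs(weights, ratio=2.0):
    """Flag agents whose incoming attention exceeds ratio times the mean.

    Assumption: the threshold rule is illustrative only.
    """
    mass = incoming_attention(weights)
    mean = sum(mass) / len(mass)
    return [j for j, m in enumerate(mass) if m > ratio * mean]


# Hypothetical 5-agent attention matrix; each row sums to 1.
# Agents 1 to 4 attend mostly to agent 0.
hub_weights = [
    [0.0, 0.25, 0.25, 0.25, 0.25],
    [0.9, 0.0, 0.1, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0, 0.0],
    [0.9, 0.0, 0.0, 0.0, 0.1],
    [0.9, 0.0, 0.0, 0.1, 0.0],
]
hubs = find_hubs(hub_weights)
```

Here agent 0 receives about 3.6 units of attention against a mean of 1.0, so it is the only agent flagged.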

Episode gallery (GIFs)

Side-by-side outcomes across scales. Left-to-right progression shows how graph-structured communication keeps teams coherent as population grows.

3 agents · GAT-MA · success
5 agents · GAT-MA · success
8 agents · GAT-MA · win
12 agents · GAT-MA · win
15 agents · GAT-MA · win
30 agents · GAT-MA · success
50 agents · GAT-MA · success
3 agents · baseline · failure
5 agents · MAPPO · success
8 agents · MAPPO · success
12 agents · MAPPO · win
15 agents · MAPPO · win
30 agents · MAPPO · win
50 agents · MAPPO · success

Analyses

Attention value distributions
Attention value distributions show sparse, high-signal routing compared to dense baselines.
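A common way to quantify how sparse an attention distribution is, offered here as an assumption rather than the metric used in the paper, is normalized Shannon entropy: 0 for a one-hot row, 1 for a uniform row. The function name and the example rows below are hypothetical.

```python
import math


def normalized_entropy(weights, eps=1e-12):
    """Shannon entropy of one attention row divided by log(n).

    Returns a value in [0, 1]: 0 means all weight on one neighbor,
    1 means weight spread uniformly over all n entries.
    """
    n = len(weights)
    if n < 2:
        return 0.0
    h = -sum(w * math.log(w) for w in weights if w > eps)
    return h / math.log(n)


# Made-up rows for illustration, not measured values.
sparse_row = [0.97, 0.01, 0.01, 0.01]
dense_row = [0.25, 0.25, 0.25, 0.25]

sparse_score = normalized_entropy(sparse_row)
dense_score = normalized_entropy(dense_row)
```

A histogram of this score over all agents and timesteps would separate focused routing (scores near 0) from diffuse attention (scores near 1).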
Computational complexity summary
Enforcing locality and bounded degree in the graph keeps computational complexity manageable.
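The effect of a degree bound can be seen with simple edge counting. Dense all-to-all attention uses N(N-1) directed edges per layer, while capping each agent at k neighbors uses at most N·k. The sketch below tabulates both for the team sizes shown in the gallery; the cap of 4 is a hypothetical value, not one reported in the paper.

```python
def attention_edges(num_agents, max_degree=None):
    """Directed attention edges per layer.

    Dense attention when max_degree is None, otherwise each agent
    attends to at most max_degree other agents.
    """
    dense = num_agents * (num_agents - 1)
    if max_degree is None:
        return dense
    return num_agents * min(max_degree, num_agents - 1)


# Team sizes from the episode gallery; max_degree=4 is an assumed cap.
edge_table = {
    n: (attention_edges(n), attention_edges(n, max_degree=4))
    for n in (3, 5, 8, 12, 15, 30, 50)
}
for n, (dense, capped) in edge_table.items():
    print(f"{n:>2} agents: dense={dense:>4}  capped={capped:>3}")
```

At 50 agents the dense graph has 2450 edges against 200 with the assumed cap, so cost grows linearly with team size rather than quadratically.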

Artifacts

Key figures are mirrored under assets/img/ for the site. Repository sources remain in docs/paper/assets/.