# OpenCDA-MARL API Overview
OpenCDA-MARL extends the OpenCDA framework with Multi-Agent Reinforcement Learning capabilities for autonomous driving research. This API documentation covers the MARL-specific components and interfaces built on top of the OpenCDA foundation.
**Implementation Status**

Core Framework: Complete. Architecture, RL algorithms, training infrastructure, and evaluation tools are all implemented and ready for research use.
## Architecture & Implementation
```text
opencda_marl/
├── coordinator.py               # Central MARL orchestrator
├── core/
│   ├── agent_manager.py         # Vehicle spawning & adapter management
│   ├── adapter/
│   │   └── vehicle_adapter.py   # Vehicle-agent bridge
│   ├── agents/                  # Agent implementations (factory pattern)
│   ├── marl/                    # RL algorithms & training infrastructure
│   │   ├── marl_manager.py      # Algorithm orchestrator
│   │   ├── extractor.py         # Observation feature extraction
│   │   ├── metrics.py           # Training metrics & CSV export
│   │   ├── checkpoint.py        # Model checkpoint management
│   │   └── algorithms/          # TD3, DQN, Q-Learning, MAPPO, SAC
│   ├── safety/                  # Collision detection & avoidance
│   └── traffic/                 # Traffic flow management & replay
├── envs/
│   ├── marl_env.py              # Custom CARLA RL environment
│   ├── sumo_marl_env.py         # SUMO-only environment
│   ├── evaluation.py            # Episode evaluation
│   └── cross_agent_evaluator.py # Cross-agent performance comparison
├── gui/                         # PySide6 Qt dashboard
├── scenarios/                   # Scenario templates & management
└── configs/                     # MARL-specific YAML configurations
```
| Component | Status | Description |
|---|---|---|
| Coordinator | ✅ Complete | Central orchestrator for MARL execution |
| Scenario System | ✅ Complete | Template-based scenario generation |
| Agent Manager | ✅ Complete | Multi-agent lifecycle management |
| Map Manager | ✅ Complete | Junction-based spawn point generation |
| Vehicle Adapter | ✅ Complete | RL-OpenCDA vehicle bridge |
| Environment Interface | ✅ Complete | Custom MARL environment (non-Gym) |
| Training Infrastructure | ✅ Complete | Algorithms, checkpoints, metrics |
| Traffic Configuration | ✅ Complete | Flow-based traffic generation |
| Benchmark System | ✅ Complete | Cross-agent performance comparison |
## Design Philosophy
- OpenCDA core remains unchanged
- MARL components operate as optional extensions
- Backward compatibility maintained
- Adapter pattern for seamless integration
- Clear separation between OpenCDA and MARL components
- Coordinator-based orchestration
- Template-based scenario generation
- Custom environment with direct CARLA integration
- Five RL algorithms (TD3, DQN, Q-Learning, MAPPO, SAC)
- Flexible experiment configuration via OmegaConf
- Comprehensive callback system for extensibility
- TensorBoard integration and convergence detection
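The callback system and convergence detection mentioned above are not specified in detail here. As an illustration only, a hook-based callback might look like the sketch below; the class and method names (`on_episode_end`, `ConvergenceCallback`) are hypothetical, not the actual OpenCDA-MARL API:

```python
# Illustrative only: hook names are hypothetical, not the real
# OpenCDA-MARL callback interface.
class Callback:
    """Base class; subclasses override the hooks they care about."""

    def on_episode_end(self, episode, metrics):
        pass

    def on_update(self, step, losses):
        pass


class ConvergenceCallback(Callback):
    """Flags convergence when the moving-average reward stops improving."""

    def __init__(self, window=10, min_delta=0.01):
        self.window = window
        self.min_delta = min_delta
        self.rewards = []
        self.converged = False

    def on_episode_end(self, episode, metrics):
        self.rewards.append(metrics["reward"])
        if len(self.rewards) >= 2 * self.window:
            # Compare the two most recent reward windows.
            recent = sum(self.rewards[-self.window:]) / self.window
            previous = sum(self.rewards[-2 * self.window:-self.window]) / self.window
            if abs(recent - previous) < self.min_delta:
                self.converged = True
```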
## Core Implementation
### 1. MARL Coordinator
The central orchestrator provides unified control over multi-agent scenarios:
```python
from omegaconf import OmegaConf

from opencda_marl.coordinator import MARLCoordinator

# Load and merge configuration (later files override earlier ones)
config = OmegaConf.merge(
    OmegaConf.load("configs/marl/default.yaml"),
    OmegaConf.load("configs/marl/td3_simple_v4.yaml"),
)

# Create and initialize the coordinator
coordinator = MARLCoordinator(config=config)
coordinator.initialize()

# Run training
coordinator.run()

# Or run with the GUI dashboard
coordinator.run_gui_mode()
```
### 2. Scenario System
Scenarios are generated from templates using a factory pattern:
```python
from opencda_marl.scenarios.scenario_builder import ScenarioBuilder

# Build a scenario from configuration
# (cav_world is OpenCDA's shared CavWorld instance)
builder = ScenarioBuilder()
scenario_manager = builder.build_from_config(config, cav_world)
```
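To make the factory pattern behind template-based generation concrete, here is a minimal registry sketch. The registry, decorator, and template names below are illustrative assumptions, not the real `opencda_marl.scenarios` internals:

```python
# Sketch of a template registry for factory-style scenario creation.
# Names here (register_template, "intersection") are hypothetical.
SCENARIO_TEMPLATES = {}


def register_template(name):
    """Class decorator that registers a scenario template by name."""
    def decorator(cls):
        SCENARIO_TEMPLATES[name] = cls
        return cls
    return decorator


@register_template("intersection")
class IntersectionScenario:
    def __init__(self, num_cavs=2):
        self.num_cavs = num_cavs


def build_scenario(config):
    """Look up the template named in the config and instantiate it."""
    template_cls = SCENARIO_TEMPLATES[config["template"]]
    return template_cls(**config.get("params", {}))
```

Registering templates by name keeps scenario construction data-driven: the YAML configuration only needs a template name and parameters, and new scenario types plug in without touching the builder.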
### 3. Environment Interface
Custom MARL environment with direct CARLA integration:
```python
from opencda_marl.envs.marl_env import MARLEnv

# Created automatically by the coordinator
env = MARLEnv(scenario_manager, config=marl_config)

# Step cycle: observe → act → reward → learn
# (handled internally by coordinator.run())
```
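The observe → act → reward → learn cycle handled by `coordinator.run()` can be sketched as the loop below. This is an illustration under assumptions: the per-agent dict signatures of `reset`/`step` and the helper name `run_episode` are hypothetical, while `select_action`, `store_transition`, and `update` mirror the manager API shown later in this document:

```python
# Illustrative training loop; env.reset()/env.step() signatures are
# assumed to return per-agent dicts, which is not guaranteed by the
# actual (non-Gym) MARLEnv interface.
def run_episode(env, manager, ego_ids, training=True):
    obs = env.reset()
    done = {eid: False for eid in ego_ids}
    totals = {eid: 0.0 for eid in ego_ids}
    while not all(done.values()):
        # Observe → act: query the algorithm for each live agent.
        actions = {eid: manager.select_action(obs, eid, training=training)
                   for eid in ego_ids if not done[eid]}
        next_obs, rewards, done, info = env.step(actions)
        # Reward → learn: store transitions and trigger an update.
        for eid, action in actions.items():
            manager.store_transition(obs, eid, action, rewards[eid],
                                     next_obs, done[eid])
            totals[eid] += rewards[eid]
        manager.update()
        obs = next_obs
    return totals
```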
### 4. RL Algorithms
All algorithms share a common `BaseAlgorithm` interface:
```python
from opencda_marl.core.marl.marl_manager import MARLManager

# The algorithm is selected from the MARL.algorithm config key
manager = MARLManager(config)

action = manager.select_action(observations, ego_id, training=True)
manager.store_transition(obs, ego_id, action, reward, next_obs, done)
losses = manager.update()
```
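The `store_transition`/`update` split above is the standard off-policy pattern used by algorithms such as TD3, DQN, and SAC: transitions accumulate in a replay buffer, and each update samples a random minibatch from it. A minimal buffer sketch (not the actual `opencda_marl` implementation) might look like:

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity transition store; oldest entries are evicted first.

    Illustrative sketch only, not the opencda_marl buffer.
    """

    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, obs, action, reward, next_obs, done):
        self.buffer.append((obs, action, reward, next_obs, done))

    def sample(self, batch_size):
        batch = random.sample(self.buffer, batch_size)
        # Transpose the list of transitions into per-field tuples.
        obs, actions, rewards, next_obs, dones = zip(*batch)
        return obs, actions, rewards, next_obs, dones

    def __len__(self):
        return len(self.buffer)
```

Sampling uniformly at random breaks the temporal correlation between consecutive transitions, which stabilizes gradient updates for value-based learners.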
## Configuration System
OpenCDA-MARL uses OmegaConf for flexible configuration management: