Lab Overview¶
In this lab, you will solve multiagent search tasks framed as realistic applications rather than board games. Each exercise assesses both implementation and reasoning quality.
Real-World Cases¶
Emergency route planning under adversarial traffic
Cybersecurity defense against adaptive attackers
Warehouse robot dispatch with depth-limited search
Dynamic pricing and bidding with Monte Carlo Tree Search
Strategy tournament and evidence-based analysis
Assessment Focus¶
Correct modeling assumptions
Algorithm implementation quality
Performance metrics and interpretation
Explanation of why a strategy is suitable for a domain
Assessment Rubric¶
Each exercise is graded on both implementation and reasoning.
| Criterion | Weight | What is evaluated |
|---|---|---|
| Correctness | 40% | Algorithms return valid actions/values in the given scenario |
| Efficiency | 20% | Use of pruning/simulation budget and measured performance |
| Modeling quality | 20% | Whether assumptions and state design match the real-world problem |
| Explanation quality | 20% | Clear interpretation of results and algorithm choice rationale |
To score well, include short written explanations after each exercise describing tradeoffs and failure cases.
Exercise 1: Emergency Route Planning with Minimax¶
A city dispatch center must choose an ambulance route while an adversarial traffic model introduces worst-case congestion events. Implement Minimax to choose the safest route under worst-case traffic responses.
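Before wiring Minimax into the dispatch model, it may help to see the bare recursion in isolation. The sketch below runs a generic two-player minimax on a tiny hand-built tree; the tree values, action labels, and helper signatures are illustrative assumptions, not part of the lab classes.

```python
def minimax(state, actions, result, utility, is_terminal, maximizing=True):
    """Generic minimax over alternating MAX/MIN plies. Returns (best_value, best_action)."""
    if is_terminal(state):
        return utility(state), None
    best_action = None
    best_value = float('-inf') if maximizing else float('inf')
    for a in actions(state):
        value, _ = minimax(result(state, a), actions, result, utility,
                           is_terminal, not maximizing)
        if (maximizing and value > best_value) or (not maximizing and value < best_value):
            best_value, best_action = value, a
    return best_value, best_action

# Toy tree: MAX picks 'L' or 'R', then MIN picks 'a' or 'b'.
leaves = {('L', 'a'): 3, ('L', 'b'): 5, ('R', 'a'): 2, ('R', 'b'): 9}
value, action = minimax(
    (),                                        # root state: empty action history
    lambda s: ['L', 'R'] if s == () else ['a', 'b'],
    lambda s, a: s + (a,),                     # apply an action to the history
    lambda s: leaves[s],
    lambda s: len(s) == 2,
)
print(value, action)  # -> 3 L  (MIN would punish 'R' with the 2-valued reply)
```

The same pattern maps onto the scenario below: dispatch actions form the MAX layer, traffic events the MIN layer, and utility() scores terminal states.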
import random
class EmergencyRouteState:
"""
Simplified adversarial route-planning state.
MAX = dispatcher selecting route, MIN = adversarial congestion.
"""
def __init__(self, stage='dispatch', route=None, traffic_event=None):
self.stage = stage
self.route = route
self.traffic_event = traffic_event
def available_actions(self):
if self.stage == 'dispatch':
return ['highway', 'arterial', 'local']
if self.stage == 'traffic':
return ['none', 'incident', 'signal_failure']
return []
def apply(self, action):
if self.stage == 'dispatch':
return EmergencyRouteState(stage='traffic', route=action)
if self.stage == 'traffic':
return EmergencyRouteState(stage='terminal', route=self.route, traffic_event=action)
return self
def is_terminal(self):
return self.stage == 'terminal'
def utility(self, dispatcher='MAX'):
# Higher score means faster/safer response
base = {'highway': 80, 'arterial': 70, 'local': 60}[self.route]
penalty = {'none': 0, 'incident': 25, 'signal_failure': 15}[self.traffic_event]
return base - penalty
def minimax_route(state, maximizing=True):
"""
TODO: Implement Minimax for emergency route planning.
Return: (best_value, best_action)
"""
pass
# Quick sanity scenario
root = EmergencyRouteState()
print('Emergency route planning scenario loaded.')
print('Dispatcher actions:', root.available_actions())
Exercise 2: Cybersecurity Defense with Alpha-Beta¶
Model a defender-attacker interaction where the defender selects mitigations and the attacker selects exploit paths. Implement Alpha-Beta pruning and compare efficiency against plain Minimax.
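For orientation, here is a generic Alpha-Beta sketch with a leaf counter on a hand-built toy tree. The tree, counter, and helper signatures are illustrative assumptions; the lab version should additionally count pruned branches and run on CyberDefenseState.

```python
def alpha_beta(state, actions, result, utility, is_terminal,
               alpha=float('-inf'), beta=float('inf'), maximizing=True, stats=None):
    """Generic alpha-beta. Returns (best_value, best_action); stats counts leaves."""
    if stats is None:
        stats = {'evaluated': 0}
    if is_terminal(state):
        stats['evaluated'] += 1
        return utility(state), None
    best_action = None
    value = float('-inf') if maximizing else float('inf')
    for a in actions(state):
        child, _ = alpha_beta(result(state, a), actions, result, utility,
                              is_terminal, alpha, beta, not maximizing, stats)
        if maximizing:
            if child > value:
                value, best_action = child, a
            alpha = max(alpha, value)
        else:
            if child < value:
                value, best_action = child, a
            beta = min(beta, value)
        if alpha >= beta:
            break                      # the opponent will never allow this branch
    return value, best_action

# Same shape as a small defense/attack payoff matrix: MAX chooses, then MIN replies.
leaves = {('L', 'a'): 3, ('L', 'b'): 5, ('R', 'a'): 2, ('R', 'b'): 9}
stats = {'evaluated': 0}
value, action = alpha_beta(
    (),
    lambda s: ['L', 'R'] if s == () else ['a', 'b'],
    lambda s, a: s + (a,),
    lambda s: leaves[s],
    lambda s: len(s) == 2,
    stats=stats,
)
print(value, action, stats['evaluated'])  # -> 3 L 3  (the ('R','b') leaf is pruned)
```

Even on this four-leaf tree, one leaf is skipped; your compare function should report analogous counts for Minimax vs. Alpha-Beta.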
import time
class CyberDefenseState:
"""
Defender (MAX) picks control policy; attacker (MIN) picks attack path.
"""
def __init__(self, stage='defense', defense=None, attack=None):
self.stage = stage
self.defense = defense
self.attack = attack
def available_actions(self):
if self.stage == 'defense':
return ['patch_critical', 'harden_network', 'increase_monitoring']
if self.stage == 'attack':
return ['phishing_chain', 'credential_stuffing', 'lateral_movement']
return []
def apply(self, action):
if self.stage == 'defense':
return CyberDefenseState(stage='attack', defense=action)
if self.stage == 'attack':
return CyberDefenseState(stage='terminal', defense=self.defense, attack=action)
return self
def is_terminal(self):
return self.stage == 'terminal'
def utility(self):
impact = {
('patch_critical', 'phishing_chain'): 60,
('patch_critical', 'credential_stuffing'): 75,
('patch_critical', 'lateral_movement'): 45,
('harden_network', 'phishing_chain'): 70,
('harden_network', 'credential_stuffing'): 55,
('harden_network', 'lateral_movement'): 80,
('increase_monitoring', 'phishing_chain'): 85,
('increase_monitoring', 'credential_stuffing'): 65,
('increase_monitoring', 'lateral_movement'): 60,
}
return impact[(self.defense, self.attack)]
def alpha_beta_cyber(state, alpha, beta, maximizing=True):
"""
TODO: Implement Alpha-Beta pruning for cyber defense planning.
Return: (best_value, best_action, nodes_evaluated, nodes_pruned)
"""
pass
def compare_cyber_algorithms(state):
"""TODO: Compare Minimax vs Alpha-Beta on the same cyber state."""
pass
cyber_root = CyberDefenseState()
print('Cybersecurity planning scenario loaded.')
Exercise 3: Warehouse Robot Dispatch (Depth-Limited Search)¶
A warehouse robot must choose task dispatch actions while uncertainty in congestion and battery constraints affects downstream outcomes. Build a depth-limited search agent with a practical evaluation function.
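A practical evaluation function is often just a weighted sum of domain features. The sketch below is one such linear form; the weights are assumptions for demonstration and should be tuned empirically, not treated as a reference answer.

```python
def example_evaluate(backlog, battery, congestion):
    """Illustrative linear evaluation: reward cleared backlog, remaining charge,
    and calm aisles. The weights (-5.0, 0.5, -2.0) are made-up assumptions."""
    return -5.0 * backlog + 0.5 * battery - 2.0 * congestion

# A partially cleared backlog can outscore a fresh, fully charged start.
print(example_evaluate(backlog=12, battery=100, congestion=0))  # -> -10.0
print(example_evaluate(backlog=4, battery=60, congestion=3))    # -> 4.0
```

A good lab answer explains why each feature and sign was chosen, and what behavior changes when a weight is doubled.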
class WarehouseDispatchState:
def __init__(self, backlog=12, battery=100, congestion=0, depth=0, max_depth=4):
self.backlog = backlog
self.battery = battery
self.congestion = congestion
self.depth = depth
self.max_depth = max_depth
def available_actions(self):
return ['shortest_path', 'energy_saving_route', 'batch_pick_strategy']
def apply(self, action):
# Simple dynamics for lab experimentation
if action == 'shortest_path':
return WarehouseDispatchState(
backlog=max(0, self.backlog - 4),
battery=max(0, self.battery - 18),
congestion=min(10, self.congestion + 2),
depth=self.depth + 1,
max_depth=self.max_depth,
)
if action == 'energy_saving_route':
return WarehouseDispatchState(
backlog=max(0, self.backlog - 2),
battery=max(0, self.battery - 8),
congestion=max(0, self.congestion - 1),
depth=self.depth + 1,
max_depth=self.max_depth,
)
return WarehouseDispatchState(
backlog=max(0, self.backlog - 3),
battery=max(0, self.battery - 12),
congestion=min(10, self.congestion + 1),
depth=self.depth + 1,
max_depth=self.max_depth,
)
def is_terminal(self):
return self.backlog == 0 or self.battery <= 5 or self.depth >= self.max_depth
def evaluate_dispatch(state):
"""
TODO: Design evaluation function.
Suggested factors: lower backlog, higher battery, lower congestion.
Return higher score for better states.
"""
pass
def alpha_beta_depth_limited_dispatch(state, alpha, beta, depth, max_depth, maximizing):
"""
TODO: Implement depth-limited alpha-beta using evaluate_dispatch().
"""
pass
dispatch_state = WarehouseDispatchState()
print('Warehouse dispatch scenario loaded.')
Exercise 4: Dynamic Pricing and Bidding with MCTS¶
Simulate a pricing agent in a competitive market where outcomes are uncertain and the branching factor is large. Implement MCTS with UCB1 to choose robust actions under a limited simulation budget.
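The selection phase balances exploitation (average reward) against exploration (visit counts) via UCB1. Here is the formula in isolation; the win/visit statistics below are made-up examples.

```python
import math

def ucb1(wins, visits, parent_visits, c=1.41):
    """UCB1 = wins/visits + c * sqrt(ln(parent_visits) / visits)."""
    if visits == 0:
        return float('inf')      # unvisited children are always expanded first
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)

# Two children with the same 0.6 win rate but different sample counts.
well_explored = ucb1(wins=30, visits=50, parent_visits=100)
less_explored = ucb1(wins=6, visits=10, parent_visits=100)
print(round(well_explored, 3), round(less_explored, 3))
# Equal win rates, but the less-visited child gets a larger exploration bonus.
```

This is why MCTS keeps revisiting under-sampled branches even when their current averages are unremarkable.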
import math
import random
class PricingMarketState:
"""
Dynamic pricing state for MCTS.
Agent chooses price adjustment; market uncertainty responds.
"""
def __init__(self, demand_level=0.6, competitor_pressure=0.5, step=0, max_steps=6):
self.demand_level = demand_level
self.competitor_pressure = competitor_pressure
self.step = step
self.max_steps = max_steps
def available_moves(self):
return ['price_up', 'price_down', 'hold']
def apply(self, action):
demand = self.demand_level
pressure = self.competitor_pressure
if action == 'price_up':
demand = max(0.0, demand - 0.1)
elif action == 'price_down':
demand = min(1.0, demand + 0.1)
# stochastic market reaction will be applied during simulation
return PricingMarketState(demand, pressure, self.step + 1, self.max_steps)
def is_terminal(self):
return self.step >= self.max_steps
def reward(self):
margin = 0.5 + (0.2 if self.demand_level < 0.5 else -0.1)
revenue_signal = self.demand_level * (1.0 - self.competitor_pressure)
return 100 * (margin + revenue_signal)
class MCTSNode:
def __init__(self, state, parent=None, action=None):
self.state = state
self.parent = parent
self.action = action
self.children = []
self.wins = 0.0
self.visits = 0
self.untried_actions = state.available_moves()[:]
def ucb1(self, c=1.41):
"""
TODO: Calculate UCB1 value.
UCB1 = wins/visits + c * sqrt(ln(parent_visits) / visits)
"""
pass
def select_child(self):
return max(self.children, key=lambda child: child.ucb1())
def add_child(self, action, state):
child = MCTSNode(state, parent=self, action=action)
self.untried_actions.remove(action)
self.children.append(child)
return child
def update(self, result):
self.visits += 1
self.wins += result
def mcts_search(root_state, num_simulations=1000):
"""
TODO: Implement MCTS algorithm with four phases:
1. Selection
2. Expansion
3. Simulation
4. Backpropagation
Return best action from root.
"""
pass
def simulate_random_market(state):
"""TODO: Roll out random pricing sequence until terminal and return total reward."""
pass
market = PricingMarketState()
print('Dynamic pricing MCTS scenario loaded.')
Exercise 5: Cross-Domain Strategy Tournament¶
Evaluate strategies across multiple domains and report strengths, weaknesses, and conditions where each algorithm is preferable.
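One way to structure the tournament loop, independent of any particular scenario: every ordered pair of strategies plays a fixed number of matches and average scores are tracked per strategy. The play_once stub and its 70% win rate are illustrative assumptions only.

```python
import itertools
import random

def round_robin(strategies, play_once, num_games=20, seed=0):
    """Every ordered pair plays num_games matches; returns mean score per strategy."""
    rng = random.Random(seed)                          # seeded for reproducible reports
    totals = {name: [0.0, 0] for name in strategies}   # [score_sum, games_played]
    for name_a, name_b in itertools.permutations(strategies, 2):
        for _ in range(num_games):
            score_a, score_b = play_once(strategies[name_a], strategies[name_b], rng)
            for name, score in ((name_a, score_a), (name_b, score_b)):
                totals[name][0] += score
                totals[name][1] += 1
    return {name: total / games for name, (total, games) in totals.items()}

# Stub match: the 'bold' tag wins 70% of its games as first mover (made-up numbers).
def play_once(tag_a, tag_b, rng):
    a_wins = rng.random() < (0.7 if tag_a == 'bold' else 0.5)
    return (1.0, 0.0) if a_wins else (0.0, 1.0)

table = round_robin({'Bold': 'bold', 'Timid': 'timid'}, play_once, num_games=50)
print(table)  # average win rate per strategy; with two players the rates sum to 1.0
```

Replace play_once with a wrapper around play_scenario and a scenario-specific scoring rule to produce your comparison table.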
def play_scenario(strategy_a, strategy_b, scenario_factory):
"""
Play one adversarial episode in a scenario.
strategy signatures: strategy(state) -> action
"""
state = scenario_factory()
turn = 0
while not state.is_terminal() and turn < 20:
action = strategy_a(state) if turn % 2 == 0 else strategy_b(state)
state = state.apply(action)
turn += 1
return state
def tournament(strategies, scenario_factory, num_games=20):
"""
TODO: Run round-robin tournament for a given scenario.
Track average reward/outcome for each strategy.
"""
pass
# Example strategy templates for students to replace
strategies = {
'Random': lambda state: random.choice(state.available_actions() if hasattr(state, 'available_actions') else state.available_moves()),
'GreedyPlaceholder': lambda state: (state.available_actions() if hasattr(state, 'available_actions') else state.available_moves())[0],
}
print('Tournament framework loaded. Implement strategy wrappers from your exercises.')
Evaluation Checklist (Student Self-Check)¶
Before submitting, confirm for each exercise:
You implemented all TODO blocks.
You reported at least one quantitative metric (time, nodes, prunes, or simulations).
You explained one limitation or modeling assumption.
You justified why the chosen algorithm fits the scenario.
Challenge: Iterative Deepening for Real-Time Decision Support¶
Design an iterative deepening variant for time-constrained systems (for example, emergency response dispatch or active cyber defense) where decisions must be returned before hard deadlines.
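A minimal sketch of the control loop, assuming each call to decision_fn completes its depth atomically; production systems would also pass the deadline into the search so deep iterations can abort mid-pass. The toy_decision stub and its sleep-based cost model are illustrative assumptions.

```python
import time

def iterative_deepening(state, decision_fn, time_limit=1.0, max_depth=64):
    """Deepen until the deadline passes; keep the deepest completed result."""
    deadline = time.monotonic() + time_limit
    best = None
    for depth in range(1, max_depth + 1):
        value, action = decision_fn(state, depth)   # completes fully once started
        best = (value, action, depth)
        if time.monotonic() >= deadline:
            break                                   # do not start a deeper pass
    return best

def toy_decision(state, depth):
    time.sleep(0.01 * depth)        # deeper searches cost more simulated time
    return depth * 10, 'action@depth%d' % depth

value, action, depth = iterative_deepening(None, toy_decision, time_limit=0.05)
print(value, action, depth)  # deepest depth that finished before the deadline
```

Note the anytime property: there is always a valid action available, and extra time only refines it, which is exactly what hard-deadline dispatch requires.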
def iterative_deepening_decision(initial_state, decision_fn, time_limit=5.0):
"""
CHALLENGE: Implement iterative deepening for time-constrained domains.
Parameters:
- initial_state: domain state
- decision_fn: callable(state, depth) -> (value, action)
- time_limit: seconds
Goal:
- Increase depth progressively until time is exhausted.
- Return best fully-evaluated action found so far.
"""
pass
Lab Report¶
Deliverables¶
All scenario algorithms implemented
Results for all real-world cases
Evaluation function design notes with feature rationale
Performance comparison table with metrics
Short reflection on modeling assumptions and limitations
Discussion Questions¶
How did adversarial assumptions change your decisions in each domain?
Where did Alpha-Beta provide the largest benefit and why?
Which evaluation features were most influential in depth-limited search?
When did MCTS outperform deterministic search methods?
What are the risks of model misspecification in real-world multiagent systems?