
Chapter 3 Lab: Multiagent Search (Real-World Assessment)

Lab Overview

In this lab, you will solve multiagent search tasks framed as realistic applications rather than board games. Each exercise assesses both implementation and reasoning quality.

Real-World Cases

  1. Emergency route planning under adversarial traffic

  2. Cybersecurity defense against adaptive attackers

  3. Warehouse robot dispatch with depth-limited search

  4. Dynamic pricing and bidding with Monte Carlo Tree Search

  5. Strategy tournament and evidence-based analysis

Assessment Focus

  • Correct modeling assumptions

  • Algorithm implementation quality

  • Performance metrics and interpretation

  • Explanation of why a strategy is suitable for a domain

Assessment Rubric

Each exercise is graded on both implementation and reasoning.

Criterion             Weight   What is evaluated
Correctness           40%      Algorithms return valid actions/values in the given scenario
Efficiency            20%      Use of pruning/simulation budget and measured performance
Modeling quality      20%      Whether assumptions and state design match the real-world problem
Explanation quality   20%      Clear interpretation of results and algorithm choice rationale

To score well, include short written explanations after each exercise describing tradeoffs and failure cases.

Exercise 1: Emergency Route Planning with Minimax

A city dispatch center must choose an ambulance route while an adversarial traffic model introduces worst-case congestion events. Implement Minimax to choose the safest route under worst-case traffic responses.

import random

class EmergencyRouteState:
    """
    Simplified adversarial route-planning state.
    MAX = dispatcher selecting route, MIN = adversarial congestion.
    """
    def __init__(self, stage='dispatch', route=None, traffic_event=None):
        self.stage = stage
        self.route = route
        self.traffic_event = traffic_event

    def available_actions(self):
        if self.stage == 'dispatch':
            return ['highway', 'arterial', 'local']
        if self.stage == 'traffic':
            return ['none', 'incident', 'signal_failure']
        return []

    def apply(self, action):
        if self.stage == 'dispatch':
            return EmergencyRouteState(stage='traffic', route=action)
        if self.stage == 'traffic':
            return EmergencyRouteState(stage='terminal', route=self.route, traffic_event=action)
        return self

    def is_terminal(self):
        return self.stage == 'terminal'

    def utility(self):
        # Higher score means a faster/safer response for the dispatcher (MAX)
        base = {'highway': 80, 'arterial': 70, 'local': 60}[self.route]
        penalty = {'none': 0, 'incident': 25, 'signal_failure': 15}[self.traffic_event]
        return base - penalty


def minimax_route(state, maximizing=True):
    """
    TODO: Implement Minimax for emergency route planning.
    Return: (best_value, best_action)
    """
    pass


# Quick sanity scenario
root = EmergencyRouteState()
print('Emergency route planning scenario loaded.')
print('Dispatcher actions:', root.available_actions())

Exercise 2: Cybersecurity Defense with Alpha-Beta

Model a defender-attacker interaction where the defender selects mitigations and the attacker selects exploit paths. Implement Alpha-Beta pruning and compare efficiency against plain Minimax.

import time

class CyberDefenseState:
    """
    Defender (MAX) picks control policy; attacker (MIN) picks attack path.
    """
    def __init__(self, stage='defense', defense=None, attack=None):
        self.stage = stage
        self.defense = defense
        self.attack = attack

    def available_actions(self):
        if self.stage == 'defense':
            return ['patch_critical', 'harden_network', 'increase_monitoring']
        if self.stage == 'attack':
            return ['phishing_chain', 'credential_stuffing', 'lateral_movement']
        return []

    def apply(self, action):
        if self.stage == 'defense':
            return CyberDefenseState(stage='attack', defense=action)
        if self.stage == 'attack':
            return CyberDefenseState(stage='terminal', defense=self.defense, attack=action)
        return self

    def is_terminal(self):
        return self.stage == 'terminal'

    def utility(self):
        impact = {
            ('patch_critical', 'phishing_chain'): 60,
            ('patch_critical', 'credential_stuffing'): 75,
            ('patch_critical', 'lateral_movement'): 45,
            ('harden_network', 'phishing_chain'): 70,
            ('harden_network', 'credential_stuffing'): 55,
            ('harden_network', 'lateral_movement'): 80,
            ('increase_monitoring', 'phishing_chain'): 85,
            ('increase_monitoring', 'credential_stuffing'): 65,
            ('increase_monitoring', 'lateral_movement'): 60,
        }
        return impact[(self.defense, self.attack)]


def alpha_beta_cyber(state, alpha, beta, maximizing=True):
    """
    TODO: Implement Alpha-Beta pruning for cyber defense planning.
    Return: (best_value, best_action, nodes_evaluated, nodes_pruned)
    """
    pass


def compare_cyber_algorithms(state):
    """TODO: Compare Minimax vs Alpha-Beta on the same cyber state."""
    pass


cyber_root = CyberDefenseState()
print('Cybersecurity planning scenario loaded.')

Exercise 3: Warehouse Robot Dispatch with Depth-Limited Search

A warehouse robot must choose task dispatch actions while uncertainty in congestion and battery constraints affects downstream outcomes. Build a depth-limited search agent with a practical evaluation function.

class WarehouseDispatchState:
    def __init__(self, backlog=12, battery=100, congestion=0, depth=0, max_depth=4):
        self.backlog = backlog
        self.battery = battery
        self.congestion = congestion
        self.depth = depth
        self.max_depth = max_depth

    def available_actions(self):
        return ['shortest_path', 'energy_saving_route', 'batch_pick_strategy']

    def apply(self, action):
        # Simple dynamics for lab experimentation
        if action == 'shortest_path':
            return WarehouseDispatchState(
                backlog=max(0, self.backlog - 4),
                battery=max(0, self.battery - 18),
                congestion=min(10, self.congestion + 2),
                depth=self.depth + 1,
                max_depth=self.max_depth,
            )
        if action == 'energy_saving_route':
            return WarehouseDispatchState(
                backlog=max(0, self.backlog - 2),
                battery=max(0, self.battery - 8),
                congestion=max(0, self.congestion - 1),
                depth=self.depth + 1,
                max_depth=self.max_depth,
            )
        return WarehouseDispatchState(
            backlog=max(0, self.backlog - 3),
            battery=max(0, self.battery - 12),
            congestion=min(10, self.congestion + 1),
            depth=self.depth + 1,
            max_depth=self.max_depth,
        )

    def is_terminal(self):
        return self.backlog == 0 or self.battery <= 5 or self.depth >= self.max_depth


def evaluate_dispatch(state):
    """
    TODO: Design evaluation function.
    Suggested factors: lower backlog, higher battery, lower congestion.
    Return higher score for better states.
    """
    pass


def alpha_beta_depth_limited_dispatch(state, alpha, beta, depth, max_depth, maximizing):
    """
    TODO: Implement depth-limited alpha-beta using evaluate_dispatch().
    """
    pass


dispatch_state = WarehouseDispatchState()
print('Warehouse dispatch scenario loaded.')
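One way to reason about `evaluate_dispatch` before committing to weights: the sketch below mirrors the lab's dispatch dynamics in a compact functional form and runs a plain depth-limited maximizing lookahead with an illustrative linear evaluation. The weights (−5, +0.5, −2) are arbitrary starting points for experimentation, not a recommended design.

```python
# Compact functional mirror of the lab's dispatch dynamics (same numbers):
# each action maps to (backlog delta, battery delta, congestion delta)
ACTIONS = {
    'shortest_path':       (-4, -18, +2),
    'energy_saving_route': (-2,  -8, -1),
    'batch_pick_strategy': (-3, -12, +1),
}


def step(state, action):
    backlog, battery, congestion = state
    db, de, dc = ACTIONS[action]
    return (max(0, backlog + db), max(0, battery + de),
            min(10, max(0, congestion + dc)))


def evaluate(state):
    backlog, battery, congestion = state
    # Illustrative linear weights -- tune these in your own design notes
    return -5.0 * backlog + 0.5 * battery - 2.0 * congestion


def depth_limited_best(state, depth):
    """Depth-limited lookahead that scores frontier states with evaluate()."""
    backlog, battery, _ = state
    if backlog == 0 or battery <= 5 or depth == 0:
        return evaluate(state), None
    best_value, best_action = float('-inf'), None
    for action in ACTIONS:
        value, _ = depth_limited_best(step(state, action), depth - 1)
        if value > best_value:
            best_value, best_action = value, action
    return best_value, best_action


print(depth_limited_best((12, 100, 0), 1))  # → (-3.0, 'shortest_path')
print(depth_limited_best((12, 100, 0), 4))
```

A useful experiment: change the backlog weight and observe at what point the agent abandons `shortest_path` in favor of `energy_saving_route`. That sensitivity is exactly the kind of observation the rubric's "evaluation function design notes" are asking for.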

Exercise 4: Dynamic Pricing and Bidding with MCTS

Simulate a pricing agent in a competitive market where outcomes are uncertain and the branching factor is large. Implement MCTS with UCB1 to choose robust actions under a limited simulation budget.

import math
import random

class PricingMarketState:
    """
    Dynamic pricing state for MCTS.
    Agent chooses price adjustment; market uncertainty responds.
    """
    def __init__(self, demand_level=0.6, competitor_pressure=0.5, step=0, max_steps=6):
        self.demand_level = demand_level
        self.competitor_pressure = competitor_pressure
        self.step = step
        self.max_steps = max_steps

    def available_moves(self):
        return ['price_up', 'price_down', 'hold']

    def apply(self, action):
        demand = self.demand_level
        pressure = self.competitor_pressure
        if action == 'price_up':
            demand = max(0.0, demand - 0.1)
        elif action == 'price_down':
            demand = min(1.0, demand + 0.1)
        # stochastic market reaction will be applied during simulation
        return PricingMarketState(demand, pressure, self.step + 1, self.max_steps)

    def is_terminal(self):
        return self.step >= self.max_steps

    def reward(self):
        margin = 0.5 + (0.2 if self.demand_level < 0.5 else -0.1)
        revenue_signal = self.demand_level * (1.0 - self.competitor_pressure)
        return 100 * (margin + revenue_signal)


class MCTSNode:
    def __init__(self, state, parent=None, action=None):
        self.state = state
        self.parent = parent
        self.action = action
        self.children = []
        self.wins = 0.0
        self.visits = 0
        self.untried_actions = state.available_moves()[:]

    def ucb1(self, c=1.41):
        """
        TODO: Calculate UCB1 value.
        UCB1 = wins/visits + c * sqrt(ln(parent_visits) / visits)
        """
        pass

    def select_child(self):
        return max(self.children, key=lambda child: child.ucb1())

    def add_child(self, action, state):
        child = MCTSNode(state, parent=self, action=action)
        self.untried_actions.remove(action)
        self.children.append(child)
        return child

    def update(self, result):
        self.visits += 1
        self.wins += result


def mcts_search(root_state, num_simulations=1000):
    """
    TODO: Implement MCTS algorithm with four phases:
    1. Selection
    2. Expansion
    3. Simulation
    4. Backpropagation
    Return best action from root.
    """
    pass


def simulate_random_market(state):
    """TODO: Roll out random pricing sequence until terminal and return total reward."""
    pass


market = PricingMarketState()
print('Dynamic pricing MCTS scenario loaded.')
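The UCB1 score in `MCTSNode.ucb1` balances exploitation (mean reward so far) against exploration (an uncertainty bonus). A standalone sketch of the formula, written as a free function so you still wire it into the node class yourself:

```python
import math


def ucb1(wins, visits, parent_visits, c=1.41):
    """UCB1 = wins/visits + c * sqrt(ln(parent_visits) / visits).
    Unvisited children score infinity so each child is tried once first."""
    if visits == 0:
        return float('inf')
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)


# The exploration bonus shrinks as a child accumulates visits
print(ucb1(5.0, 10, 20))  # exploitation 0.5 plus exploration term
print(ucb1(0.0, 0, 20))   # inf: forces initial expansion
```

The guard for `visits == 0` matters in practice: without it, `select_child` would divide by zero on freshly expanded nodes, and with it, MCTS naturally visits every child at least once before exploiting.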

Exercise 5: Cross-Domain Strategy Tournament

Evaluate strategies across multiple domains and report strengths, weaknesses, and conditions where each algorithm is preferable.

def play_scenario(strategy_a, strategy_b, scenario_factory):
    """
    Play one adversarial episode in a scenario.
    strategy signatures: strategy(state) -> action
    """
    state = scenario_factory()
    turn = 0
    while not state.is_terminal() and turn < 20:
        action = strategy_a(state) if turn % 2 == 0 else strategy_b(state)
        state = state.apply(action)
        turn += 1
    return state


def tournament(strategies, scenario_factory, num_games=20):
    """
    TODO: Run round-robin tournament for a given scenario.
    Track average reward/outcome for each strategy.
    """
    pass


# Example strategy templates for students to replace
strategies = {
    'Random': lambda state: random.choice(
        state.available_actions() if hasattr(state, 'available_actions')
        else state.available_moves()
    ),
    'GreedyPlaceholder': lambda state: (
        state.available_actions() if hasattr(state, 'available_actions')
        else state.available_moves()
    )[0],
}


print('Tournament framework loaded. Implement strategy wrappers from your exercises.')
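When wiring your exercise solutions into the tournament, note that the search functions return `(value, action)` pairs while the tournament expects the `strategy(state) -> action` signature. One hypothetical adapter pattern (the helper name and usage line are suggestions, not part of the lab API):

```python
def make_search_strategy(search_fn):
    """Adapt a search function returning (value, action) into a
    tournament strategy with the strategy(state) -> action signature."""
    def strategy(state):
        _, action = search_fn(state)  # discard the value, keep the move
        return action
    return strategy


# Hypothetical usage once minimax_route is implemented:
# strategies['Minimax'] = make_search_strategy(minimax_route)
```

The same wrapper shape works for Alpha-Beta if you first curry away the extra parameters (e.g. with `functools.partial` to fix `alpha`, `beta`, and `maximizing`).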

Evaluation Checklist (Student Self-Check)

Before submitting, confirm for each exercise:

  • You implemented all TODO blocks.

  • You reported at least one quantitative metric (time, nodes, prunes, or simulations).

  • You explained one limitation or modeling assumption.

  • You justified why the chosen algorithm fits the scenario.

Challenge: Iterative Deepening for Real-Time Decision Support

Design an iterative deepening variant for time-constrained systems (for example, emergency response dispatch or active cyber defense) where decisions must be returned before hard deadlines.

def iterative_deepening_decision(initial_state, decision_fn, time_limit=5.0):
    """
    CHALLENGE: Implement iterative deepening for time-constrained domains.

    Parameters:
    - initial_state: domain state
    - decision_fn: callable(state, depth) -> (value, action)
    - time_limit: seconds

    Goal:
    - Increase depth progressively until time is exhausted.
    - Return best fully-evaluated action found so far.
    """
    pass
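One minimal shape of the anytime loop, assuming `decision_fn` finishes each depth quickly enough that checking the clock only between depths is acceptable. A production version would also abort a search mid-depth when the deadline passes, which is a limitation worth discussing in your report; the `max_depth` cap is an addition not in the challenge signature.

```python
import time


def iterative_deepening_sketch(initial_state, decision_fn,
                               time_limit=5.0, max_depth=32):
    """Deepen one ply at a time, keeping the answer from the last
    fully completed depth; return it when the budget is exhausted."""
    deadline = time.monotonic() + time_limit
    best_action, depth = None, 1
    while depth <= max_depth and time.monotonic() < deadline:
        _, action = decision_fn(initial_state, depth)
        best_action = action  # only fully evaluated depths are kept
        depth += 1
    return best_action
```

`time.monotonic()` is used rather than `time.time()` because it cannot jump backwards if the system clock is adjusted, which matters for hard deadlines.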

Lab Report

Deliverables

  • All scenario algorithms implemented

  • Results for all real-world cases

  • Evaluation function design notes with feature rationale

  • Performance comparison table with metrics

  • Short reflection on modeling assumptions and limitations

Discussion Questions

  1. How did adversarial assumptions change your decisions in each domain?

  2. Where did Alpha-Beta provide the largest benefit and why?

  3. Which evaluation features were most influential in depth-limited search?

  4. When did MCTS outperform deterministic search methods?

  5. What are the risks of model misspecification in real-world multiagent systems?