Agents Assignments

Overview

In this assignment, you will implement two different types of agents for two distinct problems:

  1. Assignment 1: Reflex Agent for a Traffic Light Controller

  2. Assignment 2: Model-Based Agent for a Security Patrol

Submission Requirements:

  • Complete all TODO sections

  • Run all test cells to verify your implementations

  • Answer the reflection questions at the end

Grading:

  • Assignment 1: 40 points

  • Assignment 2: 40 points

  • Reflection Questions: 20 points

# Setup - Run this cell first
import random
import time
from collections import deque

random.seed(42)
print("✓ Setup complete!")

Assignment 1: Reflex Agent - Traffic Light Controller (40 points)

Problem Description

You will implement a Simple Reflex Agent that controls a traffic light at an intersection. The agent perceives the current traffic conditions and decides which light to show.

Environment

  • 4-way intersection with North-South (NS) and East-West (EW) traffic

  • Each direction has a sensor that detects: 'none', 'light', 'moderate', or 'heavy' traffic

  • The traffic light can be in one of two states: 'NS_GREEN' (North-South green, East-West red) or 'EW_GREEN' (East-West green, North-South red)

Agent Specification

Your reflex agent should follow these rules:

  1. If current light is 'NS_GREEN' and EW traffic is 'heavy' while NS traffic is 'none' or 'light' → Switch to 'EW_GREEN'

  2. If current light is 'EW_GREEN' and NS traffic is 'heavy' while EW traffic is 'none' or 'light' → Switch to 'NS_GREEN'

  3. If both directions have 'heavy' traffic → Keep current state (don’t switch too frequently)

  4. If both directions have 'none' traffic → Keep current state

  5. Otherwise, give the green light to the direction with heavier traffic; if both directions have the same level, keep the current state
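
One way to read these rules is as an ordered series of condition-action checks over the two sensor readings. The sketch below is a standalone function rather than the required class method; the names `LEVEL` and `choose_light` are illustrative, and holding the current light when both directions are equally busy at 'light' or 'moderate' is one reasonable reading of rule 5, not a confirmed requirement.

```python
# Ordinal ranks for the four sensor readings (illustrative name).
LEVEL = {'none': 0, 'light': 1, 'moderate': 2, 'heavy': 3}


def choose_light(current_light, ns_traffic, ew_traffic):
    """Return 'NS_GREEN' or 'EW_GREEN' for a single percept (sketch only)."""
    ns, ew = LEVEL[ns_traffic], LEVEL[ew_traffic]

    # Rule 1: NS is green, EW is heavy, NS is 'none' or 'light'.
    if current_light == 'NS_GREEN' and ew == 3 and ns <= 1:
        return 'EW_GREEN'
    # Rule 2: mirror image of rule 1.
    if current_light == 'EW_GREEN' and ns == 3 and ew <= 1:
        return 'NS_GREEN'
    # Rules 3 and 4: both heavy or both empty, so hold the current light.
    if ns == ew and ns in (0, 3):
        return current_light
    # Rule 5: green goes to the busier direction.
    if ns > ew:
        return 'NS_GREEN'
    if ew > ns:
        return 'EW_GREEN'
    # Tie at 'light' or 'moderate': assumed to hold the current light.
    return current_light


print(choose_light('NS_GREEN', 'light', 'moderate'))
```

Because the function reads only its arguments and keeps no state between calls, the same percept always yields the same action, which is the defining property of a simple reflex agent.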

# Traffic Light Environment (PROVIDED - Do not modify)

class TrafficIntersection:
    """Simulates a 4-way traffic intersection."""
    
    TRAFFIC_LEVELS = ['none', 'light', 'moderate', 'heavy']
    
    def __init__(self):
        self.light_state = 'NS_GREEN'  # Initial state
        self.ns_traffic = 'moderate'
        self.ew_traffic = 'moderate'
        self.time_in_state = 0
        self.total_wait_time = 0
        self.cars_passed = 0
    
    def get_percept(self):
        """Returns current traffic conditions."""
        return {
            'current_light': self.light_state,
            'ns_traffic': self.ns_traffic,
            'ew_traffic': self.ew_traffic
        }
    
    def apply_action(self, action):
        """Apply agent's action to change light state."""
        if action in ['NS_GREEN', 'EW_GREEN']:
            if action != self.light_state:
                self.light_state = action
                self.time_in_state = 0
            else:
                self.time_in_state += 1
        
        # Simulate traffic flow
        self._simulate_traffic_flow()
    
    def _simulate_traffic_flow(self):
        """Simulate cars passing and new cars arriving."""
        # Cars pass through green light
        if self.light_state == 'NS_GREEN':
            ns_idx = self.TRAFFIC_LEVELS.index(self.ns_traffic)
            self.cars_passed += ns_idx
            # Reduce NS traffic, increase EW traffic slightly
            if ns_idx > 0 and random.random() < 0.3:
                self.ns_traffic = self.TRAFFIC_LEVELS[ns_idx - 1]
            ew_idx = self.TRAFFIC_LEVELS.index(self.ew_traffic)
            if ew_idx < 3 and random.random() < 0.4:
                self.ew_traffic = self.TRAFFIC_LEVELS[ew_idx + 1]
                self.total_wait_time += ew_idx + 1
        else:
            ew_idx = self.TRAFFIC_LEVELS.index(self.ew_traffic)
            self.cars_passed += ew_idx
            if ew_idx > 0 and random.random() < 0.3:
                self.ew_traffic = self.TRAFFIC_LEVELS[ew_idx - 1]
            ns_idx = self.TRAFFIC_LEVELS.index(self.ns_traffic)
            if ns_idx < 3 and random.random() < 0.4:
                self.ns_traffic = self.TRAFFIC_LEVELS[ns_idx + 1]
                self.total_wait_time += ns_idx + 1
        
        # Random traffic changes
        if random.random() < 0.2:
            self.ns_traffic = random.choice(self.TRAFFIC_LEVELS)
        if random.random() < 0.2:
            self.ew_traffic = random.choice(self.TRAFFIC_LEVELS)
    
    def get_performance(self):
        """Returns performance metrics."""
        return {
            'cars_passed': self.cars_passed,
            'total_wait_time': self.total_wait_time,
            'efficiency': self.cars_passed / max(1, self.total_wait_time)
        }

print("✓ TrafficIntersection class loaded!")

# TODO: Implement the Traffic Light Reflex Agent

class TrafficLightReflexAgent:
    """
    A simple reflex agent for controlling traffic lights.
    
    The agent decides which direction gets the green light based
    solely on the current traffic conditions (no memory of past states).
    """
    
    TRAFFIC_PRIORITY = {'none': 0, 'light': 1, 'moderate': 2, 'heavy': 3}
    
    def __init__(self):
        self.name = "Traffic Light Reflex Agent"
    
    def get_traffic_level(self, traffic):
        """Convert traffic string to numeric level."""
        return self.TRAFFIC_PRIORITY.get(traffic, 0)
    
    def decide(self, percept):
        """
        Decide which light state to set based on current percept.
        
        Args:
            percept: dict with keys 'current_light', 'ns_traffic', 'ew_traffic'
        
        Returns:
            'NS_GREEN' or 'EW_GREEN'
        """
        current_light = percept['current_light']
        ns_traffic = percept['ns_traffic']
        ew_traffic = percept['ew_traffic']
        
        ns_level = self.get_traffic_level(ns_traffic)
        ew_level = self.get_traffic_level(ew_traffic)
        
        # TODO: Implement the decision logic based on the rules above
        # Rule 1: If current is NS_GREEN and EW is heavy while NS is none/light -> EW_GREEN
        # Rule 2: If current is EW_GREEN and NS is heavy while EW is none/light -> NS_GREEN
        # Rule 3: If both heavy -> keep current
        # Rule 4: If both none -> keep current
        # Rule 5: Otherwise -> switch to heavier traffic direction
        
        # YOUR CODE HERE (replace 'pass' with your implementation)
        pass
        
        # Default: keep current state (remove this when you implement)
        return current_light

print("✓ TrafficLightReflexAgent class created (needs implementation)")

# Test your Traffic Light Reflex Agent

def test_traffic_agent():
    """Test the traffic light agent with various scenarios."""
    agent = TrafficLightReflexAgent()
    
    test_cases = [
        # (percept, expected_action, description)
        ({'current_light': 'NS_GREEN', 'ns_traffic': 'light', 'ew_traffic': 'heavy'}, 
         'EW_GREEN', "Heavy EW traffic, light NS -> should switch to EW"),
        
        ({'current_light': 'EW_GREEN', 'ns_traffic': 'heavy', 'ew_traffic': 'none'}, 
         'NS_GREEN', "Heavy NS traffic, no EW -> should switch to NS"),
        
        ({'current_light': 'NS_GREEN', 'ns_traffic': 'heavy', 'ew_traffic': 'heavy'}, 
         'NS_GREEN', "Both heavy -> should keep current (NS)"),
        
        ({'current_light': 'EW_GREEN', 'ns_traffic': 'none', 'ew_traffic': 'none'}, 
         'EW_GREEN', "Both none -> should keep current (EW)"),
        
        ({'current_light': 'NS_GREEN', 'ns_traffic': 'light', 'ew_traffic': 'moderate'}, 
         'EW_GREEN', "EW has more traffic -> should switch to EW"),
    ]
    
    passed = 0
    for percept, expected, description in test_cases:
        result = agent.decide(percept)
        status = "✓" if result == expected else "✗"
        if result == expected:
            passed += 1
        print(f"{status} {description}")
        print(f"   Expected: {expected}, Got: {result}")
    
    print(f"\n{'='*50}")
    print(f"Tests passed: {passed}/{len(test_cases)}")
    return passed == len(test_cases)

# Run tests
test_traffic_agent()

# Run simulation with your agent

def run_traffic_simulation(agent, steps=50):
    """Run a full simulation of the traffic intersection."""
    env = TrafficIntersection()
    
    print(f"Running {steps}-step simulation with {agent.name}...\n")
    
    for step in range(steps):
        percept = env.get_percept()
        action = agent.decide(percept)
        env.apply_action(action)
        
        if step % 10 == 0:
            print(f"Step {step}: Light={percept['current_light']}, "
                  f"NS={percept['ns_traffic']}, EW={percept['ew_traffic']} "
                  f"-> Action: {action}")
    
    perf = env.get_performance()
    print(f"\n{'='*50}")
    print("Final Performance:")
    print(f"  Cars passed: {perf['cars_passed']}")
    print(f"  Total wait time: {perf['total_wait_time']}")
    print(f"  Efficiency score: {perf['efficiency']:.2f}")
    return perf

# Run simulation
agent = TrafficLightReflexAgent()
performance = run_traffic_simulation(agent)

Assignment 2: Model-Based Agent - Security Patrol Agent (40 points)

Problem Description

You will implement a Model-Based Reflex Agent that patrols a building with multiple rooms. Unlike the reflex agent, this agent maintains an internal model to remember which rooms it has visited.

Environment

  • A building with 6 rooms arranged in a line (Room 0 to Room 5)

  • The agent can move: 'LEFT', 'RIGHT', or 'STAY'

  • Each room can have a security event: 'clear' or 'alert'

  • The agent earns 1 point for each step that ends in a clear room and 20 points for each alert it clears by entering the room
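
In the provided environment, the percept reports alerts only for the current room and the rooms directly next to it, so an alert two or more rooms away stays invisible until the agent moves closer. The helper below is a hypothetical illustration (`visible_rooms` is not part of the assignment code) that mirrors the range used by `get_percept` in the provided `BuildingEnvironment` class.

```python
def visible_rooms(position, num_rooms=6):
    """Rooms whose alert status the agent can observe from `position`."""
    start = max(0, position - 1)          # clamp at the left end of the building
    stop = min(num_rooms, position + 2)   # exclusive bound, clamped at the right end
    return list(range(start, stop))


for pos in (0, 2, 5):
    print(pos, visible_rooms(pos))
```

From an end room the agent sees two rooms, and from any interior room it sees three, which is why remembering past visits matters for the rooms it cannot currently observe.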

Agent Specification

Your model-based agent should:

  1. Track visit history: Remember when each room was last visited

  2. Prioritize unvisited rooms: Prefer rooms not checked recently

  3. Respond to alerts: Always move toward a visible alert (the agent sees alerts only in its current room and the adjacent rooms)

  4. Patrol efficiently: Don’t stay in one place too long

Key difference from reflex agent: The model-based agent remembers past visits to make better patrol decisions.
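
The requirements above need only a small amount of state. The sketch below is a standalone class under assumed names (`PatrolSketch`, `choose_move`); it is one possible design rather than a reference solution, and its tie-breaking choices (toward 'RIGHT' between equally stale rooms, toward the first listed alert) are assumptions.

```python
class PatrolSketch:
    """Illustrative model-based patrol logic (hypothetical, not the graded class)."""

    def __init__(self, num_rooms=6):
        self.num_rooms = num_rooms
        # Internal model: step at which each room was last visited.
        self.last_visit = {room: -100 for room in range(num_rooms)}
        self.step = 0

    def choose_move(self, current_room, visible_alerts):
        # Record the visit before deciding, then advance the clock.
        self.last_visit[current_room] = self.step
        self.step += 1

        # Priority 1: head for the nearest visible alert in another room.
        # If two alerts are equally near, the first one listed wins.
        others = [r for r in visible_alerts if r != current_room]
        if others:
            target = min(others, key=lambda r: abs(r - current_room))
            return 'LEFT' if target < current_room else 'RIGHT'

        # Priority 2: move to the adjacent room visited longest ago.
        neighbors = {}
        if current_room > 0:
            neighbors['LEFT'] = self.last_visit[current_room - 1]
        if current_room < self.num_rooms - 1:
            neighbors['RIGHT'] = self.last_visit[current_room + 1]
        if not neighbors:
            return 'STAY'  # single-room building: nowhere to go
        oldest = min(neighbors.values())
        # Ties go to 'RIGHT' (assumption).
        return 'RIGHT' if neighbors.get('RIGHT') == oldest else 'LEFT'


agent = PatrolSketch()
moves = [agent.choose_move(room, []) for room in (0, 1, 2)]
print(moves)
```

The `last_visit` dictionary is what separates this from a reflex agent: with identical percepts, the chosen direction can differ depending on which neighboring room was visited more recently.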

# Building Environment (PROVIDED - Do not modify)

class BuildingEnvironment:
    """Simulates a building with 6 rooms for security patrol."""
    
    def __init__(self):
        self.num_rooms = 6
        self.agent_position = 0  # Start in room 0
        self.room_status = ['clear'] * self.num_rooms  # All rooms start clear
        self.steps = 0
        self.score = 0
        self.alerts_handled = 0
        self.rooms_checked = set()
    
    def get_percept(self):
        """Returns what the agent can see from current position."""
        # Agent can see current room and adjacent rooms
        visible_alerts = []
        for i in range(max(0, self.agent_position - 1), 
                       min(self.num_rooms, self.agent_position + 2)):
            if self.room_status[i] == 'alert':
                visible_alerts.append(i)
        
        return {
            'current_room': self.agent_position,
            'current_status': self.room_status[self.agent_position],
            'visible_alerts': visible_alerts,
            'num_rooms': self.num_rooms
        }
    
    def apply_action(self, action):
        """Move the agent and update environment."""
        self.steps += 1
        
        # Move agent
        if action == 'LEFT' and self.agent_position > 0:
            self.agent_position -= 1
        elif action == 'RIGHT' and self.agent_position < self.num_rooms - 1:
            self.agent_position += 1
        # STAY keeps position
        
        # Check current room (agent presence clears alerts)
        self.rooms_checked.add(self.agent_position)
        if self.room_status[self.agent_position] == 'alert':
            self.room_status[self.agent_position] = 'clear'
            self.alerts_handled += 1
            self.score += 20  # Bonus for handling alert
        else:
            self.score += 1  # Small score for patrolling
        
        # Random events: new alerts may appear
        if random.random() < 0.15:  # 15% chance of new alert
            alert_room = random.randint(0, self.num_rooms - 1)
            if alert_room != self.agent_position:
                self.room_status[alert_room] = 'alert'
    
    def get_performance(self):
        """Returns performance metrics."""
        coverage = len(self.rooms_checked) / self.num_rooms
        return {
            'score': self.score,
            'alerts_handled': self.alerts_handled,
            'coverage': coverage,
            'steps': self.steps
        }

print("✓ BuildingEnvironment class loaded!")

# TODO: Implement the Security Patrol Model-Based Agent

class SecurityPatrolAgent:
    """
    A model-based agent for security patrol.
    
    This agent maintains an internal model of:
    - Last visit time for each room
    - Current step count (to track time)
    
    Unlike a reflex agent, this agent REMEMBERS which rooms
    it has visited to make better patrol decisions.
    """
    
    def __init__(self, num_rooms=6):
        self.name = "Security Patrol Model-Based Agent"
        self.num_rooms = num_rooms
        
        # Internal model: track when each room was last visited
        # Initialize all rooms as "never visited" (step -100)
        self.last_visit = {i: -100 for i in range(num_rooms)}
        self.current_step = 0
        
    def update_model(self, percept):
        """
        Update internal model based on new percept.
        
        Args:
            percept: dict with 'current_room', 'current_status', 'visible_alerts', 'num_rooms'
        """
        # TODO: Update the last_visit time for current room
        # Hint: Set last_visit[current_room] = current_step
        #       Then increment current_step
        
        # YOUR CODE HERE
        pass
    
    def get_least_recently_visited(self, current_room):
        """
        Find which adjacent room was visited least recently.
        
        Args:
            current_room: The room the agent is currently in
        
        Returns:
            'LEFT', 'RIGHT', or 'STAY'
        """
        # TODO: Compare last_visit times for adjacent rooms
        # Return direction toward the room visited longest ago
        # Hint: Check rooms current_room-1 (LEFT) and current_room+1 (RIGHT)
        
        # YOUR CODE HERE
        return 'RIGHT'  # Default, replace with your implementation
    
    def decide(self, percept):
        """
        Decide movement based on percept and internal model.
        
        Args:
            percept: dict with 'current_room', 'current_status', 'visible_alerts', 'num_rooms'
        
        Returns:
            'LEFT', 'RIGHT', or 'STAY'
        """
        # Step 1: Update internal model (record visit)
        self.update_model(percept)
        
        current_room = percept['current_room']
        visible_alerts = percept['visible_alerts']
        
        # TODO: Implement decision logic
        # Priority 1: If there's a visible alert, move toward it
        # Priority 2: Otherwise, move toward least recently visited room
        
        # Hint for alerts:
        # - If alert is to the LEFT (alert_room < current_room), return 'LEFT'
        # - If alert is to the RIGHT (alert_room > current_room), return 'RIGHT'
        # - If alert is in current room, it will be handled automatically
        
        # YOUR CODE HERE
        
        return 'STAY'  # Default, replace with your implementation

print("✓ SecurityPatrolAgent class created (needs implementation)")

# Test your Security Patrol Agent

def test_patrol_agent():
    """Test the patrol agent with various scenarios."""
    agent = SecurityPatrolAgent(num_rooms=6)
    
    test_cases = [
        # (percept, expected_action, description)
        ({'current_room': 2, 'current_status': 'clear', 'visible_alerts': [3], 'num_rooms': 6},
         'RIGHT', "Alert to the right -> should move RIGHT"),
        
        ({'current_room': 3, 'current_status': 'clear', 'visible_alerts': [2], 'num_rooms': 6},
         'LEFT', "Alert to the left -> should move LEFT"),
        
        ({'current_room': 0, 'current_status': 'clear', 'visible_alerts': [], 'num_rooms': 6},
         'RIGHT', "At left edge, no alerts -> should move RIGHT"),
        
        ({'current_room': 5, 'current_status': 'clear', 'visible_alerts': [], 'num_rooms': 6},
         'LEFT', "At right edge, no alerts -> should move LEFT"),
    ]
    
    passed = 0
    for percept, expected, description in test_cases:
        result = agent.decide(percept)
        is_correct = result == expected
        status = "✓" if is_correct else "✗"
        if is_correct:
            passed += 1
        print(f"{status} {description}")
        print(f"   Expected: {expected}, Got: {result}")
    
    print(f"\n{'='*50}")
    print(f"Tests passed: {passed}/{len(test_cases)}")
    return passed >= 3  # Pass if at least 3 tests pass

# Run tests
test_patrol_agent()

# Run patrol simulation with your agent

def run_patrol_simulation(agent, steps=30):
    """Run a patrol simulation."""
    env = BuildingEnvironment()
    
    print(f"Running {steps}-step simulation with {agent.name}...\n")
    print(f"{'Step':>4} | {'Room':>4} | {'Status':>8} | {'Alerts':>10} | {'Action':>6}")
    print("-" * 50)
    
    for step in range(steps):
        percept = env.get_percept()
        action = agent.decide(percept)
        
        alerts_str = str(percept['visible_alerts']) if percept['visible_alerts'] else "None"
        print(f"{step:>4} | {percept['current_room']:>4} | {percept['current_status']:>8} | "
              f"{alerts_str:>10} | {action:>6}")
        
        env.apply_action(action)
    
    perf = env.get_performance()
    print(f"\n{'='*50}")
    print("Final Performance:")
    print(f"  Total score: {perf['score']}")
    print(f"  Alerts handled: {perf['alerts_handled']}")
    print(f"  Room coverage: {perf['coverage']*100:.0f}%")
    return perf

# Run simulation
agent = SecurityPatrolAgent()
performance = run_patrol_simulation(agent)

Reflection Questions (20 points)

Answer the following questions in the cells below:

Question 1 (5 points)

What are the key limitations of the reflex agent (Traffic Light Controller) compared to the model-based agent (Security Patrol)? Give specific examples from your implementations.

Your answer here:

Question 2 (5 points)

How does maintaining an internal model (visit history) help the patrol agent make better decisions? What would happen if it only used current percepts like a reflex agent?

Your answer here:

Question 3 (5 points)

Could the traffic light controller benefit from being a model-based agent? What kind of internal model would you add, and how would it improve performance?

Your answer here:

Question 4 (5 points)

For each agent, identify the PEAS components (Performance measure, Environment, Actuators, Sensors):

Traffic Light Controller:

  • Performance:

  • Environment:

  • Actuators:

  • Sensors:

Security Patrol Agent:

  • Performance:

  • Environment:

  • Actuators:

  • Sensors:


Submission Checklist

Before submitting, make sure:

  • Assignment 1: TrafficLightReflexAgent.decide() is implemented

  • Assignment 1: All test cases pass

  • Assignment 2: SecurityPatrolAgent methods are implemented:

    • update_model()

    • get_least_recently_visited()

    • decide()

  • Assignment 2: At least 3 test cases pass

  • All 4 reflection questions are answered

  • All cells run without errors

Good luck! 🚀