Between truth and the search for it, I choose the second.
Bernard Berenson
2.1 Introduction¶
Search is one of the most fundamental problem-solving techniques in artificial intelligence. Many AI problems can be formulated as finding a sequence of actions that transforms an initial state into a goal state.
Core Concept¶
The goal of an agent is to:
Start from an initial state
Perform a sequence of actions
Reach a goal state (or a state with high desirability)
Two Main Scenarios¶
Goal-Oriented Search: Find a path from start to goal
Example: Solving a maze, solving a puzzle
Cost may be associated with paths
Optimization Search: Find state with minimal loss
Example: Eight-queens placement, scheduling
Cost is associated with states themselves
Applications¶
Why Search Matters¶
No Direct Formula: Many problems have no closed-form solution
Large Solution Space: Too many possibilities to enumerate
Need Systematic Methods: Avoid aimless wandering
Trade-offs: Different strategies balance time, memory, and solution quality
2.1.1 State Space as a Graph¶
Modeling the state space as a directed graph enables systematic search:
Key Concepts:
Node: Represents a state
Edge: Represents a transition (action) from one state to another
Path: Sequence of edges from initial state to goal
Example: Eight-Queens Problem¶
Problem: Place 8 queens on a chessboard so none attack each other.
State Space Design:
Approach 1 (Incremental):
State: Placement of k ≤ 8 queens (no attacks)
Action: Add one more queen
Initial State: Empty board (k=0)
Goal State: 8 queens placed (k=8)
Graph Type: Directed Acyclic Graph (DAG)
Approach 2 (Complete State):
State: Any placement of 8 queens (attacks allowed)
Action: Move one queen to different square
Initial State: Random placement
Goal State: No attacks
Graph Type: Undirected graph with cycles
State Space Size¶
| Problem | State Space Size |
|---|---|
| 8-Queens (any placement) | C(64, 8) ≈ 4.4 × 10⁹ |
| 8-Puzzle | 9! = 362,880 |
| Chess (valid positions) | > 10⁴³ |
| Rubik’s Cube | ≈ 4.3 × 10¹⁹ |
2.2 Uninformed Search Algorithms¶
Uninformed search (also called blind search) uses only information available in the problem definition. These algorithms don’t use problem-specific knowledge to guide the search.
Generic Search Framework¶
All uninformed search algorithms follow this template:
Algorithm: GenericSearch(Initial State s, Goal Condition G)
Input: Initial state s, Goal condition G
Output: Path from s to goal, or failure
begin
LIST ← {s} // Frontier nodes to explore
VISIT ← ∅ // Visited nodes
pred ← empty array // Predecessor tracking
repeat
if LIST is empty then
return failure
// Selection strategy differs by algorithm
Select node i from LIST
Delete i from LIST
Add i to VISIT
if i satisfies G then
return reconstruct_path(pred, s, i)
// Expand node i
for each node j adjacent to i do
if j not in VISIT and j not in LIST then
Add j to LIST
pred[j] ← i
until LIST is empty or goal found
end
Key Components:
LIST: Frontier nodes (boundary of explored region)
VISIT: Already explored nodes
pred: Predecessor array to reconstruct path
Selection Strategy: Determines which algorithm we get
2.2.1 Breadth-First Search (BFS)¶
BFS explores nodes level by level, visiting all nodes at depth d before any node at depth d+1.
Selection Strategy: FIFO queue (First-In-First-Out)
How it works:
Start with initial state in queue
Remove first node from queue
Check if it’s a goal
Add all unvisited neighbors to end of queue
Repeat
Properties:
| Property | Value | Explanation |
|---|---|---|
| Complete | Yes | Will find solution if one exists |
| Optimal | Yes* | Finds shallowest (shortest path) goal |
| Time Complexity | O(b^d) | Exponential in depth |
| Space Complexity | O(b^d) | Must store entire frontier |
*Only when all actions have equal cost
Where:
b: branching factor (average number of successors)
d: depth of shallowest goal
Advantages:
Guaranteed to find solution
Finds shortest path (minimum edges)
Disadvantages:
Memory intensive: Must store all nodes at current level
Slow for deep goals
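The FIFO-queue procedure above can be sketched in Python. The adjacency-dict representation and node names below are illustrative assumptions, not part of the text:

```python
from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search on an adjacency-dict graph.
    Returns the list of nodes on a shortest path, or None."""
    frontier = deque([start])          # FIFO queue (LIST)
    visited = {start}                  # VISIT
    pred = {start: None}               # predecessor tracking
    while frontier:
        node = frontier.popleft()      # take the oldest node first
        if node == goal:               # reconstruct path via pred
            path = []
            while node is not None:
                path.append(node)
                node = pred[node]
            return path[::-1]
        for nbr in graph.get(node, []):
            if nbr not in visited:
                visited.add(nbr)
                pred[nbr] = node
                frontier.append(nbr)
    return None
```

Because nodes enter the queue level by level, the first time the goal is dequeued its path uses the minimum number of edges.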
Example: Finding path from A to G
Step 1: Queue = [A] Visit A
Step 2: Queue = [B, C] Visit B, add children
Step 3: Queue = [C, D, E] Visit C, add children
Step 4: Queue = [D, E, F] Visit D
...
2.2.2 Depth-First Search (DFS)¶
DFS explores as deep as possible along each branch before backtracking.
Selection Strategy: LIFO stack (Last-In-First-Out)
How it works:
Start with initial state on stack
Pop top node from stack
Check if it’s a goal
Push all unvisited neighbors onto stack
Repeat
Properties:
| Property | Value | Explanation |
|---|---|---|
| Complete | No | Can get stuck in infinite paths |
| Optimal | No | May find longer paths first |
| Time Complexity | O(b^m) | Can explore entire tree |
| Space Complexity | O(bm) | Only stores single path |
Where:
m: maximum depth of any node
Advantages:
Memory efficient: Only stores nodes on current path
Fast if solutions are deep
Disadvantages:
Can get stuck in infinite loops (needs cycle detection)
May find suboptimal solutions
Not complete in infinite spaces
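Swapping the queue for a stack gives an iterative DFS sketch; the visited set provides the cycle detection noted above. The graph representation is the same illustrative assumption as before:

```python
def dfs(graph, start, goal):
    """Iterative depth-first search. Returns a path (not necessarily
    shortest) or None. The visited set prevents infinite loops."""
    stack = [start]                    # LIFO stack
    visited = set()
    pred = {start: None}
    while stack:
        node = stack.pop()             # take the newest node first
        if node in visited:
            continue
        visited.add(node)
        if node == goal:               # reconstruct path via pred
            path = []
            while node is not None:
                path.append(node)
                node = pred[node]
            return path[::-1]
        for nbr in graph.get(node, []):
            if nbr not in visited:
                pred[nbr] = node
                stack.append(nbr)
    return None
```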
Example: Finding path from A to G
Step 1: Stack = [A] Visit A
Step 2: Stack = [B, C] Visit C (top of stack)
Step 3: Stack = [B, F] Visit F
Step 4: Stack = [B] Dead end, backtrack
...
Comparison: BFS vs DFS¶
| Aspect | BFS | DFS |
|---|---|---|
| Frontier Shape | Wide, shallow | Deep, narrow |
| Memory | O(b^d) - HIGH | O(bm) - LOW |
| Optimal | Yes (equal costs) | No |
| Complete | Yes | No (with cycles) |
| Best For | Shallow goals | Deep solutions, memory limited |
2.2.3 Uniform-Cost Search (UCS)¶
UCS generalizes BFS to handle weighted graphs where actions have different costs.
Selection Strategy: Priority queue ordered by path cost g(n)
Key Idea: Always expand the node with the lowest path cost from start.
Path Cost: g(n) = sum of edge costs from start to node n
How it works:
Maintain priority queue of nodes, ordered by g(n)
Always expand node with smallest g(n)
Update costs when shorter paths are found
Properties:
| Property | Value |
|---|---|
| Complete | Yes |
| Optimal | Yes |
| Time Complexity | O(b^(C*/ε)) |
| Space Complexity | O(b^(C*/ε)) |
Where:
C*: cost of optimal solution
ε: minimum edge cost
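A priority-queue sketch of UCS in Python, using a weighted adjacency dict (an assumed representation); stale queue entries are skipped instead of being updated in place:

```python
import heapq

def ucs(graph, start, goal):
    """Uniform-cost search on a weighted graph given as
    {node: [(neighbor, cost), ...]}. Returns (cost, path) or None."""
    frontier = [(0, start, [start])]   # priority queue ordered by g(n)
    best_g = {start: 0}
    while frontier:
        g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float('inf')):
            continue                   # stale entry; a cheaper path exists
        for nbr, cost in graph.get(node, []):
            new_g = g + cost
            if new_g < best_g.get(nbr, float('inf')):
                best_g[nbr] = new_g
                heapq.heappush(frontier, (new_g, nbr, path + [nbr]))
    return None
```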
Example:
Graph with costs:
S --1--> A --2--> G (total: 3)
S --2--> B --1--> G (total: 3)
S --4--> G (total: 4)
UCS explores in order:
1. S (g=0)
2. A (g=1)
3. B (g=2)
4. G (g=3) ← Found via A or B
2.2.4 Iterative Deepening DFS (IDDFS)¶
IDDFS combines the space efficiency of DFS with the optimality of BFS.
Key Idea: Run DFS with increasing depth limits.
Algorithm:
for depth_limit = 0, 1, 2, ... do
result = DFS with depth limit
if result found then return result
How it works:
Try DFS with depth limit 0
If no solution, try depth limit 1
Keep increasing limit until solution found
Properties:
| Property | Value |
|---|---|
| Complete | Yes |
| Optimal | Yes (equal costs) |
| Time Complexity | O(b^d) |
| Space Complexity | O(bd) |
Why it’s efficient:
Most work is at the deepest level
Overhead from revisiting shallow nodes is small
Gets BFS’s optimality with DFS’s memory
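The loop of depth-limited searches can be sketched as a recursive depth-limited DFS wrapped in an increasing-limit loop; the `max_depth` cap and graph format are illustrative assumptions:

```python
def iddfs(graph, start, goal, max_depth=20):
    """Iterative deepening DFS: repeated depth-limited DFS with
    increasing limits. Returns a shallowest path or None."""
    def dls(node, limit, path):
        if node == goal:
            return path
        if limit == 0:
            return None                # depth limit reached
        for nbr in graph.get(node, []):
            if nbr not in path:        # avoid cycles on the current path
                result = dls(nbr, limit - 1, path + [nbr])
                if result is not None:
                    return result
        return None

    for depth_limit in range(max_depth + 1):
        result = dls(start, depth_limit, [start])
        if result is not None:
            return result
    return None
```

Because limits grow one at a time, the first solution found is at the shallowest depth, matching BFS's optimality for equal-cost actions.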
Example: Finding goal at depth 2
Depth limit 0: Check root only
Depth limit 1: Check root and its children
Depth limit 2: Check up to depth 2 → Goal found!
2.2.5 Bidirectional Search¶
Run two simultaneous searches: one forward from start, one backward from goal. Stop when they meet.
Key Idea: Two small searches are faster than one large search.
Complexity Improvement:
Normal BFS: O(b^d)
Bidirectional: O(2 × b^(d/2)) = O(b^(d/2))
Example: If b=10, d=6:
Normal: 10^6 = 1,000,000 nodes
Bidirectional: 2 × 10^3 = 2,000 nodes
Requirements:
Must be able to generate predecessors
Must be able to check if nodes from both searches match
Challenges:
Requires goal state to be explicitly known
Need to detect when searches meet
More complex implementation
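A minimal bidirectional BFS sketch for undirected graphs, where successors double as predecessors (so the predecessor-generation requirement is trivially met). It alternates expanding one level from the smaller of the two searches and returns the path length in edges:

```python
from collections import deque

def bidirectional_bfs(graph, start, goal):
    """Bidirectional BFS on an undirected adjacency-dict graph.
    Returns the shortest path length in edges, or None."""
    if start == goal:
        return 0
    dist_f, dist_b = {start: 0}, {goal: 0}
    frontier_f, frontier_b = deque([start]), deque([goal])
    while frontier_f and frontier_b:
        # Expand the current forward frontier by one full level
        for _ in range(len(frontier_f)):
            node = frontier_f.popleft()
            for nbr in graph.get(node, []):
                if nbr in dist_b:                  # the two searches meet
                    return dist_f[node] + 1 + dist_b[nbr]
                if nbr not in dist_f:
                    dist_f[nbr] = dist_f[node] + 1
                    frontier_f.append(nbr)
        # Alternate sides: swap the roles of the two searches
        frontier_f, frontier_b = frontier_b, frontier_f
        dist_f, dist_b = dist_b, dist_f
    return None
```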
Summary: Uninformed Search Algorithms¶
| Algorithm | Complete | Optimal | Time | Space | When to Use |
|---|---|---|---|---|---|
| BFS | Yes | Yes* | O(b^d) | O(b^d) | Shallow goals, all costs equal |
| DFS | No | No | O(b^m) | O(bm) | Memory limited, deep solutions |
| UCS | Yes | Yes | O(b^(C*/ε)) | O(b^(C*/ε)) | Weighted graphs |
| IDDFS | Yes | Yes* | O(b^d) | O(bd) | Unknown depth, memory limited |
| Bidirectional | Yes | Yes* | O(b^(d/2)) | O(b^(d/2)) | Know both start and goal |
*Optimal only for equal-cost actions
2.3 Informed Search Algorithms¶
Informed search uses problem-specific knowledge through heuristic functions to guide search more efficiently toward goals.
Heuristic Function h(n)¶
Definition: Estimated cost from node n to the nearest goal.
Example (8-puzzle):
h(n) = number of misplaced tiles
h(n) = sum of Manhattan distances of tiles
Key Properties:
Admissible: Never overestimates true cost
h(n) ≤ h*(n) where h*(n) is true cost to goalConsistent (Monotonic): Satisfies triangle inequality
h(n) ≤ c(n,n’) + h(n’) for every edge (n,n’)
Evaluation Functions¶
g(n): Actual cost from start to n
h(n): Estimated cost from n to goal
f(n): Total estimated cost of solution through n
2.3.1 Greedy Best-First Search¶
Evaluation Function: f(n) = h(n)
Strategy: Expand node that appears closest to goal.
How it works:
Maintain priority queue ordered by h(n)
Always expand node with smallest h(n)
Ignore cost to reach current node
Properties:
| Property | Value |
|---|---|
| Complete | No |
| Optimal | No |
| Time | O(b^m) |
| Space | O(b^m) |
Advantages:
Often finds solutions quickly
Simple to implement
Disadvantages:
Can be misled by heuristic
Not guaranteed to find optimal solution
Can get stuck in loops
Example: Romania routing
Goal: Bucharest
h(Arad) = 366
h(Sibiu) = 253
h(Fagaras) = 176
Greedy always picks node with smallest h, even if path is longer!
2.3.2 A* Search¶
A* is the most widely used informed search algorithm, combining actual and estimated costs.
Evaluation Function: f(n) = g(n) + h(n)
Where:
g(n): Actual cost from start to n
h(n): Estimated cost from n to goal
f(n): Estimated total cost of solution through n
How it works:
Maintain priority queue ordered by f(n) = g(n) + h(n)
Always expand node with smallest f(n)
Balance between:
Nodes close to start (small g)
Nodes close to goal (small h)
Properties (when h is admissible):
| Property | Value |
|---|---|
| Complete | Yes |
| Optimal | Yes |
| Time | Exponential |
| Space | O(b^d) |
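A* is UCS with the priority changed from g(n) to f(n) = g(n) + h(n). A sketch in Python, where the weighted adjacency dict and the heuristic callable are assumed representations:

```python
import heapq

def astar(graph, start, goal, h):
    """A* search: graph is {node: [(neighbor, cost), ...]}, h maps a
    node to its heuristic estimate. Returns (cost, path) or None."""
    frontier = [(h(start), 0, start, [start])]  # ordered by f = g + h
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float('inf')):
            continue                    # stale entry
        for nbr, cost in graph.get(node, []):
            new_g = g + cost
            if new_g < best_g.get(nbr, float('inf')):
                best_g[nbr] = new_g
                heapq.heappush(frontier,
                               (new_g + h(nbr), new_g, nbr, path + [nbr]))
    return None
```

With the trivially admissible heuristic h(n) = 0, this degenerates exactly to uniform-cost search.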
Why A* is Optimal¶
Theorem: If h(n) is admissible, A* finds the optimal solution.
Proof Sketch:
Suppose A* returns suboptimal goal G₂ with cost C₂
Let G be optimal goal with cost C* < C₂
Some unexpanded node n must be on optimal path to G
Since h is admissible:
f(n) = g(n) + h(n) ≤ g(n) + h*(n) = C*
Therefore: f(n) ≤ C* < C₂ = f(G₂)
But A* expands nodes in order of f-value
So n would be expanded before G₂
Contradiction! ∎
2.3.3 Designing Heuristics¶
Good heuristics make search efficient. Here are examples for common problems.
Example 1: 8-Puzzle¶
Problem: Slide tiles to reach goal configuration.
Heuristic 1: Misplaced Tiles
Admissible? Yes (each misplaced tile requires ≥1 move)
Quality: Moderate
Heuristic 2: Manhattan Distance
Admissible? Yes (tiles can’t jump, need ≥ Manhattan distance moves)
Quality: Better than h₁
Example State:
Current: Goal:
1 2 3 1 2 3
4 5 6 4 5 6
7 8 7 8
h₁ = 1 (tile 8 misplaced)
h₂ = 1 (tile 8 needs 1 move)
Empirical Results:
| Heuristic | Avg Nodes Expanded |
|---|---|
| None (UCS) | ~170,000 |
| Misplaced (h₁) | ~500 |
| Manhattan (h₂) | ~50 |
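Both heuristics are a few lines of Python for 3×3 states encoded as 9-tuples read row by row, with 0 for the blank (an assumed encoding):

```python
def misplaced_tiles(state, goal):
    """h1: count tiles (excluding the blank) that are out of place."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def manhattan(state, goal):
    """h2: sum over tiles of |row - goal_row| + |col - goal_col|."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue                    # the blank does not count
        goal_idx = goal.index(tile)
        total += abs(idx // 3 - goal_idx // 3) + abs(idx % 3 - goal_idx % 3)
    return total
```

On the example state above (tile 8 one square from home), both return 1, matching h₁ = h₂ = 1.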
Techniques for Creating Heuristics¶
1. Relaxed Problem
Remove constraints from original problem:
8-puzzle: Allow tiles to move anywhere → Manhattan distance
Route planning: Ignore roads → straight-line distance
2. Pattern Databases
Precompute optimal costs for subproblems:
8-puzzle: Store costs for positioning each tile
Rubik’s cube: Store costs for corner positions
3. Learning
Train a function to estimate costs:
Use neural network
Learn from experience
Properties of Good Heuristics¶
✓ Admissible: Guarantees optimality
✓ Consistent: Enables efficient search
✓ Accurate: Close to true cost (but never over)
✓ Fast to compute: Overhead shouldn’t exceed savings
2.4 Local Search Algorithms¶
Local search operates on complete state configurations, making small modifications to improve an objective function. Unlike path-finding algorithms, local search:
Doesn’t track paths
Only cares about final state quality
Uses state-centric loss functions
Applications:
Scheduling problems
Circuit design
Eight-queens (complete-state approach)
Traveling salesperson problem
2.4.1 Hill Climbing¶
Hill climbing is greedy local search that always moves to the best neighboring state.
Algorithm:
Algorithm: HillClimbing(Initial State s, Loss Function L)
begin
current ← s
loop
neighbors ← generate_neighbors(current)
if neighbors is empty then
return current
next ← neighbor with minimum L(neighbor)
if L(next) ≥ L(current) then
return current // Local optimum
current ← next
end
Example: 8-Queens
State: 8 queens on board (may attack)
Neighbor: Move one queen to different square in same column
Loss: Number of pairs of queens attacking each other
Goal: Loss = 0
Problems with Hill Climbing:
Local Optima: Every neighbor has higher loss, but the state is not globally optimal
Plateaus: Flat regions where all neighbors have the same loss
Ridges: Sequences of local optima that are difficult to navigate
Solutions:
Random restart: Try multiple random initial states
Stochastic hill climbing: Choose randomly among uphill moves
First-choice: Take first improvement found
Random walk: Sometimes take random steps
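A minimal steepest-descent hill climber for 8-queens, following the state encoding and loss above (state[i] is the row of the queen in column i); it stops at the first local optimum rather than applying the random-restart remedies:

```python
import random

def attacks(state):
    """Loss: number of pairs of queens attacking each other."""
    n = len(state)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if state[i] == state[j]            # same row
               or abs(state[i] - state[j]) == j - i)  # same diagonal

def hill_climb(n=8, seed=None):
    """Steepest-descent hill climbing for n-queens. Returns the final
    state and its loss (0 means solved; may be a local optimum)."""
    rng = random.Random(seed)
    current = [rng.randrange(n) for _ in range(n)]
    while True:
        # Neighbors: move one queen to another row in its column
        neighbors = [current[:i] + [r] + current[i+1:]
                     for i in range(n) for r in range(n) if r != current[i]]
        best = min(neighbors, key=attacks)
        if attacks(best) >= attacks(current):
            return current, attacks(current)   # local optimum reached
        current = best
```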
2.4.2 Simulated Annealing¶
Simulated annealing allows occasional uphill moves to escape local optima, controlled by a temperature parameter.
Inspired by: Metallurgy - slowly cooling metal to minimize defects
Key Idea: Accept bad moves with probability that decreases over time.
Algorithm:
Algorithm: SimulatedAnnealing(Initial State s, Loss L, Schedule T(t))
begin
current ← s
for t = 1 to ∞ do
T ← T(t) // Temperature decreases with time
if T = 0 then return current
next ← random neighbor of current
ΔL ← L(next) - L(current)
if ΔL < 0 then
current ← next // Always accept improvement
else
// Accept with probability
with probability e^(-ΔL/T) do
current ← next
end
Acceptance Probability: a worsening move (ΔL > 0) is accepted with probability e^(−ΔL/T); improvements are always accepted.
Temperature Schedule:
High T: Accept bad moves frequently (explore widely)
Low T: Accept bad moves rarely (exploit current region)
T → 0: Becomes hill climbing
Common Schedules:
Linear: T(t) = T₀ - αt
Exponential: T(t) = T₀ × γᵗ (γ < 1)
Logarithmic: T(t) = T₀ / log(t)
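The pseudocode above, with the exponential schedule T(t) = T₀·γᵗ, can be sketched generically in Python; the parameter defaults are illustrative assumptions:

```python
import math
import random

def simulated_annealing(initial, loss, neighbor, t0=10.0, gamma=0.95,
                        steps=2000, seed=None):
    """Generic simulated annealing with exponential schedule
    T(t) = t0 * gamma**t. `neighbor(state, rng)` returns a random
    neighbor. Returns the best state seen."""
    rng = random.Random(seed)
    current = best = initial
    for t in range(steps):
        T = t0 * gamma ** t             # temperature decreases with time
        if T < 1e-9:
            break                       # effectively frozen
        nxt = neighbor(current, rng)
        delta = loss(nxt) - loss(current)
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / T)
        if delta < 0 or rng.random() < math.exp(-delta / T):
            current = nxt
        if loss(current) < loss(best):
            best = current
    return best
```

For example, minimizing L(x) = x² over the integers with neighbors x ± 1 drifts toward 0 as the temperature falls.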
Properties:
✓ Can escape local optima
✓ Guaranteed to find global optimum (with slow enough cooling)
✓ Simple to implement
✗ Requires tuning temperature schedule
✗ May be slow to converge
2.4.3 Tabu Search¶
Tabu search maintains a tabu list of recently visited states to avoid cycles.
Key Idea: Remember recent moves and forbid returning to them.
Algorithm:
Algorithm: TabuSearch(Initial State s, Loss L, Tabu Size k)
begin
current ← s
best ← s
tabu_list ← empty list of size k
loop
neighbors ← generate_neighbors(current)
// Exclude tabu neighbors (unless aspiration criteria met)
allowed ← neighbors not in tabu_list
if allowed is empty then break
next ← best neighbor from allowed
current ← next
// Update tabu list (circular buffer)
add current to tabu_list
if L(current) < L(best) then
best ← current
return best
end
Features:
Tabu List: Fixed-size memory of recent moves
Aspiration Criteria: Override tabu if solution is best so far
Diversification: Periodically jump to unexplored regions
Intensification: Focus search on promising regions
Advantages:
Escapes local optima by forbidding backtracking
Often finds better solutions than hill climbing
Balances exploration and exploitation
Disadvantages:
Requires tuning tabu list size
More complex than simpler methods
Comparison: Local Search Methods¶
| Method | Escape Local Optima? | Memory | Best For |
|---|---|---|---|
| Hill Climbing | No | None | Simple problems, quick solutions |
| Random Restart | Yes | None | Many local optima |
| Simulated Annealing | Yes (probabilistic) | None | Large state spaces |
| Tabu Search | Yes (deterministic) | Tabu list | Avoiding cycles, structured problems |
| Genetic Algorithm | Yes | Population | Complex optimization |
2.5 Genetic Algorithms¶
Genetic algorithms apply evolutionary principles to search and optimization.
Core Concepts¶
Chromosome: Solution encoded as string
Population: Set of candidate solutions
Fitness: Quality measure of each solution
Selection: Choose parents based on fitness
Crossover: Combine parents to create offspring
Mutation: Random changes for diversity
Algorithm:
Algorithm: GeneticAlgorithm(Population Size N, Generations T)
begin
population ← random initialization of N chromosomes
for generation = 1 to T do
// Evaluate
for each chromosome in population do
fitness[chromosome] ← evaluate(chromosome)
// Create new population
new_population ← empty
for i = 1 to N/2 do
// Selection (biased by fitness)
parent1 ← select(population, fitness)
parent2 ← select(population, fitness)
// Crossover
child1, child2 ← crossover(parent1, parent2)
// Mutation
child1 ← mutate(child1)
child2 ← mutate(child2)
add child1, child2 to new_population
population ← new_population
return best chromosome from population
end
Example: TSP with GAs
Chromosome: Permutation of cities [3,1,4,2,5]
Fitness: Negative of tour length
Crossover: Order crossover (preserve relative order)
Mutation: Swap two cities
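As a runnable illustration of the selection–crossover–mutation loop, here is a minimal GA on a toy bit-string problem (OneMax: maximize the number of 1 bits) rather than TSP, since plain one-point crossover is valid on bit strings but would break a city permutation; all parameter values are illustrative assumptions:

```python
import random

def genetic_algorithm(fitness, length, pop_size=40, generations=60,
                      mutation_rate=0.02, seed=None):
    """Minimal GA over bit-string chromosomes: tournament selection,
    one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randrange(2) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        def select():
            # Tournament selection: best of 3 random individuals
            return max(rng.sample(pop, 3), key=fitness)
        new_pop = []
        for _ in range(pop_size // 2):
            p1, p2 = select(), select()
            cut = rng.randrange(1, length)       # one-point crossover
            c1 = p1[:cut] + p2[cut:]
            c2 = p2[:cut] + p1[cut:]
            for child in (c1, c2):               # bit-flip mutation
                for i in range(length):
                    if rng.random() < mutation_rate:
                        child[i] ^= 1
                new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)

# OneMax: fitness is simply the number of 1 bits
best = genetic_algorithm(sum, 20, seed=1)
```

For TSP, the crossover would be replaced by an order-preserving operator and the mutation by a city swap, as described above.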
2.6 Constraint Satisfaction Problems¶
Constraint Satisfaction Problem (CSP) is a special class of search problems defined by:
Variables: X₁, X₂, ..., Xₙ
Domains: D₁, D₂, ..., Dₙ (possible values for each variable)
Constraints: Relations that must be satisfied
Examples:
Sudoku: Fill grid so constraints satisfied
Map coloring: Color regions so neighbors differ
Scheduling: Assign times without conflicts
Backtracking Search for CSPs¶
Standard approach: Depth-first search with constraint checking.
Algorithm:
Algorithm: BacktrackingSearch(CSP, assignment)
begin
if assignment is complete then
return assignment
var ← select unassigned variable
for each value in domain[var] do
if value is consistent with assignment then
add {var = value} to assignment
result ← BacktrackingSearch(CSP, assignment)
if result ≠ failure then
return result
remove {var = value} from assignment
return failure
end
Inference: Forward Checking¶
After assigning a variable, remove inconsistent values from domains of unassigned variables.
Example: Map Coloring
Assign WA=red
→ Remove red from NT, SA (neighbors of WA)
If any domain becomes empty → backtrack immediately!
Heuristics¶
Variable Ordering:
MRV (Minimum Remaining Values): Choose variable with fewest legal values
Degree Heuristic: Choose variable involved in most constraints
Value Ordering:
LCV (Least Constraining Value): Choose value that rules out fewest choices for neighbors
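Backtracking, forward checking, and MRV fit together in a compact sketch for binary "neighbors must differ" CSPs such as map coloring; the representation (variable list, domain dict, neighbor dict) is an assumption for illustration:

```python
def backtracking_search(variables, domains, neighbors):
    """Backtracking for binary-inequality CSPs (e.g. map coloring):
    adjacent variables must take different values. Uses MRV variable
    ordering and forward checking."""
    def solve(assignment, domains):
        if len(assignment) == len(variables):
            return assignment
        # MRV: unassigned variable with the fewest remaining values
        var = min((v for v in variables if v not in assignment),
                  key=lambda v: len(domains[v]))
        for value in domains[var]:
            # Forward checking: prune value from unassigned neighbors
            pruned = {n: [x for x in domains[n] if x != value]
                      for n in neighbors[var] if n not in assignment}
            if all(pruned[n] for n in pruned):   # no domain emptied
                result = solve({**assignment, var: value},
                               {**domains, **pruned})
                if result is not None:
                    return result
            # else: an empty domain → this value fails, try the next
        return None
    return solve({}, domains)
```

Because every assignment forward-checks its unassigned neighbors, remaining domains stay consistent with the partial assignment, so no separate consistency test is needed when iterating `domains[var]`.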
Programming Tasks¶
Complete implementations with visualizations are provided in Chapter 2: Search Algorithms - Implementation.
Task 1: Implement BFS and DFS (Easy)¶
Implement both algorithms and compare on a maze problem:
Number of nodes expanded
Path length found
Memory usage
Task 2: A* with Custom Heuristic (Medium)¶
Implement A* for 8-puzzle with:
Manhattan distance heuristic
Misplaced tiles heuristic
Compare performance
Task 3: Hill Climbing vs Simulated Annealing (Medium)¶
Solve 8-queens problem with both methods:
Measure success rate
Compare number of moves
Visualize search trajectory
Task 4: Genetic Algorithm for TSP (Hard)¶
Implement GA for Traveling Salesperson:
Design appropriate crossover operator
Tune mutation rate
Compare with hill climbing
Task 5: CSP Solver (Hard)¶
Build a Sudoku solver using:
Backtracking with forward checking
MRV and LCV heuristics
Measure speedup from heuristics
Summary¶
This chapter covered search algorithms for AI problems:
Key Concepts¶
State Space Representation:
Model problems as graphs
States as nodes, actions as edges
Uninformed Search:
BFS: Complete, optimal (equal costs), high memory
DFS: Memory efficient, not complete/optimal
UCS: Handles weighted graphs, optimal
IDDFS: Best of both worlds
Informed Search:
Heuristics guide search
A*: Optimal with admissible heuristics
Good heuristics = dramatic speedup
Local Search:
Hill climbing: Fast but gets stuck
Simulated annealing: Escapes local optima
Tabu search: Uses memory to avoid cycles
Genetic Algorithms:
Population-based optimization
Selection, crossover, mutation
Constraint Satisfaction:
Special structure enables efficient solving
Backtracking with inference
Algorithm Selection Guide¶
| Problem Type | Recommended Algorithm |
|---|---|
| Shortest path, unweighted | BFS |
| Shortest path, weighted | A* or UCS |
| Memory limited | IDDFS or DFS |
| Optimization, no path needed | Local search |
| Many local optima | Simulated annealing or GA |
| Constraint satisfaction | Backtracking with inference |
Looking Ahead¶
Next chapter: Multi-agent search - Game playing with adversarial opponents, minimax, alpha-beta pruning, Monte Carlo tree search.
Further Reading¶
Textbooks¶
Aggarwal, C. C. (2021). Artificial Intelligence: A Textbook. Springer. [Chapter 2]
Russell, S., & Norvig, P. (2020). Artificial Intelligence: A Modern Approach (4th ed.). [Chapters 3-4]
Classic Papers¶
Hart, P. E., et al. (1968). A Formal Basis for the Heuristic Determination of Minimum Cost Paths. IEEE Transactions on Systems Science and Cybernetics.
Kirkpatrick, S., et al. (1983). Optimization by Simulated Annealing. Science, 220(4598).
Online Resources¶
Red Blob Games - Pathfinding - Interactive A* tutorial
PathFinding.js Visual - Algorithm visualizations
Implementations¶
See Chapter 2: Search Algorithms - Implementation for complete Python implementations with examples and visualizations.