Unit 2
Adversarial search is widely used in game-playing, strategic planning, and competitive decision-
making, where the success of one agent often comes at the expense of another. This discussion
explains the core ideas, major algorithms, and real-world applications of adversarial search in a clear
and structured manner.
In a zero-sum game, a gain for one player results in an equal loss for the other.
In such environments, each agent must consider not only its own actions but also the possible
responses of the opponent. The goal of adversarial search is to determine an optimal strategy by
examining all feasible moves and counter-moves.
In Artificial Intelligence, adversarial search plays a crucial role in enabling agents to make rational
and optimal decisions while assuming that the opponent is also acting intelligently and strategically.
Each agent attempts to maximize its own utility or minimize its loss.
The action of one agent directly affects the outcome for other agents.
Strategic uncertainty exists because agents may not fully know the opponent’s future actions
or strategies.
1. Game Playing
Adversarial search is extensively used in classic and modern games such as Chess, Go, Checkers, Tic-Tac-Toe, and Poker. These games provide a structured environment in which AI agents use adversarial search to evaluate future game states and select moves that lead to the best possible outcome against an optimal opponent.
2. Decision-Making
Adversarial search also applies to competitive decision-making beyond games. This makes it valuable not only in games but also in economic models, security systems, and strategic planning problems.
Traditional search algorithms such as Depth-First Search (DFS), Breadth-First Search (BFS), and A*
are well-suited for single-agent problems, where the environment does not actively oppose the
agent’s goal.
However, in zero-sum games and competitive environments, these algorithms are insufficient, because the outcome depends not only on the agent's own moves but also on the moves of an opponent actively working against it.
To handle such situations, specialized adversarial search algorithms are used, primarily:
Minimax Algorithm
Alpha–Beta Pruning
Minimax Algorithm
The Minimax algorithm is the fundamental adversarial search technique used in two-player games.
One player is called MAX, who tries to maximize the utility value.
The opponent is called MIN, who tries to minimize the utility value.
MAX selects the move with the highest minimum payoff, while MIN selects the move with
the lowest maximum loss.
Minimax systematically explores the game tree and ensures optimal decision-making, but it can be
computationally expensive for large games.
Alpha–Beta Pruning
Alpha–Beta pruning produces the same result as minimax while significantly reducing the number of
nodes evaluated, making it practical for complex games like chess.
Adversarial search applies to real-world, turn-based games, which can be modeled formally: legal moves are actions that transfer the game from one state to another, and a game ends at terminal states, each with a numerical utility value.
In a normal search problem, you look for the best sequence of moves to reach a goal. But in an adversarial (competitive) setting, MAX must consider every possible reply by MIN, then MAX's move after every possible counter-reply by MIN, and so on.
This ensures that the strategy performs at least as well as any other even when the opponent plays perfectly.
The minimax rule chooses the action for MAX that ensures the best worst-case outcome.
Initial call:
Minimax(node, 3, true)
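The call above presupposes a recursive definition of minimax. A minimal Python sketch is given below; the toy game tree and the children/utility helpers are illustrative stand-ins, not part of the original material.

```python
def minimax(node, depth, maximizing, children, utility):
    # Depth-limited minimax: MAX maximizes, MIN minimizes the utility value.
    succ = children(node)
    if depth == 0 or not succ:
        return utility(node)
    if maximizing:
        return max(minimax(c, depth - 1, False, children, utility) for c in succ)
    return min(minimax(c, depth - 1, True, children, utility) for c in succ)

# Illustrative toy tree: MAX moves at A, MIN replies at B and C.
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
leaf_values = {"D": 3, "E": 5, "F": 2, "G": 9}
children = lambda n: tree.get(n, [])
utility = lambda n: leaf_values.get(n, 0)
```

Here `minimax("A", 3, True, children, utility)` backs up min(3, 5) = 3 from B, min(2, 9) = 2 from C, and returns max(3, 2) = 3 at the root.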
Step 1: In the first step, the algorithm generates the entire game tree and applies the utility function to obtain the utility values for the terminal states. In the tree diagram below, let A be the initial state of the tree. Suppose the maximizer takes the first turn, with worst-case initial value -∞, and the minimizer takes the next turn, with worst-case initial value +∞.
Step 2: Now we find the utility values for the Maximizer. Its initial value is -∞, so each terminal-state value is compared with this initial value to determine the higher node values; the Maximizer takes the maximum among them all.
Step 3: It is then the Minimizer's turn. Its initial value +∞ is compared with the node values below it, and the minimum of each set of children is propagated upward.
Step 4: Now it is the Maximizer's turn again, and it chooses the maximum of all node values to determine the value of the root node. In this game tree there are only 4 layers, so we reach the root node immediately, but in real games there will be many more layers.
That was the complete workflow of the minimax two-player game.
The main drawback of the minimax algorithm is that it becomes very slow for complex games such as Chess and Go. Such games have a huge branching factor, and the player has many choices to consider. This limitation of the minimax algorithm can be alleviated by alpha-beta pruning, which is discussed in the next topic.
Alpha-Beta Pruning
o Alpha-beta pruning is a modified version of the minimax algorithm: an optimization technique for the minimax algorithm.
o As we have seen, the number of game states the minimax search algorithm has to examine is exponential in the depth of the tree. We cannot eliminate the exponent, but we can effectively cut it in half. There is a technique by which we can compute the correct minimax decision without checking every node of the game tree; this technique is called pruning. It involves two threshold parameters, alpha and beta, used during expansion, so it is called alpha-beta pruning. It is also called the Alpha-Beta algorithm.
o Alpha-beta pruning can be applied at any depth of a tree, and sometimes it prunes not only tree leaves but entire sub-trees.
o The two parameters can be defined as:
a. Alpha: The best (highest-value) choice we have found so far at any point along the path of the Maximizer. The initial value of alpha is -∞.
b. Beta: The best (lowest-value) choice we have found so far at any point along the path of the Minimizer. The initial value of beta is +∞.
o Alpha-beta pruning returns the same move as the standard minimax algorithm, but it removes all the nodes that do not actually affect the final decision yet make the algorithm slow. By pruning these nodes, it makes the algorithm faster.
Note: To better understand this topic, kindly study the minimax algorithm.
The condition that triggers pruning is: α ≥ β.
o While backtracking, node values are passed up the tree; only the alpha and beta values are passed down to child nodes.
Pseudo-code for Alpha-beta Pruning:
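In outline, alpha-beta pruning can be sketched in Python as follows; the toy tree and the children/utility helpers below are illustrative assumptions, not part of the original material.

```python
def alphabeta(node, depth, alpha, beta, maximizing, children, utility):
    succ = children(node)
    if depth == 0 or not succ:
        return utility(node)
    if maximizing:
        value = float("-inf")
        for c in succ:
            value = max(value, alphabeta(c, depth - 1, alpha, beta, False, children, utility))
            alpha = max(alpha, value)
            if alpha >= beta:  # beta cutoff: MIN already has a better option elsewhere
                break
        return value
    value = float("inf")
    for c in succ:
        value = min(value, alphabeta(c, depth - 1, alpha, beta, True, children, utility))
        beta = min(beta, value)
        if alpha >= beta:      # alpha cutoff: MAX already has a better option elsewhere
            break
    return value

# Illustrative toy tree (leaf values chosen for demonstration only).
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
leaf_values = {"D": 3, "E": 5, "F": 2, "G": 9}
children = lambda n: tree.get(n, [])
utility = lambda n: leaf_values.get(n, 0)
```

On this tree the root value is 3, the same answer minimax would give, but leaf G is never evaluated: once C's first child returns 2, β at C drops to 2 ≤ α = 3 and the remaining branch is pruned.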
Let's take an example of a two-player search tree to understand the working of alpha-beta pruning.
Step 1: In the first step, the Max player starts the first move from node A, where α = -∞ and β = +∞. These values of alpha and beta are passed down to node B, where again α = -∞ and β = +∞, and node B passes the same values to its child D.
Step 2: At node D, the value of α is calculated, as it is Max's turn. The value of α is compared first with 2 and then with 3; max(2, 3) = 3 becomes the value of α at node D, and the node value is also 3.
Step 3: The algorithm now backtracks to node B, where the value of β changes, as it is Min's turn. Now β = +∞ is compared with the available subsequent node value: min(∞, 3) = 3, so at node B, α = -∞ and β = 3.
In the next step, the algorithm traverses the next successor of node B, which is node E, and the values α = -∞ and β = 3 are passed down as well.
Step 4: At node E, Max takes its turn, and the value of alpha changes. The current value of alpha is compared with 5, so max(-∞, 5) = 5; hence at node E, α = 5 and β = 3. Since α ≥ β, the right successor of E is pruned: the algorithm does not traverse it, and the value at node E is 5.
Step 5: The algorithm again backtracks the tree, from node B to node A. At node A, the value of alpha changes to the maximum available value, 3, since max(-∞, 3) = 3, with β = +∞. These two values are now passed to the right successor of A, which is node C.
At node C, α = 3 and β = +∞, and the same values are passed on to node F.
Step 6: At node F, the value of α is again compared with the left child, which is 0: max(3, 0) = 3. It is then compared with the right child, which is 1: max(3, 1) = 3, so α remains 3, but the node value of F becomes 1.
Step 7: Node F returns the node value 1 to node C. At C, α = 3 and β = +∞; here the value of beta changes, being compared with 1: min(∞, 1) = 1. Now at C, α = 3 and β = 1, which again satisfies the condition α ≥ β, so the next child of C, which is G, is pruned, and the algorithm does not compute that entire sub-tree.
Step 8: C returns the value 1 to A, where the best value for A is max(3, 1) = 3. The final game tree shows the nodes that were computed and the nodes that were never computed. Hence the optimal value for the maximizer is 3 in this example.
o Worst ordering: In some cases, the alpha-beta pruning algorithm does not prune any leaves of the tree and works exactly like the minimax algorithm. It then also consumes more time because of the alpha-beta bookkeeping; such a case is called worst ordering. Here the best move occurs on the right side of the tree. The time complexity for such an ordering is O(b^m).
o Ideal ordering: The ideal ordering for alpha-beta pruning occurs when a lot of pruning happens in the tree and the best moves occur on the left side of the tree. Since we apply DFS, it searches the left of the tree first and can go twice as deep as the minimax algorithm in the same amount of time. The complexity under ideal ordering is O(b^(m/2)).
Rules to find good ordering center on examining the likely-best successors first (for example, trying previously best moves early).
The size of the game tree grows exponentially, on the order of b^d, where:
o ( b ) = branching factor
o ( d ) = depth of the tree
For example:
In Chess, the average branching factor is about 35
A typical game depth is around 80 moves
This makes complete exploration of the game tree practically impossible.
2. Real-Time Constraints
In many competitive environments:
The agent must respond within strict time limits
Delayed decisions can lead to immediate failure
Examples include:
Chess engines operating under clock constraints
Online multiplayer games
Real-time simulations and competitions
Thus, the agent cannot afford to wait for a perfectly computed solution.
3. Limited Resources
Even powerful systems face:
Memory limitations
CPU time restrictions
Energy constraints
As the search depth increases, resource consumption grows rapidly, forcing the AI to
terminate search early.
4. Uncertainty and Dynamic Play
In practice:
Opponents may behave unpredictably
Game situations evolve rapidly
New information may become available after each move
These factors further reduce the feasibility of perfect decision-making.
Nature of Imperfect Decisions
Imperfect real-time decisions exhibit the following properties:
They are bounded by time
They are heuristic-driven
They are context-dependent
They prioritize responsiveness over optimality
Such decisions aim to be good enough rather than theoretically perfect.
Search Cutoffs and Their Role
Depth Cutoff
A depth cutoff limits the search to a predefined depth ( d ).
When the cutoff is reached:
The algorithm stops expanding further nodes
A heuristic evaluation is applied to the current state
This avoids the cost of exploring deeper levels of the game tree.
Time Cutoff
In time-critical systems:
Search continues only until the time limit expires
The best move identified so far is selected
Time cutoffs are essential in real-time environments.
Evaluation Functions in Real-Time Decisions
When terminal states are not reached, the agent must estimate the desirability of a game
state.
An evaluation function:
Assigns a numerical value to non-terminal states
Approximates the true utility of a state
Is designed using domain knowledge
Characteristics of a Good Evaluation Function
1. Computationally efficient
2. Correlated with actual winning chances
3. Sensitive to strategic advantages
Example: Chess Evaluation Function
A chess evaluation function may consider:
Material balance
King safety
Piece mobility
Pawn structure
Control of key squares
Each factor is weighted and combined to produce a final score.
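Such a weighted combination can be sketched as a linear sum; the weights and feature names below are purely illustrative, not tuned values from any real engine.

```python
# Illustrative weights; a real engine tunes these from play or data.
WEIGHTS = {
    "material": 1.0,
    "king_safety": 0.5,
    "mobility": 0.1,
    "pawn_structure": 0.3,
    "center_control": 0.2,
}

def evaluate(features):
    # Weighted linear combination; positive scores favor MAX.
    return sum(WEIGHTS[name] * score for name, score in features.items())
```

For example, a position with a material edge of +3 and a mobility edge of +10 would score 3 * 1.0 + 10 * 0.1 = 4.0 under these weights.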
Iterative Deepening in Real-Time Decision Making
Iterative deepening allows the agent to:
Perform multiple depth-limited searches
Gradually increase search depth
Always maintain a valid decision
If time runs out during deeper search, the agent still has:
A reliable move from a previous iteration
This makes iterative deepening especially suitable for real-time systems.
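The idea can be sketched as repeated depth-limited searches under a deadline; the toy tree, the helper signatures, and the deadline handling below are illustrative assumptions.

```python
import time

def depth_limited(node, depth, maximizing, children, evaluate):
    # Stop at the depth cutoff or at a leaf and apply the evaluation function.
    succ = children(node)
    if depth == 0 or not succ:
        return evaluate(node)
    vals = [depth_limited(c, depth - 1, not maximizing, children, evaluate) for c in succ]
    return max(vals) if maximizing else min(vals)

def iterative_deepening(root, children, evaluate, time_limit=0.05, max_depth=10):
    # Repeated depth-limited searches; always keep the best move from the
    # last fully completed iteration so a valid decision is available.
    deadline = time.monotonic() + time_limit
    best_move = None
    for depth in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break  # out of time: return the move from the previous iteration
        moves = children(root)
        if not moves:
            break
        best_move = max(moves, key=lambda m: depth_limited(m, depth - 1, False, children, evaluate))
    return best_move

# Illustrative toy tree: A is MAX's root; D, E, F, G are leaves.
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
leaf_values = {"D": 3, "E": 5, "F": 2, "G": 9}
children = lambda n: tree.get(n, [])
evaluate = lambda n: leaf_values.get(n, 0)
```

Even if the deadline expires mid-way, the agent still holds the move chosen by the deepest completed iteration, which is exactly the property the text highlights.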
Alpha–Beta Pruning for Efficiency
Alpha–beta pruning significantly reduces the number of nodes evaluated by:
Discarding branches that cannot influence the final decision
Allowing deeper exploration within the same time frame
This is critical for improving decision quality under real-time constraints.
Real-World Examples
1. Chess and Board Game Engines
Modern chess engines:
Combine depth-limited minimax
Use advanced evaluation functions
Employ alpha–beta pruning and iterative deepening
Despite imperfect decisions, they consistently outperform human players.
2. Real-Time Strategy (RTS) Games
In RTS games:
Decisions must be made continuously
Long-term planning is combined with immediate reactions
Perfect optimization is impossible
Thus, fast heuristic decisions are preferred.
3. Autonomous and Interactive Agents
Robots and AI agents:
Must act immediately in uncertain environments
Rely on approximate reasoning
Adjust decisions dynamically
Advantages of Imperfect Real-Time Decisions
Enables real-time interaction
Scales to complex environments
Balances speed and intelligence
Makes AI systems practical
Limitations
No guarantee of optimal outcomes
Strong dependence on heuristic quality
Risk of strategic errors in complex situations
A Constraint Satisfaction Problem (CSP) is a mathematical problem whose solution must satisfy a number of constraints. In a CSP, the objective is to assign values to variables such that all the constraints are satisfied. Many AI applications use CSPs to solve decision-making problems that involve managing or arranging resources under strict guidelines. Common applications of CSPs include:
Scheduling: Assigning resources like employees or equipment while respecting time and availability constraints.
Planning: Organizing tasks with specific deadlines or sequences.
Resource Allocation: Distributing resources efficiently without overuse.
Components of Constraint Satisfaction Problems
CSPs are composed of three key elements:
1. Variables: These are the things we need to find values for. Each variable represents
something that needs to be decided. For example, in a Sudoku puzzle each empty cell is a
variable that needs a number. Variables can be of different types like yes/no choices
(Boolean), whole numbers (integers) or categories like colors or names.
2. Domains: This is the set of possible values that a variable can have. The domain tells us
what values we can choose for each variable. In Sudoku the domain for each cell is the
numbers 1 to 9 because each cell must contain one of these numbers. Some domains are
small and limited while others can be very large or even infinite.
3. Constraints: These are the rules that restrict how variables can be assigned values.
Constraints define which combinations of values are allowed. There are different types of
constraints:
Unary constraints apply to a single variable like "this cell cannot be 5".
Binary constraints involve two variables like "these two cells cannot have the same
number".
Higher-order constraints involve three or more variables like "each row in Sudoku
must have all numbers from 1 to 9 without repetition".
Types of Constraint Satisfaction Problems
CSPs can be classified into different types based on their constraints and problem
characteristics:
1. Binary CSPs: In these problems each constraint involves only two variables. Like in a
scheduling problem the constraint could specify that task A must be completed
before task B.
2. Non-Binary CSPs: These problems have constraints that involve more than two
variables. For instance in a seating arrangement problem a constraint could state that
three people cannot sit next to each other.
3. Hard and Soft Constraints: Hard constraints must be strictly satisfied while soft
constraints can be violated but at a certain cost. This is often used in real-world
applications where not all constraints are equally important.
Representation of Constraint Satisfaction Problems (CSP)
A CSP involves the interaction of variables, domains and constraints. Below is a structured representation of how a CSP is formulated:
1. Finite Set of Variables (V1, V2, ..., Vn): The problem consists of a set of variables, each of which needs to be assigned a value that satisfies the given constraints.
2. Non-Empty Domain for Each Variable (D1, D2, ..., Dn): Each variable has a domain, a set of possible values that it can take. For example, in a Sudoku puzzle the domain could be the numbers 1 to 9 for each cell.
3. Finite Set of Constraints (C1, C2, ..., Cm): Constraints restrict the possible values that variables can take. Each constraint defines a rule or relationship between variables.
4. Constraint Representation: Each constraint Ci is represented as a pair (scope, relation), where:
Scope: The set of variables involved in the constraint.
Relation: A list of valid combinations of variable values that satisfy the constraint.
Example: Let’s say you have two variables V1 and V2. A possible constraint could be V1 ≠ V2, which means the values assigned to these variables must not be equal. Here:
Scope: The variables V1 and V2.
Relation: A list of valid value combinations where V1 is not equal to V2.
Some relations might include explicit combinations while others may rely on abstract
relations that are tested for validity dynamically.
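Following the (scope, relation) description above, a small illustrative encoding of the constraint V1 ≠ V2 might look like this; the domains {1, 2} and the helper name are assumptions for the example.

```python
# Constraint V1 != V2 as a (scope, relation) pair; domains {1, 2} are illustrative.
scope = ("V1", "V2")
relation = {(a, b) for a in (1, 2) for b in (1, 2) if a != b}

def satisfies(assignment):
    # An assignment satisfies the constraint if the tuple of values
    # taken by the scope variables appears in the relation.
    return tuple(assignment[v] for v in scope) in relation
```

With domains {1, 2}, the explicit relation is {(1, 2), (2, 1)}: the two value combinations where the variables differ.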
Solving Constraint Satisfaction Problems Efficiently
CSPs use various algorithms to explore and optimize the search space, ensuring that solutions meet the specified constraints. Here’s a breakdown of the most commonly used CSP algorithms:
1. Backtracking Algorithm
The backtracking algorithm is a depth-first search method used to systematically explore
possible solutions in CSPs. It operates by assigning values to variables and backtracks if any
assignment violates a constraint.
How it works:
The algorithm selects a variable and assigns it a value.
It recursively assigns values to subsequent variables.
If a conflict arises, i.e., a variable cannot be assigned a valid value, the algorithm backtracks to the previous variable and tries a different value.
The process continues until either a valid solution is found or all possibilities have
been exhausted.
This method is widely used due to its simplicity but can be inefficient for large problems with
many variables.
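The steps above can be sketched for a binary CSP whose constraints are all inequalities between neighboring variables; the map-coloring example and variable names are illustrative assumptions.

```python
def backtrack(assignment, variables, domains, neighbors):
    # All variables assigned: a complete, consistent solution.
    if len(assignment) == len(variables):
        return assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        # Consistent if no already-assigned neighbor holds the same value.
        if all(assignment.get(n) != value for n in neighbors[var]):
            assignment[var] = value
            result = backtrack(assignment, variables, domains, neighbors)
            if result is not None:
                return result
            del assignment[var]  # conflict downstream: undo and try the next value
    return None

# Illustrative example: 3-coloring three mutually adjacent regions.
variables = ["A", "B", "C"]
domains = {v: ["red", "green", "blue"] for v in variables}
neighbors = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
solution = backtrack({}, variables, domains, neighbors)
```

The recursion mirrors the description: assign, recurse, and on failure undo the assignment and try the next domain value.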
2. Forward-Checking Algorithm
The forward-checking algorithm is an enhancement of the backtracking algorithm that aims
to reduce the search space by applying local consistency checks.
How it works:
For each unassigned variable the algorithm keeps track of remaining valid values.
Once a variable is assigned a value, local constraints are applied to neighboring variables, eliminating inconsistent values from their domains.
If a neighbor has no valid values left after forward checking, the algorithm backtracks.
This method is more efficient than pure backtracking because it prevents some conflicts before they happen, reducing unnecessary computations.
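A minimal sketch of the forward-checking step for inequality constraints follows; the domain-copying strategy and helper names are illustrative choices, not a prescribed implementation.

```python
def forward_check(domains, var, value, neighbors):
    # Work on copies so the caller can restore domains when backtracking.
    pruned = {v: set(d) for v, d in domains.items()}
    pruned[var] = {value}
    for n in neighbors.get(var, []):
        pruned[n].discard(value)
        if not pruned[n]:
            return None  # domain wipeout: this assignment cannot work
    return pruned
```

Returning None signals the wipeout case described above, telling the search to backtrack immediately instead of discovering the conflict later.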
3. Constraint Propagation Algorithms
Constraint propagation algorithms further reduce the search space by enforcing local
consistency across all variables.
How it works:
Constraints are propagated between related variables.
Inconsistent values are eliminated from variable domains by using information
gained from other variables.
These algorithms filter the search space by making inferences and removing values that would lead to conflicts.
Constraint propagation is used along with other CSP methods like backtracking to make the
search faster.
Solving Sudoku with Constraint Satisfaction Problem (CSP) Algorithms
Step 1: Define the Problem (Sudoku Puzzle Setup)
The first step is to define the Sudoku puzzle as a 9x9 grid where 0 represents an empty cell.
We also define a function print_sudoku to display the puzzle in a human readable format.
puzzle = [[5, 3, 0, 0, 7, 0, 0, 0, 0],
          [6, 0, 0, 1, 9, 5, 0, 0, 0],
          [0, 9, 8, 0, 0, 0, 0, 6, 0],
          [8, 0, 0, 0, 6, 0, 0, 0, 3],
          [4, 0, 0, 8, 0, 3, 0, 0, 1],
          [7, 0, 0, 0, 2, 0, 0, 0, 6],
          [0, 6, 0, 0, 0, 0, 2, 8, 0],
          [0, 0, 0, 4, 1, 9, 0, 0, 5],
          [0, 0, 0, 0, 8, 0, 0, 7, 9]]

def print_sudoku(puzzle):
    for i in range(9):
        if i % 3 == 0 and i != 0:
            print("- - - - - - - - - - - ")
        for j in range(9):
            if j % 3 == 0 and j != 0:
                print(" | ", end="")
            print(puzzle[i][j], end=" ")
        print()
The solver's backtracking routine (reconstructed here as a method of a CSP solver class; select_unassigned_variable, order_domain_values and is_consistent are assumed helper methods) is:

def backtrack(self, assignment):
    if len(assignment) == len(self.variables):
        return assignment
    var = self.select_unassigned_variable(assignment)
    for value in self.order_domain_values(var, assignment):
        if self.is_consistent(var, value, assignment):
            assignment[var] = value
            result = self.backtrack(assignment)
            if result is not None:
                return result
            del assignment[var]
    return None
domains = {
    var: set(range(1, 10)) if puzzle[var[0]][var[1]] == 0 else {puzzle[var[0]][var[1]]}
    for var in variables
}

constraints = {}

def add_constraint(var):
    constraints[var] = []
    for i in range(9):
        if i != var[0]:
            constraints[var].append((i, var[1]))
        if i != var[1]:
            constraints[var].append((var[0], i))
    sub_i, sub_j = var[0] // 3, var[1] // 3
    for i in range(sub_i * 3, (sub_i + 1) * 3):
        for j in range(sub_j * 3, (sub_j + 1) * 3):
            if (i, j) != var:
                constraints[var].append((i, j))
Applications of CSPs in AI
CSPs are used in many fields because they are flexible and can solve real-world problems
efficiently. Here are some common applications:
1. Scheduling: They help in planning things like employee shifts, flight schedules and
university timetables. The goal is to assign tasks while following rules like time limits,
availability and priorities.
2. Puzzle Solving: Many logic puzzles such as Sudoku, crosswords and the N-Queens
problem can be solved using CSPs. The constraints make sure that the puzzle rules
are followed.
3. Configuration Problems: They help in selecting the right components for a product or system. For example, when building a computer, a CSP ensures that all selected parts are compatible with each other.
4. Robotics and Planning: Robots use CSPs to plan their movements, avoid obstacles and complete tasks efficiently. For example, a robot navigating a warehouse must avoid collisions and minimize energy use.
5. Natural Language Processing (NLP): In NLP they help with tasks like breaking
sentences into correct grammatical structures based on language rules.
Benefits of CSPs in AI
1. Standardized Representation: They provide a clear and structured way to define
problems using variables, possible values and rules.
2. Efficiency: Smart search techniques like backtracking and forward-checking help to
reduce the time needed to find solutions.
3. Flexibility: The same CSP methods can be used in different areas without needing
expert knowledge in each field.
Challenges in Solving CSPs
1. Scalability: When there are too many variables and rules the problem becomes very
complex and finding a solution can take too long.
2. Changing Problems (Dynamic CSPs): In real life conditions and constraints can
change over time requiring the CSP solution to be updated.
3. Impossible or Overly Strict Problems: Sometimes a CSP may not have a solution
because the rules are too strict. In such cases adjustments or compromises may be
needed to find an acceptable solution.
Propositional Logic
Knowledge-Based Agents
KB = knowledge base
A set of sentences or facts
e.g., a set of statements in a logic language
Inference
Deriving new sentences from old
e.g., using a set of logical statements to infer new ones
A simple model for reasoning
Agent is told or perceives new evidence
E.g., A is true
Agent then infers new facts to add to the KB
E.g., KB = { A -> (B OR C) }, then given A and not C we can infer that B is true
B is now added to the KB even though it was not explicitly asserted, i.e., the agent inferred B
In Artificial Intelligence, intelligent behavior requires more than just reacting to immediate
inputs. Many AI systems must store knowledge, reason logically, and make decisions based
on past and present information. Such systems are known as Knowledge-Based Agents.
A Knowledge-Based Agent is an agent that maintains an internal knowledge base (KB)
containing information about the world and uses logical inference to derive new knowledge
and decide appropriate actions.
Knowledge Base (KB)
Definition
A Knowledge Base (KB) is a structured collection of facts, rules, and assertions about the
environment. These are represented in a formal language, usually a logic-based language.
Formally:
KB = a set of sentences expressed in a logical representation language
These sentences describe:
What is known to be true about the world
Relationships between objects
Conditional rules that describe cause and effect
Nature of Knowledge in KB
The knowledge stored in a KB may include:
Facts (e.g., “A is true”)
Rules (e.g., “If A is true, then B or C is true”)
Constraints about the environment
This explicit storage of knowledge allows the agent to reason instead of merely reacting.
Inference in Knowledge-Based Agents
What is Inference?
Inference is the process of deriving new facts from existing knowledge using logical
reasoning.
In simple terms:
Inference enables an agent to reach conclusions that are not directly stated but logically
follow from known facts.
Inference is performed using predefined logical inference rules, such as:
Modus Ponens
Resolution
Forward and backward chaining
Reasoning Process of a Knowledge-Based Agent
The reasoning process of a KB agent follows a structured cycle:
1. Perception of new information
The agent receives new evidence from the environment through sensors.
2. Updating the Knowledge Base
The perceived information is converted into logical statements and added to the KB.
3. Inference
The agent applies inference rules to derive new facts from existing knowledge.
4. Decision Making
Based on inferred knowledge, the agent decides the best possible action.
5. Action Execution
The agent performs the selected action using actuators.
This cycle continues throughout the agent’s operation.
Example of Knowledge-Based Reasoning
Consider the following rule stored in the knowledge base:
KB = { A → (B OR C) }
Now suppose the agent perceives:
A is true
C is false
Step-by-Step Reasoning
The rule states that if A is true, then either B or C must be true
The agent already knows that C is false
Therefore, by logical reasoning, B must be true
Important Insight
The fact B was not explicitly provided
The agent inferred B logically
The inferred fact B is added to the knowledge base
This demonstrates how a Knowledge-Based Agent extends its knowledge through
reasoning.
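The A → (B OR C) example can be checked mechanically with a truth-table entailment test. In the sketch below, the function name and the lambda encoding of sentences are illustrative choices: each sentence is a Boolean function of a model (an assignment of truth values to symbols).

```python
from itertools import product

def entails(kb, query, symbols):
    # KB entails query iff query is true in every model where
    # all KB sentences hold (simple truth-table enumeration).
    for values in product([True, False], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        if all(sentence(model) for sentence in kb) and not query(model):
            return False
    return True

kb = [
    lambda m: (not m["A"]) or m["B"] or m["C"],  # A -> (B OR C)
    lambda m: m["A"],                            # evidence: A is true
    lambda m: not m["C"],                        # evidence: C is false
]
```

Only one model satisfies all three sentences (A true, B true, C false), so the KB entails B even though B was never explicitly asserted, exactly as in the step-by-step reasoning above.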
Significance of Knowledge-Based Agents
Knowledge-Based Agents are important because they:
Can reason about complex and dynamic environments
Can infer hidden or indirect information
Can update knowledge as new evidence appears
Can justify their actions logically
Unlike simple agents, KB agents do not rely only on current perception.
Key Components of a Knowledge-Based Agent
1. Knowledge Base
Stores facts, rules, and assertions.
2. Inference Mechanism
Derives new knowledge from existing information.
3. Perception Module
Accepts new evidence from the environment.
4. Action Module
Executes actions based on logical conclusions.
Features of Knowledge-Based Agents
Explicit representation of knowledge
Logical reasoning capability
Ability to infer new information
Goal-directed behavior
Adaptability to changing environments
Advantages
Intelligent and flexible decision-making
Easy to update or modify knowledge
Suitable for complex problem domains
Decisions can be explained logically
The Wumpus World Cave
Key Elements:
Pits: If the agent steps into a pit it falls and dies. A breeze in adjacent rooms suggests
nearby pits.
Wumpus: A creature that kills the agent if it enters its room. Rooms next to the Wumpus have a stench. The agent can use an arrow to kill the Wumpus.
Treasure: Agent’s main objective is to collect the treasure (gold) which is located in
one room.
Breeze: Indicates a pit is nearby.
Stench: Indicates the Wumpus is nearby.
The agent must navigate carefully, avoiding dangers, to collect the treasure and exit safely.
PEAS Description
PEAS stands for Performance Measures, Environment, Actuators and Sensors which
describe agent’s capabilities and environment.
1. Performance measures: Rewards or Punishments
Agent gets the gold and returns back safely = +1000 points
Agent dies (pit or Wumpus)= -1000 points
Each move of the agent = -1 point
Agent uses the arrow = -10 points
2. Environment: A setting where everything will take place.
A cave with 16(4x4) rooms.
Rooms adjacent (not diagonally) to the Wumpus are stinking.
Rooms adjacent (not diagonally) to the pit are breezy.
Room with gold glitters.
Agent's initial position - Room[1, 1] and facing right side.
Location of Wumpus, gold and 3 pits can be anywhere except in Room[1, 1].
3. Actuators: Devices that allow agent to perform following actions in the environment.
Move forward: Move to next room.
Turn right/left: Rotate agent 90 degrees.
Shoot: Kill Wumpus with arrow.
Grab: Take treasure.
Release: Drop treasure
4. Sensors: Devices that help the agent sense the following from the environment.
Breeze: Detected near a pit.
Stench: Detected near the Wumpus.
Glitter: Detected when treasure is in the room.
Scream: Triggered when Wumpus is killed.
Bump: Occurs when hitting a wall.
How the Agent Operates with PEAS
1. Perception: Agent uses sensory inputs (breeze, stench, glitter) to detect its
surroundings and understand the environment.
2. Inference: Agent applies logical reasoning to find location of hazards. For example if
it detects a breeze, it warns that a pit is nearby or if there’s a stench it suspects the
Wumpus is in an adjacent room.
3. Planning: Based on its deductions agent plans its next move avoiding risky areas like
rooms with suspected pits or the Wumpus.
4. Action: Agent performs planned action such as moving to a new room, shooting
arrow at the Wumpus or taking the treasure.
This process repeats until the agent achieves its goal safely, using its sensory inputs, reasoning and planning to collect the treasure and exit the cave.
By using PEAS framework agent’s interactions with its environment are clearly defined,
providing a structured approach to modeling intelligent behavior.
Propositional Logic
Propositional logic works with statements, called propositions, that can be true or false. These propositions represent facts or conditions about a situation. We use symbols to represent the propositions and logical operations to connect them. This helps us understand how different facts are related to each other in complex statements or problems. Propositional operators like conjunction (∧), disjunction (∨), negation (¬), implication (→) and biconditional (↔) help combine various propositions to represent logical relations.
Examples of Propositional Logic
P: "The sky is blue." (This statement can be either true or false.)
Q: "It is raining right now." (This can also be true or false.)
R: "The ground is wet." (This is either true or false.)
These can be combined using logical operations to create more complex statements. For
example:
P ∧ Q: "The sky is blue AND it is raining." (This is true only if both P and Q are true.)
P ∨ Q: "The sky is blue OR it is raining." (This is true if at least one of P or Q is true.)
¬P: "It is NOT true that the sky is blue." (This is true if P is false, i.e., the sky is not blue.)
Logical Equivalence
Two statements are logically equivalent if they always have the same truth values in every
possible situation. For example:
The statement "S → T" (if S then T) is equivalent to "¬S ∨ T" (not S or T). This means
"if S is true, then T must be true" is the same as "either S is false or T is true."
The biconditional "P ↔ Q" (P if and only if Q) is equivalent to "(P → Q) ∧ (Q → P)" (P
implies Q and Q implies P).
These equivalences show that different logical expressions can have the same meaning. You
can verify them using truth tables or by simplifying the statements with logical rules.
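Such equivalences can also be verified programmatically by enumerating truth tables; in the sketch below, the function and formula names are illustrative.

```python
from itertools import product

def equivalent(f, g, n):
    # Two formulas (as Boolean functions of n variables) are logically
    # equivalent iff they agree on every row of the truth table.
    return all(f(*vals) == g(*vals) for vals in product([True, False], repeat=n))

implies = lambda a, b: (not a) or b

# "P <-> Q" versus "(P -> Q) AND (Q -> P)"
biconditional = lambda p, q: p == q
both_directions = lambda p, q: implies(p, q) and implies(q, p)
```

Here `equivalent(biconditional, both_directions, 2)` confirms the biconditional equivalence from the text by checking all four rows of the truth table.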
Basic Concepts of Propositional Logic
1. Propositions
A proposition is a statement that can either be true or false. It does not matter how
complicated the statement is; if it can be classified as true or false, then it is a proposition. For
example:
"The sky is blue." (True)
"It is raining." (False)
2. Logical Connectives
Logical connectives are used to combine simple propositions into more complex ones. The
main connectives are:
AND (∧): This operation is true if both propositions are true.
Example: "It is sunny ∧ it is warm" is true only if both "It is sunny" and "It is warm"
are true.
OR (∨): This operation is true if at least one of the propositions is true.
Example: "It is sunny ∨ it is raining" is true if either "It is sunny" or "It is raining" is
true.
NOT (¬): This operation reverses the truth value of a proposition.
Example: "¬It is raining" is true if "It is raining" is false.
IMPLIES (→): This operation is true if the first proposition leads to the second.
Example: "If it rains then the ground is wet" (It rains → The ground is wet) is true
unless it rains and the ground is not wet.
IF AND ONLY IF (↔): This operation is true if both propositions are either true or
false together.
Example: "It is raining ↔ The ground is wet" is true if both "It is raining" and "The
ground is wet" are either true or both false.
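The five connectives map directly onto ordinary boolean operations, which makes them easy to experiment with. A minimal Python sketch (the uppercase names are just illustrative):

```python
# Each connective as a small function over truth values.
AND     = lambda p, q: p and q
OR      = lambda p, q: p or q
NOT     = lambda p: not p
IMPLIES = lambda p, q: (not p) or q   # false only when p is true and q is false
IFF     = lambda p, q: p == q         # true when both sides have the same value

raining, ground_wet = True, True
print(IMPLIES(raining, ground_wet))   # True: it rains and the ground is wet
print(IMPLIES(True, False))           # False: the single failing case
print(IFF(raining, ground_wet))       # True: both are true together
```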
3. Truth Tables
They are used to find the truth value of complex propositions by checking all possible
combinations of truth values for their components. They systematically list every possible
combination, which makes it easy to see how different logical operators affect the
overall outcome. This approach ensures that no combination is overlooked,
which provides a clear and complete picture of the logic at work.
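Enumerating every combination is a one-liner with `itertools.product`. The sketch below builds a truth table for an example three-variable formula; the function name is illustrative:

```python
from itertools import product

def truth_table(formula, names):
    """Return one (assignment, result) row per combination of truth values."""
    return [(values, formula(*values))
            for values in product([True, False], repeat=len(names))]

# Truth table for P ∧ (Q ∨ ¬R), a three-variable example.
for values, result in truth_table(lambda p, q, r: p and (q or not r),
                                  ["P", "Q", "R"]):
    cells = " ".join("T" if v else "F" for v in values)
    print(cells, "|", "T" if result else "F")
```

With three variables the table has 2³ = 8 rows, one per combination, so no case can be missed.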
4. Tautologies, Contradictions and Contingencies
Tautology: A proposition that is always true, no matter the truth values of the
individual components.
Example: "P ∨ ¬P" (This is always true because either P is true or P is false).
Contradiction: A proposition that is always false.
Example: "P ∧ ¬P" (This is always false because P can't be both true and false at the
same time).
Contingency: A proposition that can be true or false depending on the truth values of
its components.
Example: "P ∧ Q" (This is true only if both P and Q are true).
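The three categories can be decided by inspecting a formula's full set of outcomes. A small sketch (function name illustrative):

```python
from itertools import product

def classify(formula, n):
    """Tautology if true everywhere, contradiction if false everywhere,
    contingency otherwise."""
    results = {formula(*v) for v in product([True, False], repeat=n)}
    if results == {True}:
        return "tautology"
    if results == {False}:
        return "contradiction"
    return "contingency"

print(classify(lambda p: p or not p, 1))    # tautology
print(classify(lambda p: p and not p, 1))   # contradiction
print(classify(lambda p, q: p and q, 2))    # contingency
```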
Properties of Operators
Logical operators in propositional logic have various important properties that help to
simplify and analyze complex statements:
1. Commutativity: Order of propositions doesn’t matter when using AND (∧) or OR (∨).
P∧Q≡Q∧P
P∨Q≡Q∨P
2. Associativity: Grouping of propositions doesn’t matter when using multiple ANDs or ORs.
(P ∧ Q) ∧ R ≡ P ∧ (Q ∧ R)
(P ∨ Q) ∨ R ≡ P ∨ (Q ∨ R)
3. Distributivity: AND (∧) and OR (∨) can distribute over each other which is similar to
multiplication and addition in math.
P ∧ (Q ∨ R) ≡ (P ∧ Q) ∨ (P ∧ R)
P ∨ (Q ∧ R) ≡ (P ∨ Q) ∧ (P ∨ R)
4. Identity: A proposition combined with "True" or "False" behaves predictably.
P ∧ true ≡ P
P ∨ false ≡ P
5. Domination: When combined with "True" or "False" some outcomes are always fixed.
P ∨ true ≡ true
P ∧ false ≡ false
The equivalences above can all be used as inference rules. The equivalence for biconditional
elimination, for example, yields two inference rules, one for each direction.
Some inference rules do not work in both directions. We cannot, for
example, run Modus Ponens in reverse to obtain α ⇒ β and α from β.
Let's look at how these equivalences and inference rules may be applied in the wumpus
environment. We begin with the knowledge base containing R1 through R5 and demonstrate
how to establish ¬P1,2, i.e., that [1,2] does not contain a pit. To generate R6, we first apply
biconditional elimination to R2:
R6: (B1,1 ⇒ (P1,2 ∨ P2,1)) ∧ ((P1,2 ∨ P2,1) ⇒ B1,1).
Then we apply And-elimination to R6 to obtain
R7: (P1,2 ∨ P2,1) ⇒ B1,1.
Logical equivalence for contrapositives gives
R8: ¬B1,1 ⇒ ¬(P1,2 ∨ P2,1).
With R8 and the percept R4 (i.e., ¬B1,1), we can now apply Modus Ponens to get
R9: ¬(P1,2 ∨ P2,1).
Finally, De Morgan's rule gives
R10: ¬P1,2 ∧ ¬P2,1,
that is, neither [1,2] nor [2,1] contains a pit.
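The same conclusion can be reached by brute-force model checking: enumerate every truth assignment, keep only the models that satisfy the knowledge base, and confirm the query holds in all of them. A sketch, assuming the usual sentences R1: ¬P1,1, R2: B1,1⇔(P1,2∨P2,1), R3: B2,1⇔(P1,1∨P2,2∨P3,1), R4: ¬B1,1, R5: B2,1 (all helper names are illustrative):

```python
from itertools import product

def entails(kb, query, n):
    """KB entails query when query is true in every model satisfying all KB sentences."""
    return all(query(*v) for v in product([True, False], repeat=n)
               if all(rule(*v) for rule in kb))

iff = lambda a, b: a == b
# Variable order: p11, p12, p21, p22, p31, b11, b21
kb = [
    lambda p11, p12, p21, p22, p31, b11, b21: not p11,                      # R1
    lambda p11, p12, p21, p22, p31, b11, b21: iff(b11, p12 or p21),         # R2
    lambda p11, p12, p21, p22, p31, b11, b21: iff(b21, p11 or p22 or p31),  # R3
    lambda p11, p12, p21, p22, p31, b11, b21: not b11,                      # R4
    lambda p11, p12, p21, p22, p31, b11, b21: b21,                          # R5
]
print(entails(kb, lambda p11, p12, p21, p22, p31, b11, b21: not p12, 7))  # True
```

Enumerating 2⁷ assignments is trivial here, but model checking grows exponentially with the number of symbols, which is why inference rules matter.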
Propositional Logic based Agent
In this section, we will use our understanding of propositional logic to build wumpus world
agents. The first stage is to enable the agent to deduce the state of the world, to the greatest
extent possible, from its percept history. This requires a complete logical model of the
effects of actions. We also demonstrate how the agent can keep track of the world without
having to return to the percept history for each inference. Finally, we demonstrate how the
agent can construct plans, using logical inference, that are guaranteed to achieve its goals.
A logical agent operates by deducing what to do from a knowledge base of sentences about the
world. The knowledge base consists of axioms (general knowledge about how the world works)
combined with percept sentences obtained from the agent's experience in a particular world.
Understanding Axioms
We will start with the immutable aspects of the wumpus world and move on to the mutable aspects
later. For the time being, we need the following symbols for each [x,y] location: Px,y is true if
there is a pit in [x,y]; Wx,y is true if there is a wumpus in [x,y]; Bx,y is true if the square [x,y]
is breezy; and Sx,y is true if the square [x,y] is smelly.
The sentences we write will be adequate to derive ¬P1,2 (there is no pit in [1,2]). Each sentence
is labeled Ri so that we can refer to it. There is no pit in [1,1]:
R1: ¬P1,1.
A square is breezy if and only if one of its neighbours has a pit. This must be stated for each
square; for the time being, we include only the relevant squares:
R2: B1,1⇔(P1,2∨P2,1)
R3: B2,1⇔(P1,1∨P2,2∨P3,1)
The preceding sentences are true in all wumpus worlds. Now we include the breeze
percepts for the first two squares visited in the specific world the agent is in:
R4: ¬B1,1
R5: B2,1.
The agent knows that there is no pit (¬P1,1) and no wumpus (¬W1,1) in the starting square. It also
knows that a square is breezy if and only if a neighbouring square has a pit, and that a square is
smelly if and only if a neighbouring square has a wumpus. As a result, we include a large number of
sentences of the following type:
B1,1⇔(P1,2∨P2,1)
S1,1⇔(W1,2∨W2,1)
…
The agent also knows that there is exactly one wumpus in the world. This is expressed in two parts.
First, we must state that there is at least one wumpus:
W1,1∨W1,2∨⋯∨W4,3∨W4,4
Then we must say that there is at most one wumpus. We add a sentence for each pair of locations
stating that at least one of them must be wumpus-free:
¬W1,1∨¬W1,2
¬W1,1∨¬W1,3
…
¬W4,3∨¬W4,4
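These two families of sentences are purely mechanical to produce, which a short sketch makes concrete. For a 4×4 grid (variable names illustrative):

```python
from itertools import combinations

squares = [(x, y) for x in range(1, 5) for y in range(1, 5)]

# At least one wumpus: a single disjunction over all 16 squares.
at_least_one = " ∨ ".join(f"W{x},{y}" for x, y in squares)

# At most one wumpus: for each pair of squares, at least one is wumpus-free.
at_most_one = [f"¬W{x1},{y1} ∨ ¬W{x2},{y2}"
               for (x1, y1), (x2, y2) in combinations(squares, 2)]

print(at_least_one[:40], "...")
print(len(at_most_one), "pairwise clauses")   # C(16, 2) = 120
```

The pairwise encoding is quadratic in the number of squares, which is one reason propositional encodings of even small worlds grow large.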
So far, the agent’s reasoning works correctly. Now consider how the agent handles percepts over
time. Suppose the agent perceives a stench at the current moment. We should add this information
to the knowledge base. However, if at the previous time step there was no stench, then ¬Stench
would already exist in the KB, leading to a contradiction.
This issue is resolved by recognizing that percepts describe only the current time step. Therefore,
instead of adding a general proposition like Stench, we associate it with time. For example, at time
step 4, we add Stench₄ rather than Stench. This avoids any conflict with Stench₃. The same idea
applies to other percepts such as Breeze, Bump, Glitter, and Scream.
Any property that changes over time is represented using time-indexed propositions. For example,
FacingEast⁰, HaveArrow⁰, and WumpusAlive⁰ are part of the initial knowledge base.
Using the location fluent Lᵗx,y, percepts can be connected to properties of specific squares. For
every square [x, y] and time t, we assert:
Lᵗx,y ⇒ (Breezeᵗ ⇔ Bx,y)
Lᵗx,y ⇒ (Stenchᵗ ⇔ Sx,y)
To model how the world changes, we introduce time-indexed action symbols such as Forward⁰,
TurnLeft⁰, etc. The standard order of events at each step is:
1. The percept arrives at time t
2. The action is taken at time t
3. The world transitions to time t + 1
To describe how actions change the world, we write effect axioms such as:
L⁰1,1 ∧ FacingEast⁰ ∧ Forward⁰ ⇒ (L¹2,1 ∧ ¬L¹1,1)
This means that moving forward from [1,1] while facing east places the agent in [2,1] at the next
time step.
Using such axioms, the agent can infer its new position. For example, from L⁰1,1, FacingEast⁰,
and Forward⁰, it can conclude L¹2,1.
However, problems arise when asking about facts that should remain unchanged. For instance, after
moving forward, the agent should still have its arrow, but without additional axioms it cannot
prove HaveArrow¹.
This happens because effect axioms describe what changes, but not what stays the same. This issue
is known as the frame problem.
The obvious fix is to add frame axioms that explicitly assert what stays the same, such as
Forwardᵗ ⇒ (HaveArrowᵗ ⇔ HaveArrowᵗ⁺¹). However, this approach is inefficient: with m actions
and n fluents, the number of frame axioms becomes O(mn). This is called the representational
frame problem.
Since each action usually affects only a few fluents, a better approach is to write axioms for each
fluent, rather than for each action. These are called successor-state axioms.
General form:
Fᵗ⁺¹ ⇔ ActionCausesFᵗ ∨ (Fᵗ ∧ ¬ActionCausesNotFᵗ)
Because the arrow can only be lost by shooting and never regained, its successor-state axiom
becomes:
HaveArrowᵗ⁺¹ ⇔ (HaveArrowᵗ ∧ ¬Shootᵗ)
This compactly captures both change and persistence, solving the frame problem efficiently.
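A successor-state axiom is easy to exercise directly as a state-update rule. The sketch below simulates the HaveArrow fluent over a few steps, under the assumption that shooting is the only action affecting it (names illustrative):

```python
def have_arrow_next(have_arrow_t, shoot_t):
    """Successor-state axiom: HaveArrow(t+1) <=> HaveArrow(t) and not Shoot(t)."""
    return have_arrow_t and not shoot_t

# The arrow persists through non-shooting actions and is lost once shot.
state = True                                   # HaveArrow at time 0
for action_is_shoot in [False, False, True, False]:
    state = have_arrow_next(state, action_is_shoot)
print(state)   # False: once shot, the arrow is never regained
```

Note how the single biconditional covers both cases: persistence (no shooting keeps the arrow) and change (shooting loses it), with no separate frame axiom needed.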