Chapter 3
Searching
Introduction
Searching is the process of finding the required state (the goal state) among the
possible states. Searching is performed in the state space of a specified problem,
and the search process is carried out by constructing a search tree.
Search is a universal problem-solving technique. It involves systematic trial-and-error
exploration of alternative solutions. Many problems do not have a
simple algorithmic solution, and casting such problems as search problems is often the
easiest way of solving them. Search is especially useful when the sequence of actions
required to solve a problem is not known in advance.
Every problem is solved through these steps:
1. Define the problem: The definition must include a precise initial state and final
(goal) state.
2. Analyze the problem: Select the most important features, those that have an
immense impact on the solution.
3. Isolate and represent: Convert these important features into a knowledge
representation.
4. Choose a problem-solving technique: Select the best technique and apply it to the
particular problem.
Some terminologies we need to know for searching technique are described below:
Search space: It is the representation of all possible conditions and solutions.
Initial state: It is the state where the searching process starts.
Goal state: It is the state we target to reach in the searching process.
Problem space: It describes "what to solve", i.e., the environment in which the
search takes place.
Searching strategy: It is the method by which the search procedure is controlled.
Search tree: It is the tree that shows the possible solutions reachable from the
initial state.
Searching through a state space (explicitly, using a search tree) involves node visiting
and node expansion. Node visiting is the process of checking whether the selected node
is the goal node, and node expansion is the process of generating new nodes connected
to the current node.
Problem Formulation
A search problem is defined in terms of states, operators and goals.
State
A state is a complete description of the world at a certain instant.
I. The initial state is the state the world is in when problem solving begins.
II. The goal state is a state in which the problem is solved or which meets the
goal criteria. There may be a single goal state or multiple goal states.
In the eight-puzzle game, there is a single solution and a single goal state. In
chess, there are many winning positions and hence, many goal states.
Operator
An operator is an action that transforms one state of the world into another state.
Applicable operators
In general, not all operators can be applied in every state.
In a given chess position, only some moves are legal (as defined by the rules
of chess).
In a given eight-puzzle configuration, only some moves are physically
possible.
Performance Metrics for searching process
The output from a problem-solving (searching) algorithm is either FAILURE or a
SOLUTION. There are four criteria for comparing how efficient one searching
process is relative to another.
i. Completeness: Will the searching technique guarantee to find a solution?
ii. Optimality: Does it find optimal solution among all possible solutions?
iii. Time complexity: How long will the searching technique take to
find the solution?
iv. Space complexity: How much memory will the searching technique
consume to find the solution?
Classification of searching techniques
Searching techniques are classified into two main categories, namely uninformed
search strategies and informed search strategies.
Uninformed Search Strategy
These types of search strategies are provided with problem definition and these do
not have other additional information about state space. The process of searching
in this method is to expand the current state to get a new state and to distinguish
a goal state from a main goal state. This strategy uses only the information available
in the problem definition, so it is less effective than an informed search.
Breadth First Search
It proceeds level by level down the search tree. Starting from the root node (initial
state), it explores all children of the root node, left to right. If no solution is
found, it expands the first (leftmost) child of the root node, then the second node
at depth 1, and so on.
Process
i. Place the start node in the queue.
ii. Examine the node at the front of the queue.
- If the queue is empty, stop.
- If the node is the goal, stop.
- Otherwise, remove the node, add its children to the end of the queue, and
repeat from step ii.
In BFS, nodes are searched level by level, each node being checked for the goal state.
Exploration is done in a first-in, first-out manner, the same as in a queue.
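The queue-based process above can be sketched in Python as follows; the graph and node names here are hypothetical, and the tree is given as a simple adjacency map:

```python
from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search: explore level by level using a FIFO queue.
    Returns the path from start to goal, or None if no path exists."""
    queue = deque([[start]])   # queue of partial paths
    visited = {start}
    while queue:
        path = queue.popleft()             # examine the node at the front
        node = path[-1]
        if node == goal:                   # goal test on the visited node
            return path
        for child in graph.get(node, []):  # add children to the end, left to right
            if child not in visited:
                visited.add(child)
                queue.append(path + [child])
    return None

# Hypothetical search tree as an adjacency map
graph = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F'], 'E': ['G']}
print(bfs(graph, 'A', 'G'))   # ['A', 'B', 'E', 'G']
```

Because whole levels are explored before moving deeper, the first path returned is always a shallowest one.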
Properties of BFS
Completeness: Complete if the branching factor is finite and the goal node is at a
finite depth.
Optimality: It is guaranteed to find the shallowest goal node, since shallower levels
are fully explored before deeper ones. Hence the algorithm is optimal when all step
costs are equal.
Time complexity
- For a search tree with branching factor 'b', expanding the root yields b
nodes at the 1st level.
- Again, expanding the b nodes at the 1st level yields b^2 nodes at the 2nd
level.
- If the goal is at depth d, in the worst case the goal node is the last node
generated at depth d.
- The total number of nodes generated is then 1 + b + b^2 + ... + b^d.
- Time complexity = O(b^d)
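The node counts above can be checked numerically with a small sketch (the values of b and d here are arbitrary examples):

```python
# Total nodes in a complete search tree of branching factor b and depth d:
# 1 + b + b^2 + ... + b^d
def total_nodes(b, d):
    return sum(b**i for i in range(d + 1))

print(total_nodes(2, 3))    # 1 + 2 + 4 + 8 = 15
print(total_nodes(10, 5))   # 111111 -- the b^d term dominates, hence O(b^d)
```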
Space complexity
The space complexity of BFS matches its time complexity, since the search must keep
every generated node in memory until it finishes searching the whole tree. Hence the
space complexity is also O(b^d).
Depth First Search
DFS proceeds down a single branch of the tree at a time. It expands the root node,
then the leftmost child of the root node, then the leftmost child of that node etc.
It always expands a node at the deepest level of the tree. Only when the search hits
a dead end (a partial solution which can’t be extended) does the search backtrack
and expand nodes at higher levels.
Process: Use a stack to keep track of nodes.
Put the start node on the stack.
While the stack is not empty:
- Pop a node from the stack.
- If the popped node is the goal, stop.
- Otherwise, push the nodes connected to the popped node onto the
stack (provided they are not already on the stack).
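A minimal Python sketch of this stack-based process, using a hypothetical adjacency map as before:

```python
def dfs(graph, start, goal):
    """Depth-first search using an explicit stack (LIFO).
    Returns a path from start to goal, or None if no path exists."""
    stack = [[start]]
    visited = set()
    while stack:
        path = stack.pop()        # pop the deepest partial path
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        # Push children in reverse so the leftmost child is expanded first
        for child in reversed(graph.get(node, [])):
            if child not in visited:
                stack.append(path + [child])
    return None

# Hypothetical search tree
graph = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F']}
print(dfs(graph, 'A', 'F'))   # ['A', 'C', 'F']
```

Note how the search runs all the way down the leftmost branch (A, B, D, E) and backtracks before it ever reaches C.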
Properties
Completeness: Incomplete, as it may get stuck going down an infinite branch
that does not lead to a solution.
Optimality: The first solution found by DFS may not be the shortest, so it is not
optimal.
Space complexity: With branching factor b and tree depth d, only a single path
together with the unexpanded siblings along it must be stored, so the space
complexity = O(b * d).
Time complexity: O(b^d), since in the worst case every node in the tree is visited.
Informed Search
These search strategies have problem-specific knowledge beyond the problem
definition, so they can find solutions more efficiently than uninformed search
strategies by using heuristics. A heuristic is a technique that improves the
efficiency of a search process. It guides the search in the most profitable
direction by suggesting which path to follow first when more than one is available.
Heuristic search can be categorized as follows:
1. Best First Search
In this type of search, a node is selected for expansion based on an evaluation
function f(n). The node with the lowest evaluation function is expanded first. The
measure of heuristic, i.e., the evaluation must incorporate some estimate of the
cost of the path from a state to the closest goal state.
Heuristic algorithms may use different evaluation functions. One important
component is the heuristic function h(n), where h(n) is the estimated cost of the
cheapest path from node n to a goal node.
The search can be further divided into two major types, as follows:
i. Greedy Best First Search
In this strategy, the node whose state is judged to be the closest to the goal state
is expanded first. It evaluates the nodes by using just the heuristic function. Hence,
in this case
the evaluation function is f(n) = h(n), where h(n) = 0 if n is the goal.
One example of the heuristic function may be the straight line distance to the goal
in route finding problem.
Since this strategy prefers to take the biggest possible bite out of the remaining
cost to reach the goal, without worrying about whether this will be the best in the
long run, it is called "greedy search".
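The strategy can be sketched with a priority queue ordered by h(n) alone; the graph and the heuristic values below are hypothetical stand-ins for straight-line distances to the goal:

```python
import heapq

def greedy_best_first(graph, h, start, goal):
    """Greedy best-first search: always expand the node with the
    smallest heuristic value h(n); the path cost so far is ignored."""
    frontier = [(h[start], [start])]   # priority queue ordered by h(n)
    visited = set()
    while frontier:
        _, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for child in graph.get(node, []):
            if child not in visited:
                heapq.heappush(frontier, (h[child], path + [child]))
    return None

# Hypothetical map; h approximates the straight-line distance to goal G
graph = {'S': ['A', 'B'], 'A': ['G'], 'B': ['G']}
h = {'S': 5, 'A': 2, 'B': 4, 'G': 0}
print(greedy_best_first(graph, h, 'S', 'G'))   # ['S', 'A', 'G']
```

Because only h(n) is consulted, the search commits to whichever neighbor looks closest to the goal, which is exactly why it can return a longer route than necessary.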
Properties:
Completeness: This algorithm can start down an infinite path and never return to
try other possibilities. Hence it is not complete.
Optimality: This algorithm takes the immediately best choice and does not make a
careful analysis of the long-term options. Hence it may return a longer solution
even when a shorter one exists. This algorithm is not optimal.
Time complexity: The time complexity of this algorithm is O(b^m), where m is the
maximum depth of the search space.
Space complexity: The space complexity is also O(b^m), since all the generated
nodes have to be kept in memory.
A* search
This algorithm is a best-first search algorithm which evaluates nodes by combining
g(n), the cost to reach the node, and h(n), the estimated cost to get from the node
to the goal. Thus the evaluation function is f(n) = g(n) + h(n).
Here f(n) is the estimated cost of the cheapest solution through n.
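A minimal sketch of A* over a weighted graph; the edge costs and heuristic values below are hypothetical (and the heuristic is chosen to be admissible, i.e., it never overestimates):

```python
import heapq

def a_star(graph, h, start, goal):
    """A* search: expand the node with the smallest f(n) = g(n) + h(n),
    where g(n) is the cost from start and h(n) estimates the cost to goal."""
    frontier = [(h[start], 0, [start])]   # entries are (f, g, path)
    best_g = {start: 0}                   # cheapest known cost to each node
    while frontier:
        f, g, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            return path, g
        for child, step in graph.get(node, []):
            g2 = g + step
            if child not in best_g or g2 < best_g[child]:
                best_g[child] = g2
                heapq.heappush(frontier, (g2 + h[child], g2, path + [child]))
    return None, float('inf')

# Hypothetical weighted graph: edges given as (neighbour, step cost)
graph = {'S': [('A', 1), ('B', 4)], 'A': [('B', 2), ('G', 12)], 'B': [('G', 3)]}
h = {'S': 7, 'A': 6, 'B': 2, 'G': 0}
path, cost = a_star(graph, h, 'S', 'G')
print(path, cost)   # ['S', 'A', 'B', 'G'] 6
```

Unlike greedy best-first search, A* also accounts for g(n), so here it correctly prefers the cheap detour S-A-B-G (cost 6) over the direct-looking but expensive S-A-G (cost 13).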
Hill Climbing Algorithm
Hill climbing is a local search algorithm which continuously moves in the
direction of increasing value to find the peak of the mountain, that is, the best
solution to the problem. It terminates when it reaches a peak where no
neighbor has a higher value. Hill climbing is a technique used for
optimizing mathematical problems. One of the widely discussed examples of
the hill climbing algorithm is the traveling salesman problem, in which we need to
minimize the distance traveled by the salesman. It is also called greedy local
search, as it only looks at its immediate neighbor states and not beyond them. A
node of the hill climbing algorithm has two components: state and value. Hill
climbing is mostly used when a good heuristic is available. In this algorithm, we do
not need to maintain a search tree or graph, as it keeps only a single current state.
Features of hill climbing algorithm:
Following are some main features of hill climbing algorithm:
Generate and test variant: Hill climbing is a variant of the generate-and-test
method. The generate-and-test method produces feedback which helps decide
which direction to move in the search space.
Greedy approach: Hill climbing algorithm search moves in the direction which
optimizes the cost.
No backtracking: It does not backtrack the search space, as it does not
remember the previous states.
State-space landscape for hill climbing:
The state-space landscape is a graphical representation of the hill climbing
algorithm, showing the relationship between the states of the algorithm and the
objective function/cost.
On the y-axis we plot the function, which can be an objective function or a cost
function, and on the x-axis the state space. If the function on the y-axis is a cost
function, the goal of the search is to find the global minimum. If the function on
the y-axis is an objective function, the goal of the search is to find the global
maximum.
Local maximum: A local maximum is a state which is better than its neighbor
states, but there is another state in the landscape which is higher still.
Global maximum: The global maximum is the best possible state of the state-space
landscape. It has the highest value of the objective function.
Current state: It is the state in the landscape diagram where the agent is currently
present.
Flat local maximum: It is a flat region of the landscape where all the neighbor
states of the current state have the same value.
Shoulder: It is a plateau region which has an uphill edge.
Simple hill climbing is the simplest way to implement a hill climbing algorithm. It
evaluates one neighbor node state at a time and selects the first one that
improves the current cost, setting it as the current state. It checks only one
successor state at a time: if that successor is better than the current state, it
moves there; otherwise it stays in the same state. This algorithm has the following
features:
- Less time consuming
- Less optimal solution and the solution is not guaranteed
Algorithm for hill climbing:
Step 1: Evaluate the initial state; if it is the goal state, return success and stop.
Step 2: Loop until a solution is found or there is no new operator left to apply.
Step 3: Select and apply an operator to the current state.
Step 4: Check the new state:
- If it is the goal state, return success and quit.
- Else, if it is better than the current state, assign the new state as the current
state.
- Else, if it is not better than the current state, return to step 2.
Step 5: Exit
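The loop above can be sketched compactly; the objective function below is a hypothetical toy with a single peak at x = 3, and neighbors are just the two adjacent integers:

```python
def hill_climb(value, neighbours, state):
    """Hill climbing: repeatedly move to the best neighbour;
    stop when no neighbour improves on the current state (a peak)."""
    while True:
        candidates = neighbours(state)
        best = max(candidates, key=value, default=state)
        if value(best) <= value(state):   # no uphill move left: stop at the peak
            return state
        state = best                      # assign the better state as current

# Hypothetical one-dimensional objective, peaked at x = 3
value = lambda x: -(x - 3) ** 2
neighbours = lambda x: [x - 1, x + 1]
print(hill_climb(value, neighbours, 0))   # 3
```

Only the single current state is kept, which is exactly why the algorithm cannot backtrack once it commits to a branch.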
Problems in hill climbing algorithm:
1. Local maximum:
A local maximum is a peak state in the landscape which is better than each of its
neighboring states, but another state exists elsewhere that is higher than the
local maximum.
Solution: Backtracking can be a solution to the local maximum problem in the state-
space landscape. Maintain a list of promising paths so that the algorithm can
backtrack and explore other paths as well.
2. Plateau: A plateau is a flat area of the search space in which all the neighbor
states of the current state have the same value; because of this, the algorithm
cannot find any best direction to move. A hill-climbing search might get lost in
the plateau area.
Solution: Take big steps (or very small steps) while searching, or randomly
select a state far away from the current state, so that the algorithm may
reach a non-plateau region.
3. Ridges: A ridge is a special form of local maximum. It is an area higher than
its surrounding areas, but it has a slope of its own and cannot be climbed
in a single move.
Solution: Using bidirectional search, or moving in several different
directions at once, can mitigate this problem.