Importance of Studying Algorithms

The study of algorithms is essential for IT professionals, both theoretically and practically, as it allows for the design of efficient solutions to various problems. Algorithms are defined procedures for solving problems, and their in-depth understanding helps develop analytical skills. The document presents several algorithms for calculating the greatest common divisor, illustrating the diversity of approaches and the importance of accuracy in defining inputs and outputs.

Chapter 1

Introduction

Why do we need to study algorithms? If you intend to become a computing professional,
there are both practical and theoretical reasons to study algorithms. From a practical
point of view, you need to know a standard set of important algorithms from different
areas of computing. In addition, you must be able to design new algorithms and analyze
their efficiency. From a theoretical point of view, the study of algorithms has been
recognized as the cornerstone of computer science. David Harel, in his magnificent book
aptly titled Algorithmics: the Spirit of Computing, declares:
Algorithmics is more than a branch of computer science. It is the core of computer
science, and, in all fairness, can be said to be relevant to most of science, business,
and technology.
Even if you are not a student in a computing-related program, there are valuable reasons
to study algorithms. Put simply, computer programs would not exist without algorithms.
With computer applications becoming essential in all aspects of our professional and
personal lives, studying algorithms is becoming a necessity for more and more people.
Another reason to study algorithms is their usefulness in developing analytical skills.
After all, algorithms can be seen as special kinds of solutions to problems: not answers
but precisely defined procedures for obtaining answers. Consequently, specific algorithm
design techniques can be interpreted as problem-solving strategies that can be useful
regardless of whether a computer is involved. Of course, the precision inherently imposed
by algorithmic thinking limits the kinds of problems that can be solved with an
algorithm. For example, you will not find an algorithm for living a happy life or for
becoming rich and famous. On the other hand, this required precision has an important
educational advantage. Donald Knuth, one of the most prominent computer scientists in
the history of algorithmics, put it as follows:
A person well-trained in computer science knows how to deal with algorithms: how to
construct them, manipulate them, analyse them. This knowledge is preparation for much
more than writing good computer programs; it is a general-purpose mental tool that will be
a definite aid to the understanding of other subjects, whether they be chemistry, linguistics,
or music, etc. The reason for this may be understood in the following way: it has often been
said that a person does not really understand something until after teaching it to
someone else. Actually, a person does not really understand something until after teaching it
to a computer, i.e., expressing it as an algorithm... An attempt to formalize things as
algorithms leads to a much deeper understanding than if we simply try to comprehend
things in the traditional way. [Knu96].
We consider the concept of an algorithm in Section 1.1. As examples, we use three
algorithms for the same problem: computing the greatest common divisor of two numbers.
There are several reasons for this choice. First, it deals with a problem familiar to
everyone since elementary school. Second, it makes the important point that the same
problem can often be solved by several algorithms. Typically, these algorithms differ in
their idea, level of sophistication, and efficiency. Third, one of these algorithms
deserves to be presented first, because of its age, its power, and its lasting
importance. Finally, the middle-school procedure for computing the greatest common
divisor allows us to highlight a critical requirement that every algorithm must meet.
Section 1.2 deals with algorithmic problem solving. There we discuss several important
issues related to the design and analysis of algorithms. The different aspects of
algorithmic problem solving range from analyzing the problem and choosing the means of
describing an algorithm to establishing its correctness and analyzing its efficiency.
The section does not contain a magic recipe for designing an algorithm for an arbitrary
problem. It is a well-established fact that such a recipe does not exist. Still, the
material of Section 1.2 should be useful for organizing your work on algorithm design
and analysis.
Section 1.3 is devoted to a few types of problems that have proven particularly
important for the study of algorithms and their applications. Indeed, there are
textbooks organized around such problem types. My opinion, shared by many others, is
that an organization based on algorithm design techniques is superior. In any case, it
is very important to know the main problem types. Not only are they the types most
commonly encountered in real applications, but they are also used throughout this course
to demonstrate particular algorithm design techniques.
Section 1.4 contains a review of fundamental data structures. It is meant to serve as a
reference rather than as a deliberate discussion of the subject. If you need a more
detailed presentation, there is a wide variety of good books on the subject, most of
them tailored to a particular programming language.

1.1 What is an algorithm?


Although there is no universally accepted definition to describe this notion, there is a
general consensus regarding the meaning of this concept.
An algorithm is a sequence of unambiguous instructions to solve a problem, namely
to obtain a required result for any permitted input in finite time.
This definition can be illustrated by a simple diagram (figure 1.1).

Problem -> Algorithm
Input -> "Computer" (executing the algorithm) -> Output

Figure 1.1 - Notion of algorithm.

The reference to instructions in the definition implies that there is something or
someone capable of understanding and following the given instructions. We call this a
"computer", keeping in mind that before the invention of the electronic computer, the
word "computer" designated a person performing numerical calculations. Today, of course,
computers are electronic devices that have become essential in almost everything we do.
Note, however, that even though the majority of algorithms are indeed designed for
eventual computer implementation, the notion of an algorithm does not depend on such an
assumption.
As examples to illustrate the notion of an algorithm, we consider in this section three
different methods for solving the same problem: computing the greatest common divisor of
two nonnegative integers. These examples will help us illustrate several important points:
The unambiguity required of each step of an algorithm can never be compromised.
The range of inputs for which an algorithm works must be specified carefully.
The same algorithm can be represented in different ways.
Several algorithms for solving the same problem may exist.
Algorithms for the same problem can be based on very different ideas and can solve the problem with radically different speeds.
Recall that the greatest common divisor of two nonnegative integers m and n, not both
zero, denoted gcd(m, n), is defined as the largest integer that divides both m and n
with a remainder of zero. Euclid of Alexandria outlined an algorithm for solving this
problem in one of the volumes of his Elements, best known for its systematic
presentation of geometry. In modern terms, Euclid's algorithm is based on repeatedly
applying the equality

gcd(m, n) = gcd(n, m mod n)

(where m mod n is the remainder of the division of m by n) until m mod n is equal to
zero. Since gcd(m, 0) = m (why?), the last value of m is also the greatest common
divisor of the initial m and n. For example, gcd(60, 24) = gcd(24, 12) = gcd(12, 0) = 12.
Here is a more structured description of this algorithm.

Euclid's algorithm for computing gcd(m, n)


Step 1. If n = 0, return the value of m as the answer and stop; otherwise, proceed
to Step 2.
Step 2. Divide m by n and assign the value of the remainder to r.
Step 3. Assign the value of n to m and the value of r to n. Go to Step 1.

Alternatively, we can express the same algorithm in pseudocode.

ALGORITHM Euclid(m, n)
    // Computes gcd(m, n) by Euclid's algorithm
    // Input: Two nonnegative integers m and n, not both zero
    // Output: The greatest common divisor of m and n
    while n ≠ 0 do
        r ← m mod n
        m ← n
        n ← r
    return m

How do we know that Euclid's algorithm eventually stops? This follows from the
observation that the second number of the pair gets smaller with each iteration and
cannot become negative. Indeed, the new value of n on the next iteration is m mod n,
which is always smaller than n. Hence, the value of the second number in the pair
eventually becomes zero, and the algorithm stops.
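
The termination argument can be watched in action with a small Python sketch. The function names `euclid_gcd` and `second_components` are illustrative choices, not part of the course text; the second function exists only to record the values taken by n.

```python
def euclid_gcd(m: int, n: int) -> int:
    """Return gcd(m, n) for nonnegative integers m and n, not both zero."""
    while n != 0:
        m, n = n, m % n  # same as: r = m mod n; m = n; n = r
    return m


def second_components(m: int, n: int) -> list[int]:
    """Record the value of n at the start of every iteration (illustration only)."""
    values = []
    while n != 0:
        values.append(n)
        m, n = n, m % n
    values.append(n)  # the final 0 that stops the loop
    return values


print(euclid_gcd(60, 24))         # 12
print(second_components(60, 24))  # [24, 12, 0]
```

The recorded values decrease strictly and never drop below zero, which is exactly why the loop cannot run forever.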

As with many other problems, there are several algorithms for computing the greatest
common divisor. Let us look at two other methods for this problem. The first is simply
based on the definition of the greatest common divisor of m and n as the largest integer
that divides both numbers. Clearly, such a common divisor cannot be greater than the
smaller of these numbers, which we denote by p = min{m, n}. So we start by checking
whether p divides both m and n: if it does, p is the answer; if it does not, we
decrease p by 1 and try again.

Consecutive integer checking algorithm for computing gcd(m, n)


Step 1. Assign the value of min{m, n} to p.
Step 2. Divide m by p. If the remainder of this division is zero, go to Step 3;
otherwise, go to Step 4.
Step 3. Divide n by p. If the remainder of this division is zero, return the value of p
as the answer and stop; otherwise, proceed to Step 4.
Step 4. Decrease the value of p by 1. Go to Step 2.

Note that, unlike Euclid's algorithm, this algorithm, in the form presented here, does
not work correctly when one of its two input numbers is zero. This example illustrates
why it is so important to specify the set of an algorithm's permitted inputs explicitly
and carefully.
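
A direct Python transcription of the four steps shows both the normal behavior and the failure on a zero input. This is only a sketch: the name `consecutive_integer_gcd` is illustrative, and the function deliberately performs no input checking so that the problem is visible.

```python
def consecutive_integer_gcd(m: int, n: int) -> int:
    """Return gcd(m, n); assumes both m and n are positive integers."""
    p = min(m, n)                            # Step 1
    while not (m % p == 0 and n % p == 0):   # Steps 2 and 3
        p -= 1                               # Step 4
    return p


print(consecutive_integer_gcd(60, 24))  # 12

try:
    consecutive_integer_gcd(0, 5)  # outside the permitted inputs
except ZeroDivisionError:
    print("a zero input is not handled")
```

With a zero input, Step 1 sets p to 0 and Step 2 immediately attempts a division by zero.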
The third procedure for computing the greatest common divisor should be familiar to
you from middle school.

Middle-school procedure for computing gcd(m, n)


Step 1. Find the prime factorization of m.
Step 2. Find the prime factorization of n.
Step 3. Identify all the common factors in the two prime factorizations found in
Step 1 and Step 2. (If p is a common factor occurring pm and pn times in m and n,
respectively, it should be repeated min{pm, pn} times.)
Step 4. Compute the product of all the common factors and return it as the greatest
common divisor of the given numbers.
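
The procedure can be sketched in Python as follows. Note the assumption: the steps above do not say how the prime factorizations are obtained, so the helper `prime_factors` below uses plain trial division as one possible choice. Both function names are illustrative.

```python
from collections import Counter


def prime_factors(k: int) -> Counter:
    """Count the prime factors of k >= 1 by trial division (an assumed method)."""
    factors = Counter()
    d = 2
    while d * d <= k:
        while k % d == 0:
            factors[d] += 1
            k //= d
        d += 1
    if k > 1:
        factors[k] += 1  # whatever is left is itself prime
    return factors


def middle_school_gcd(m: int, n: int) -> int:
    """Return gcd(m, n) for positive integers via common prime factors."""
    common = prime_factors(m) & prime_factors(n)  # keeps min{pm, pn} copies of each p
    product = 1
    for p, count in common.items():
        product *= p ** count
    return product


print(prime_factors(60))          # Counter({2: 2, 3: 1, 5: 1})
print(middle_school_gcd(60, 24))  # 12
```

For 60 = 2 * 2 * 3 * 5 and 24 = 2 * 2 * 2 * 3, the common factors are 2, 2, and 3, whose product is 12.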

Steps 1 and 2 of this procedure rely on knowing the prime numbers. Let us therefore
present a simple algorithm, the sieve of Eratosthenes, for generating the consecutive
prime numbers not exceeding a given integer n. The algorithm begins by initializing a
list of prime candidates with the consecutive integers from 2 to n. Then, on the first
iteration, it removes all multiples of 2 from the list. It then moves on to the next
item on the list, which is 3, and eliminates all its multiples. The next number
remaining on the list, which is used on the third iteration, is 5. As with 2 and 3, all
multiples of 5 are removed from the list. The algorithm continues in this fashion until
no more numbers can be eliminated from the list. The numbers remaining on the list are
the primes sought.
As an example, consider applying this algorithm to find the list of primes less than or
equal to n = 24.

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
2 3 5 7 9 11 13 15 17 19 21 23
2 3 5 7 11 13 17 19 23
2 3 5 7 11 13 17 19 23

For this example, no more passes are needed after the multiples of 5 have been removed:
any further pass would only try to eliminate numbers already eliminated on previous
iterations. The numbers remaining on the list are the consecutive primes less than or
equal to 24.
In general, what is the largest value of p whose multiples can still remain on the
list? Before answering this question, let us first note that if p is a number whose
multiples are being eliminated on the current pass, then the first multiple to consider
is p², because all its smaller multiples have already been eliminated on earlier passes.
This observation helps to avoid eliminating the same number more than once. Clearly, p²
should not be greater than n, and consequently p cannot exceed the integer part of √n,
denoted ⌊√n⌋. We assume in the following pseudocode that a function is available for
computing ⌊√n⌋; alternatively, we could check the inequality p * p ≤ n as the condition
for continuing the loop.

ALGORITHM Sieve(n)
    // Implements the sieve of Eratosthenes
    // Input: An integer n ≥ 2
    // Output: Array L of all prime numbers less than or equal to n
    for p ← 2 to n do A[p] ← p
    for p ← 2 to ⌊√n⌋ do
        if A[p] ≠ 0    // p has not been eliminated on previous passes
            j ← p * p
            while j ≤ n do
                A[j] ← 0    // mark element as eliminated
                j ← j + p
    // copy the remaining elements of A to array L of the primes
    i ← 0
    for p ← 2 to n do
        if A[p] ≠ 0
            L[i] ← A[p]
            i ← i + 1
    return L
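
For readers who want to run the sieve, here is a minimal Python sketch of the same idea. It differs from the pseudocode in one detail that is my own choice: instead of storing the numbers themselves and overwriting them with 0, it keeps a list of boolean flags. The name `sieve` is illustrative; `math.isqrt` plays the role of the function computing ⌊√n⌋.

```python
import math


def sieve(n: int) -> list[int]:
    """Return all primes less than or equal to n, for an integer n >= 2."""
    is_candidate = [True] * (n + 1)      # index p stands for the number p
    is_candidate[0] = is_candidate[1] = False
    for p in range(2, math.isqrt(n) + 1):
        if is_candidate[p]:              # p survived all previous passes
            for j in range(p * p, n + 1, p):
                is_candidate[j] = False  # mark the multiple as eliminated
    return [p for p in range(2, n + 1) if is_candidate[p]]


print(sieve(24))  # [2, 3, 5, 7, 11, 13, 17, 19, 23]
```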
We can now incorporate the sieve of Eratosthenes into the middle-school procedure to
obtain a legitimate algorithm for computing the greatest common divisor of two positive
integers.
Exercises 1.1
Problem 1. Do some research on al-Khorezmi (also al-Khwarizmi), the man from whose name
the word "algorithm" is derived. In particular, you should learn what the origins of
the words "algorithm" and "algebra" have in common.
Problem 2. Design an algorithm for computing ⌊√n⌋ for any positive integer n. Besides
assignments and comparisons, your algorithm may only use the four basic arithmetic
operations.
Problem 3. Prove the equality gcd(m, n) = gcd(n, m mod n) for every pair of positive
integers m and n.
Problem 4. What does Euclid's algorithm do for a pair of numbers in which the first
number is smaller than the second? What is the maximum number of times this can happen
during the algorithm's execution on such an input?

Problem 5. a) What is the smallest number of divisions made by Euclid's algorithm
among all inputs 1 ≤ m, n ≤ 10?
b) What is the largest number of divisions made by Euclid's algorithm among all
inputs 1 ≤ m, n ≤ 10?

1.2 Basic principles of algorithmic problem solving

Let us start by reiterating an important point made in the introduction to this chapter:
we can consider algorithms to be procedural solutions to problems.
These solutions are not answers but specific instructions for getting answers.
It is this emphasis on precisely defined constructive procedures that distinguishes
computer science from other disciplines. In particular, it distinguishes computer
science from theoretical mathematics, whose practitioners are typically satisfied with
just proving the existence of a solution to a problem and, possibly, studying the
solution's properties.
We now list and briefly discuss a sequence of steps that can be followed in designing
and analyzing an algorithm (Figure 1.2).

Understand the problem
  -> Decide on: computational means, exact vs. approximate solution, data structures, design technique
  -> Design an algorithm
  -> Prove the correctness
  -> Analyze the algorithm
  -> Code the algorithm

FIGURE 1.2 - Algorithm Design and Analysis Process

1.2.1 Understanding the problem

From a practical perspective, the first thing you need to do before designing an
algorithm is to understand the given problem completely. Read the problem's description
carefully and ask questions if you have any doubts about the problem, work through a
few small examples by hand, think about special cases, and ask questions again if
needed.

There are a few types of problems that arise quite often in computing applications. We
review them in the next section. If the problem you want to solve is one of them, you
may be able to use a known algorithm to solve it. Of course, it helps to understand how
such an algorithm works and to know its strengths and weaknesses, especially if you
have to choose among several available algorithms. But often you will not find a
readily available algorithm for your problem. You will then have to design your own,
relying where possible on existing algorithms and on the many algorithm design
techniques that we study in the rest of this course. The sequence of steps outlined in
this section should help you in this exciting but not always easy task.
An input to an algorithm specifies an instance of the problem the algorithm solves. It
is very important to specify exactly the range of instances the algorithm needs to
handle. If you fail to do this, your algorithm may work correctly for a majority of
inputs but fail on some boundary value. Remember that a correct algorithm is not one
that works most of the time, but one that works correctly for all legitimate inputs.
Do not skimp on this first step of the algorithmic problem-solving process; if you do,
you run the risk of unnecessary rework.
1.2.2 Check the capabilities of the computer resources

Once you have fully understood the problem, you need to ascertain the capabilities of
the computational device the algorithm is intended for. The vast majority of algorithms
in use today are still destined to be programmed for a computer closely resembling the
von Neumann machine, a computer architecture outlined by the prominent
Hungarian-American mathematician John von Neumann. The essence of this architecture is
captured by the so-called random-access machine (RAM). Its central assumption is that
instructions are executed one after another, one operation at a time. Accordingly,
algorithms designed to be executed on such machines are called sequential algorithms.
The central assumption of the RAM model does not hold for newer computers that can
execute operations concurrently, i.e., in parallel. Algorithms that take advantage of
this capability are called parallel algorithms. Still, studying the techniques for the
design and analysis of algorithms under the RAM model will remain the cornerstone of
algorithmics for a long time.
Should you worry about the speed and amount of memory of the computer at your disposal?
If you are designing an algorithm as a scientific exercise, the answer is an
unqualified no. As you will see in Section 2.1, most computer scientists prefer to
study algorithms in terms independent of the specification parameters of a particular
computer. If you are designing an algorithm as a practical tool, the answer may depend
on the problem you need to solve. Even the computers we consider slow today are almost
unimaginably fast. Consequently, in most situations you need not worry about a computer
being too slow for the task. There are important problems, however, that are very
complex by their nature, or have to process huge volumes of data, or deal with
applications where time is critical. In such situations, it is imperative to be aware
of the speed and memory available on a particular computer system.
1.2.3 Choose between an approximate or exact solution
The next decision is to choose between solving the problem exactly and solving it
approximately. In the former case, the algorithm is called an exact algorithm; in the
latter case, it is called an approximation algorithm. Why would one opt for an
approximation algorithm? First, there are important problems that cannot be solved
exactly for most of their instances; examples include extracting square roots, solving
nonlinear equations, and evaluating definite integrals.
Second, the algorithms available for solving certain problems exactly can be
unacceptably slow because of the problem's intrinsic complexity. This happens, in
particular, for many problems involving a very large number of choices; you will find
examples of such difficult problems in Chapters 3 and 8. Third, an approximation
algorithm can be part of a more sophisticated algorithm that solves a problem exactly.
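
As an illustration of an approximation algorithm for one of the problems just listed, here is a minimal Python sketch that approximates a square root. The choice of Newton's iteration, the name `approx_sqrt`, and the `tolerance` parameter are assumptions made for this example; the course text does not prescribe a method.

```python
def approx_sqrt(x: float, tolerance: float = 1e-10) -> float:
    """Approximate the square root of x >= 0 by Newton's iteration."""
    if x < 0:
        raise ValueError("x must be nonnegative")
    if x == 0:
        return 0.0
    guess = x if x >= 1 else 1.0
    # Stop once guess * guess is within a relative tolerance of x.
    while abs(guess * guess - x) > tolerance * x:
        guess = (guess + x / guess) / 2
    return guess


print(approx_sqrt(2.0))
```

The result is not the exact square root of 2, which has no finite decimal expansion, but the stopping condition puts an explicit bound on how far its square can be from 2.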

1.2.4 Choose the appropriate data structures

Some algorithms do not demand any ingenuity in representing their inputs. Others,
however, are predicated on ingenious data structures. In addition, some of the
algorithm design techniques that we study later depend intimately on structuring or
restructuring the data specifying the problem's instance. Many years ago, an
influential book proclaimed the fundamental importance of both algorithms and data
structures for computer programming by its very title [Wir76]:
Algorithms + Data Structures = Programs
In the new world of object-oriented programming, data structures remain crucially
important for both the design and analysis of algorithms. We review the basic data
structures in Section 1.4.

1.2.5 Algorithm Design Techniques

Now that all the components of algorithmic problem solving are in place, how do you
design an algorithm to solve a given problem? This is the main question this course
seeks to answer by teaching you several general design techniques. What is a design
technique?

An algorithm design technique (or "strategy" or "paradigm") is a general approach to
solving problems algorithmically that is applicable to a variety of problems from
different areas of computing.

Check this course's table of contents and you will see that the majority of its
chapters are devoted to individual design techniques. They distill a few key ideas that
have proven useful in designing algorithms. Learning these techniques is of utmost
importance for the following reasons.

First, they provide guidance for designing algorithms for new problems, that is,
problems for which there is no known satisfactory algorithm. Therefore, to use the
language of a famous proverb, learning such techniques is akin to learning to fish as
opposed to being given a fish caught by somebody else. It is not true, of course, that
each of these general techniques will necessarily be applicable to every problem you
may encounter. But taken together, they form a powerful collection of tools that you
will surely find useful in your studies and your work.

Second, algorithms are the cornerstone of computer science. Every science is interested
in classifying its principal subject, and computer science is no exception. Algorithm
design techniques make it possible to classify algorithms according to an underlying
design idea; therefore, they can serve as a natural way to both categorize and study
algorithms.

1.2.6 Methods of Describing Algorithms

Once you have designed an algorithm, you need to specify it in some fashion.
In Section 1.1, to give you an example, we described Euclid's algorithm in words (in
free form and also in a step-by-step form) and in pseudocode. These are the two options
most widely used for specifying algorithms.
Using a natural language has an obvious appeal; however, the inherent ambiguity of any
natural language makes a succinct and clear description of algorithms surprisingly
difficult. Nevertheless, being able to do this is an important skill that you should
strive to develop in the process of learning algorithms.
Pseudocode is a mixture of a natural language and programming-language-like constructs.
Pseudocode is usually more precise than a natural language, and its use often yields
more succinct algorithm descriptions. Surprisingly, computer scientists have never
agreed on a single form of pseudocode, leaving textbook authors free to design their
own "dialects". Fortunately, these dialects are so close to each other that anyone
familiar with a modern programming language should be able to understand them all.
The dialect adopted in this course was chosen to cause as little difficulty as possible
for the reader. For the sake of simplicity, we omit variable declarations and use
indentation to show the scope of such statements as for, if, and while. We use an
arrow "←" for the assignment operation and two slashes "//" for comments.
In the early days of computing, the dominant vehicle for specifying algorithms was the
flowchart, a method of expressing an algorithm by a collection of connected geometric
shapes containing descriptions of the algorithm's steps. This representation technique
has proved to be inconvenient for all but very simple algorithms; nowadays, it can be
found only in old algorithm books.
The state of the art in computing has not yet reached a point where an algorithm's
description, be it in a natural language or in pseudocode, can be fed into a computer
directly. Instead, the description needs to be converted into a computer program
written in a particular programming language. We can look at such a program as yet
another way of specifying the algorithm, although it is preferable to consider it as
the algorithm's implementation.

1.2.7 Verification of the correctness of an algorithm

Once an algorithm has been specified, you have to prove its correctness. That is, you
have to prove that the algorithm yields the required result for every legitimate input
in a finite amount of time. For example, the correctness of Euclid's algorithm for
computing the greatest common divisor of two integers stems from the correctness of the
equality gcd(m, n) = gcd(n, m mod n) (which, in turn, needs a proof; see Problem 3 in
Exercises 1.1), the simple observation that the second number gets smaller on every
iteration of the algorithm, and the fact that the algorithm stops when the second
number becomes zero.
For some algorithms, a proof of correctness is quite easy; for others, it can be quite
complex. A common technique for proving correctness is to use mathematical induction,
because an algorithm's iterations provide a natural sequence of steps needed for such
proofs. It is worth noting that although tracing the algorithm's performance for a few
specific inputs can be a very worthwhile activity, it cannot prove the algorithm's
correctness conclusively.

But in order to show that an algorithm is incorrect, you need just one instance of its
input for which the algorithm fails. If the algorithm is found to be incorrect, you
need either to redesign it under the same decisions regarding the data structures and
the design technique or, in the extreme case, to reconsider one or more of those
decisions.
The notion of correctness for approximation algorithms is less straightforward than it
is for exact algorithms. For an approximation algorithm, we usually would like to be
able to show that the error produced by the algorithm does not exceed a predefined limit.
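
The asymmetry between proving and disproving correctness can be illustrated with a small Python sketch. It searches a finite range of inputs for a pair on which a candidate gcd function disagrees with a trusted reference, here the standard library's `math.gcd`. The helper `find_counterexample` and the two candidate functions are illustrative names; the convention that a division by zero counts as a failure is an assumption of this example.

```python
import math


def euclid_gcd(m, n):
    while n != 0:
        m, n = n, m % n
    return m


def consecutive_integer_gcd(m, n):
    p = min(m, n)
    while not (m % p == 0 and n % p == 0):
        p -= 1
    return p


def find_counterexample(candidate, limit):
    """Return the first pair (m, n) on which candidate fails, or None."""
    for m in range(limit + 1):
        for n in range(limit + 1):
            if m == 0 and n == 0:
                continue  # gcd(0, 0) is outside the permitted inputs
            try:
                ok = candidate(m, n) == math.gcd(m, n)
            except ZeroDivisionError:
                ok = False
            if not ok:
                return (m, n)
    return None


print(find_counterexample(euclid_gcd, 30))               # None
print(find_counterexample(consecutive_integer_gcd, 30))  # (0, 1)
```

A single failing pair settles that consecutive integer checking is incorrect when zero inputs are allowed. Finding no failure for Euclid's algorithm on this range is only evidence, not a proof.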

1.2.8 Analysis of algorithms

We usually want our algorithms to possess several qualities. After correctness, by far
the most important is efficiency. In fact, there are two kinds of algorithm efficiency:
time efficiency and memory efficiency. Time efficiency indicates how fast the algorithm
runs. Memory efficiency indicates how much memory the algorithm needs. A general
framework and specific techniques for analyzing an algorithm's efficiency are given in
Chapter 2.
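
As a rough preview of time efficiency, the following Python sketch counts the basic steps performed by two of the gcd algorithms of Section 1.1 on the same input. The function names and the choice of what to count (divisions for Euclid's algorithm, candidate values of p for consecutive integer checking) are assumptions made for this illustration, not the framework of Chapter 2.

```python
def euclid_division_count(m: int, n: int) -> int:
    """Number of divisions Euclid's algorithm performs on (m, n)."""
    count = 0
    while n != 0:
        m, n = n, m % n
        count += 1
    return count


def consecutive_candidates_tried(m: int, n: int) -> int:
    """Number of values of p examined by consecutive integer checking."""
    p = min(m, n)
    tried = 1
    while not (m % p == 0 and n % p == 0):
        p -= 1
        tried += 1
    return tried


print(euclid_division_count(31415, 14142))         # 10
print(consecutive_candidates_tried(31415, 14142))  # 14142
```

Both algorithms return the same answer, 1, for this pair, but one gets there in ten steps and the other in more than fourteen thousand.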
Another desirable characteristic is simplicity. Unlike efficiency, which can be
precisely defined and studied with mathematical rigor, simplicity, like beauty, is to a
considerable degree in the eye of the beholder. For example, most people would agree
that Euclid's algorithm is simpler than the middle-school procedure for computing the
greatest common divisor, but it is not clear whether Euclid's algorithm is simpler than
the consecutive integer checking algorithm. Still, simplicity is an important
characteristic of algorithms to strive for. Why? Because simpler algorithms are easier
to understand and easier to program; consequently, the resulting programs usually
contain fewer bugs. There is also the undeniable aesthetic appeal of simplicity.
Unfortunately, it is not always easy to know when a judicious compromise must be made.
Yet another desirable characteristic of an algorithm is generality. There are, in fact,
two issues here: the generality of the problem the algorithm solves and the range of
inputs it accepts. On the first issue, note that it is sometimes easier to design an
algorithm for a problem posed in more general terms. Consider, for example, the problem
of determining whether two integers are relatively prime. It is easier to design an
algorithm for the more general problem of computing the greatest common divisor of two
integers and then to solve the original problem by checking whether the gcd is equal to
1. There are situations, however, where designing a more general algorithm is
unnecessary or difficult or even impossible. For example, it is unnecessary to sort a
list of n numbers to find its median, which is its ⌈n/2⌉th smallest element. To give
another example, the standard formula for the roots of a quadratic equation cannot be
generalized to handle polynomials of arbitrary degree.
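
The relatively-prime example translates into a few lines of Python. The sketch below leans on the standard library's `math.gcd` for the more general problem; the function name `are_relatively_prime` is an illustrative choice.

```python
import math


def are_relatively_prime(a: int, b: int) -> bool:
    """True when the only common positive divisor of a and b is 1."""
    return math.gcd(a, b) == 1


print(are_relatively_prime(8, 15))   # True
print(are_relatively_prime(12, 18))  # False
```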
As to the range of inputs, your main concern should be designing an algorithm that can
handle a range of inputs that is natural for the problem at hand. For example,
excluding integers equal to 1 as possible inputs for a greatest common divisor
algorithm would be quite unnatural. On the other hand, although the standard formula
for the roots of a quadratic equation holds for complex coefficients, we would normally
not implement it at this level of generality unless this capability is explicitly
required.
If you are not satisfied with the algorithm's efficiency, simplicity, or generality,
you must go back and redesign the algorithm. In fact, even if your evaluation is
positive, it is still worth searching for other algorithmic solutions. Recall the three
different algorithms in the previous section for computing the greatest common divisor:
generally, you should not expect to get the best algorithm on the first try. At the
very least, you should try to refine the algorithm you already have. For example, we
made several improvements in our implementation of the sieve of Eratosthenes compared
with its initial outline in Section 1.1. You will do well to keep in mind the following
observation of Antoine de Saint-Exupéry, the French writer, pilot, and aircraft
designer: "A designer knows he has arrived at perfection not when there is no longer
anything to add, but when there is no longer anything to take away."
1.2.9 Coding of algorithms

Most algorithms are destined to be ultimately implemented as computer programs.
Programming an algorithm presents both a peril and an opportunity. The peril
lies in the possibility of making the transition from an algorithm to a program either incorrectly or
very inefficiently. Some computer scientists firmly believe that unless the correctness of a
program is proven with full mathematical rigor, the program cannot be considered
correct. They have developed special techniques for constructing such proofs, but the
power of these formal verification techniques has so far been limited to very small
programs. As a practical matter, the validity of programs is still established by
testing. Testing computer programs is more of an art than a science, but that
does not mean that there is nothing to learn about it.

Let us also note that throughout this course we assume that the inputs of the algorithms
belong to the specified sets and therefore require no verification. When
you implement algorithms as programs to be used in
real applications, you will need to provide such checks.
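As a sketch of what such checks might look like, the function below wraps Euclid's algorithm with input verification. The name checked_gcd and the chosen input set (nonnegative integers, not both zero) are assumptions made for this illustration.

```python
def checked_gcd(m, n):
    """Euclid's algorithm preceded by the input checks a real application needs."""
    for value in (m, n):
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError("inputs must be integers")
        if value < 0:
            raise ValueError("inputs must be nonnegative")
    if m == 0 and n == 0:
        raise ValueError("gcd(0, 0) is undefined")
    while n != 0:
        m, n = n, m % n
    return m


print(checked_gcd(60, 24))
```

The algorithm itself occupies three lines; everything before the loop exists only to reject inputs outside the assumed set.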
Exercises 1.2
Problem 1. Old World Puzzle. A shepherd is on one bank of a river with a wolf, a goat, and
a head of cabbage. He must take all three of them across to the other bank
by boat. The boat is too small to carry them all: on each crossing of the
river, he can take only one of the three. The wolf and the goat, or the goat and the
cabbage, cannot be left alone together on a bank. How should the shepherd get all
three across under these constraints? (Note: The shepherd is a vegetarian but does not
like cabbage, so he can eat neither the goat nor the cabbage to help him solve the
problem. And it goes without saying that the wolf is a protected species.)

Problem 2. New World Puzzle. Four people want to cross a bridge;
they all start on the same side. You have 17 minutes to get them all across to the other side
of the bridge. It is night, and they have one flashlight. A maximum of two people can
cross the bridge at a time. Any party that crosses, whether one or two people,
must have the flashlight with it. The flashlight must be walked back and forth; it
cannot be thrown, for example. Person 1 takes 1 minute to cross the bridge, person 2
takes 2 minutes, person 3 takes 5 minutes, and person 4 takes 10 minutes. A pair must
walk together at the pace of the slower person. For example, if person 1 and
person 4 cross first, 10 minutes will have elapsed when they reach the other side of
the bridge. If person 4 then returns the flashlight, a total of 20 minutes will have passed and
you will have failed the mission.
Problem 3. Which of the following formulas can be considered an algorithm
for computing the area of a triangle whose side lengths are given positive numbers a, b,
and c?
a) $S = \sqrt{p(p-a)(p-b)(p-c)}$, where $p = (a+b+c)/2$
b) $S = \frac{1}{2}\, bc \sin A$, where A is the angle between sides b and c

There are a few types of problems that arise quite often in computing
applications. We will review them in the next section. If the problem you
want to solve is one of them, you may be able to use a known algorithm
to solve it. Of course, it helps to understand how such an algorithm works and
to know its strengths and weaknesses, especially if you have to choose among several
available algorithms. But often you will not find a readily available algorithm for
your problem. In that case you will have to design your own, relying where possible on
existing algorithms and on the many algorithm design techniques that
we will study later in this course. The sequence of steps outlined in this section can help you
in this exciting but not always easy task.
An input to an algorithm specifies an instance of the problem that the algorithm solves. It is very
important to specify exactly the range of instances the algorithm must handle. If you
fail to do this, your algorithm may work correctly for the majority of inputs
but fail on some boundary values. Remember that a correct algorithm is not one
that works most of the time, but one that works correctly for all legitimate inputs.
Do not skimp on this first step of the algorithmic problem-solving
process; if you do, you run the risk of unnecessary rework.
1.2.2 Check the capabilities of the computer resources

Once you have fully understood the problem, you need to ascertain the
capabilities of the computing device your algorithm is intended for. The vast majority of algorithms
in use today are still destined to be programmed for a computer closely resembling the
von Neumann machine, a computer architecture outlined by the prominent
Hungarian-American mathematician John von Neumann. The essence of this architecture is captured by the
so-called random-access machine (RAM). Its central assumption is that instructions are
executed one after another, one operation at a time. Accordingly, algorithms designed to be
executed on such machines are called sequential algorithms.
The central assumption of the RAM model does not hold for newer computers that
can execute operations concurrently, i.e., in parallel. Algorithms that take advantage of
this capability are called parallel algorithms. Still, studying the techniques for the design and
analysis of algorithms under the RAM model will remain the cornerstone of
algorithmics for a long time.
Should you worry about the speed and memory capacity of the computer
at your disposal? If you are designing an algorithm as a scientific exercise, the answer
is a qualified no. As you will see in Section 2.1, most computer
scientists prefer to study algorithms in terms independent of the parameters
of any particular computer. If you are designing an algorithm as a practical tool, the answer
may depend on the problem you need to solve. Even the computers that we consider
slow today are almost unimaginably fast. Consequently, in most
situations you need not worry about a computer being too slow for the task. There are,
however, important problems that are very complex, have to process huge volumes of
data, or deal with applications where time is critical. In such situations, it is imperative
to be aware of the speed and memory available on a particular computer system.
1.2.3 Choose between an approximate or exact solution
The next decision is to choose between solving the problem exactly and solving it
approximately. In the former case, an algorithm is called an exact algorithm; in the
latter case, an algorithm is called an approximation algorithm. Why might one choose to use an

alphabet, character strings, and larger records similar to those maintained
by schools about their students, libraries about their holdings, and companies
about their employees. In the case of records, we need to choose a piece of
information to guide the sorting. For example, we can choose to sort student
records in alphabetical order of names, by student number, or by
grade point average. Such a specially chosen piece of information is called a key.
Why would we want a sorted list? To begin with, sorting makes many questions
about the list easier to answer. The most important of them is searching: it is
for this reason that dictionaries, telephone directories, class lists, and so on
are sorted. You will see other examples of the usefulness of sorted lists in Section 6.1. In a similar vein,
sorting is used as an auxiliary step in several important algorithms in
other areas, such as geometric algorithms.
By now, computer scientists have discovered dozens of different sorting algorithms.
In fact, inventing a new sorting algorithm has been likened to reinventing the proverbial wheel. We are,
however, happy to report that the hunt for a better sorting algorithm continues. This
perseverance is admirable in view of the following facts. On the one hand, there are a few good sorting
algorithms that sort an arbitrary vector of size n using about n log₂ n comparisons. On the other hand,
no algorithm that sorts by key comparisons (as opposed to, say, comparing small pieces
of keys) can do substantially better than that.
There is a reason for this embarrassment of algorithmic riches in the field of sorting. Although
some algorithms are indeed better than others, there is no algorithm that would be
the best solution in all situations. Some algorithms are simple but relatively
slow, while others are faster but more complex. Some work better on
randomly ordered inputs, while others do better on almost-sorted lists.
Some are suitable only for lists residing in fast memory, while others can
be adapted for sorting large files stored on disk, and so on.
Two properties of sorting algorithms deserve special mention. A sorting algorithm is
called stable if it preserves the relative order of any two equal elements of its input.
In other words, if an input list contains two equal elements in positions i and j, where i < j, then
in the sorted list they must be in positions i' and j', respectively, such that i' < j'.
This property can be desirable if, for example, we have a list of students sorted
alphabetically and we want to sort it according to student GPA: a stable algorithm will produce a
list in which students with the same GPA are still sorted alphabetically.
Generally speaking, algorithms that can exchange keys located far apart are not stable,
but they usually work faster.
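The student example can be tried directly. The sketch below relies on the documented stability of Python's built-in sorted; the names and GPA values are invented for illustration.

```python
# Records are (name, GPA) pairs, already in alphabetical order of names.
students = [("Adams", 3.5), ("Baker", 3.9), ("Clark", 3.5), ("Davis", 3.9)]

# sorted() is stable, so records with equal keys keep their relative order.
by_gpa = sorted(students, key=lambda record: record[1])

for name, gpa in by_gpa:
    print(gpa, name)
```

Within each group of equal GPAs, the names remain in alphabetical order, which is exactly what stability guarantees.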
The second notable property of a sorting algorithm is the amount of extra memory
the algorithm requires. A sorting algorithm is said to be internal, or in place, if it does not require
memory beyond that occupied by the input list, except possibly for a few
memory units. There are important sorting algorithms that are in place and others that
are not.

1.3.2 Search Problems


The search problem deals with finding a given value, called the search key, in
a given set (or a multiset, which permits several elements to have the same value).
There are plenty of search algorithms to choose from. They range from the straightforward
sequential search to binary search, which is spectacularly efficient but limited in
applicability, and to algorithms based on representing the underlying set in
a different form more conducive to searching. The latter algorithms are of
particular importance for real applications because they are indispensable for
storing and retrieving information in databases.
For searching, too, there is no single algorithm that fits all situations best.
Some algorithms work faster than others but require more memory; some
are very fast but applicable only to sorted vectors; and so on. Unlike
sorting algorithms, searching algorithms raise no stability issue, but different issues arise.
Specifically, in applications where the underlying data may change frequently
relative to the number of searches, searching has to be considered in conjunction with two
other operations: the insertion of an element into the data set and the deletion of an element from it. In such
situations, data structures and algorithms should be chosen to strike a balance
among the requirements of each operation. Also, organizing very large data sets for
efficient searching poses special challenges with important implications for
real applications.
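The trade-off between generality and speed can be seen in a minimal sketch of the two searches named above. The function names are hypothetical; bisect_left is the standard library's binary search for an insertion point.

```python
from bisect import bisect_left


def sequential_search(items, key):
    """Works on any list; examines elements one by one."""
    for i, item in enumerate(items):
        if item == key:
            return i
    return -1


def binary_search(sorted_items, key):
    """Much faster on long lists, but correct only if the list is sorted."""
    i = bisect_left(sorted_items, key)
    if i < len(sorted_items) and sorted_items[i] == key:
        return i
    return -1


data = [14, 35, 47, 60, 81, 98]
print(sequential_search(data, 60), binary_search(data, 60))
```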

1.3.3 String processing issues


In recent years, the rapid proliferation of applications dealing with nonnumerical data has
intensified the interest of researchers and computing practitioners in string-handling
algorithms. A string is a sequence of characters from an alphabet. Strings
of particular interest are text strings, which comprise letters,
numbers, and special characters; bit strings, which comprise zeros and ones; and
gene sequences, which can be modeled by strings of characters from
the four-character alphabet {A, C, G, T}. It should be pointed out, however, that
string-processing algorithms have been important for computer science for a long time in
conjunction with programming languages and compiling issues.
One particular problem has attracted special attention from researchers. They call it
string matching. Several algorithms that exploit the special nature
of this type of searching have been invented. We introduce one very simple algorithm in
Chapter 3 and discuss two algorithms based on a remarkable idea by R. Boyer and J.
Moore in Chapter 7.

1.3.4 Problems on graphs


One of the oldest and most interesting areas in algorithmics is graph
algorithms. Informally, a graph can be thought of as a collection of points
called vertices, some of which are connected by line segments called edges. Graphs
are an interesting subject to study for both theoretical and practical reasons. Graphs
can be used to model a wide variety of real-world applications, including transportation
and communication networks, project scheduling, and games. One
interesting recent application is estimating the diameter of the Web, which is the maximum number of links
one must follow to reach one Web page from another by the most direct route
between them.

The fundamental graph algorithms include graph traversal algorithms,
shortest-path algorithms, and topological sorting for graphs with directed
edges. Fortunately, these algorithms can be considered illustrations of
general design techniques; accordingly, you will find them in the corresponding chapters
of this course.

Some graph problems are computationally very hard. The best-known examples are the
traveling salesman problem and the graph coloring problem. The
traveling salesman problem is the problem of finding the shortest tour through n
cities that visits every city exactly once. In addition to obvious applications
in route planning, it arises in such modern applications as the manufacturing of
VLSI chips, X-ray crystallography, and genetic engineering. The graph coloring
problem is the problem of assigning the smallest number of colors to the vertices
of a graph so that no two adjacent vertices have the same color. This problem arises
in several applications, such as event scheduling: if the events are
represented by vertices that are connected by an edge if and only if the corresponding
events cannot be scheduled at the same time, a solution to the graph coloring
problem yields an optimal schedule.

1.3.5 Combinatorial problems


From a more abstract perspective, the traveling salesman problem and the graph
coloring problem are examples of combinatorial problems. These are problems that
ask (explicitly or implicitly) to find a combinatorial object (a permutation,
a combination, or a subset) that satisfies certain constraints and has some desired
property (e.g., maximizes a value or minimizes a cost).
Generally speaking, combinatorial problems are the most difficult problems in
computer science, from both the theoretical and the practical standpoints. Their difficulty stems from the following
facts. First, the number of combinatorial objects typically grows extremely fast with the
size of the problem, reaching unimaginable magnitudes even for instances of moderate
size. Second, there are no known algorithms for solving most such
problems exactly in an acceptable amount of time. Moreover, most computer scientists believe that
such algorithms do not exist. This conjecture has been neither proved nor disproved, and it remains
the most important unresolved question in theoretical computer science. We discuss this topic in more detail
in Section 11.3.
Some combinatorial problems can be solved by efficient algorithms, but they should be
considered fortunate exceptions to the rule. The shortest-path problem mentioned
earlier is among these exceptions.
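To get a feel for how fast the number of combinatorial objects grows, the sketch below counts the distinct tours an exhaustive search for the traveling salesman problem would have to examine. It assumes symmetric distances, so that a tour and its reverse count once and the starting city is fixed; the function name count_tours is hypothetical.

```python
from math import factorial


def count_tours(n: int) -> int:
    """Number of distinct tours through n >= 3 cities: (n - 1)! / 2."""
    if n < 3:
        raise ValueError("a tour needs at least three cities")
    return factorial(n - 1) // 2


for n in (5, 10, 15, 20):
    print(n, count_tours(n))
```

Already for n = 20 the count has seventeen digits, which is why checking every tour is impractical even for instances of moderate size.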

1.3.6 Geometric Problems


Geometric algorithms deal with geometric objects such as points, lines, and
polygons. The ancient Greeks were very much interested in developing procedures for
solving a variety of geometric problems, including problems of constructing simple
geometric shapes (triangles, circles, etc.) with an unmarked ruler and a compass. Then, for
about 2000 years, intense interest in geometric algorithms disappeared, to be resurrected
in the age of computers: no more rulers and compasses, just bits, bytes, and
good old human ingenuity. Of course, today people are interested in
geometric algorithms with quite different applications in mind, such as computer graphics,
robotics, and tomography.
We will study algorithms for only two classic problems of computational
geometry: the closest-pair problem and the convex-hull problem. The
closest-pair problem is self-explanatory: given n points in the plane,
find the closest pair among them. The convex-hull problem
asks to find the smallest convex polygon that contains all the points of a
given set.
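The closest-pair problem can be made concrete with a straightforward sketch that simply compares every pair of points. This is an illustration of the problem statement, not the algorithm developed later in the course; the function name closest_pair is hypothetical, and math.dist requires Python 3.8 or later.

```python
from itertools import combinations
from math import dist


def closest_pair(points):
    """Return the two points with the smallest Euclidean distance between them."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    # Examine all n(n - 1)/2 pairs and keep the one with minimum distance.
    return min(combinations(points, 2), key=lambda pair: dist(*pair))


points = [(0, 0), (5, 5), (1, 1), (9, 0)]
print(closest_pair(points))
```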

1.3.7 Numerical Problems


Numerical problems, another large area of applications, are problems that
involve mathematical objects of a continuous nature: solving equations and systems
of equations, computing definite integrals, evaluating functions, and so on. The majority of such
mathematical problems can be solved only approximately. Another principal
difficulty stems from the fact that such problems typically require manipulating real
numbers, which can be represented in a computer only approximately. Moreover, a
large number of arithmetic operations performed on approximately represented
numbers can lead to an accumulation of round-off errors to the point where they
can drastically distort the output produced by a seemingly sound algorithm.

Many sophisticated algorithms have been developed over the years in this area, and they
continue to play a critical role in many scientific and engineering applications. But
over the past 25 years, the computing industry has shifted its focus to business
applications. These new applications require primarily algorithms
for storing, retrieving, and transmitting information over networks and for
presenting it to users. As a result of this revolutionary change, numerical
analysis has lost its formerly dominant position in both industry and computer science
programs. Still, it is important for any computing beginner to have at
least a rudimentary idea about numerical algorithms.
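The effect of approximate number representation can be observed with a very small experiment. The sketch assumes IEEE 754 double-precision floats, which is what Python's float uses on common platforms; math.fsum is the standard library's error-compensated summation.

```python
import math

values = [0.1] * 10  # 0.1 has no exact binary representation

naive = 0.0
for v in values:
    naive += v  # each addition may introduce a small round-off error

print(naive == 1.0)              # the accumulated error makes this comparison fail
print(math.fsum(values) == 1.0)  # compensated summation recovers the expected value
```

Ten additions are enough to make an exact equality test fail, which is why numerical code compares results within a tolerance instead.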

Exercises 1.3

Problem 1. Consider the sorting algorithm that sorts a vector by counting for each of its
elements, the number of elements that are smaller than it and then uses this information to
place the element in its final position in the sorted vector:

ALGORITHM ComparisonCountingSort(A[0..n–1])
//Sorts a vector by comparison counting
//Input: A vector A[0..n–1] of orderable values
//Output: A vector S[0..n–1] of A's elements sorted in ascending order
for i ← 0 to n–1 do
    Count[i] ← 0
for i ← 0 to n–2 do
    for j ← i + 1 to n–1 do
        if A[i] < A[j]
            Count[j] ← Count[j] + 1
        else Count[i] ← Count[i] + 1
for i ← 0 to n–1 do
    S[Count[i]] ← A[i]
return S

a) Apply this algorithm to sort the list 60, 35, 81, 98, 14, 47.
b) Is this algorithm stable?
c) Is it internal (in place)?
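For readers who want to experiment with the questions above, here is a direct Python transcription of the pseudocode. It is only a sketch; the function and variable names are chosen for this illustration.

```python
def comparison_counting_sort(a):
    """Sort a list by counting, for each element, how many elements must precede it."""
    n = len(a)
    count = [0] * n
    for i in range(n - 1):
        for j in range(i + 1, n):
            if a[i] < a[j]:
                count[j] += 1
            else:
                count[i] += 1
    s = [None] * n
    for i in range(n):
        s[count[i]] = a[i]  # count[i] is the final position of a[i]
    return s


print(comparison_counting_sort([62, 31, 84, 96, 19, 47]))
```

Printing the count list after the nested loops is a convenient way to trace the algorithm by hand on other inputs.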

Problem 2. Name the search algorithms that you already know. Give a
succinct description of each of these algorithms in French. (If you do not know of such
algorithms, take the opportunity to design one)

Problem 3. Design a simple algorithm for the string matching problem.

Problem 4. The bridges of Königsberg. The Königsberg bridge puzzle is universally
accepted as the problem that gave birth to graph theory. It was solved by the great
Swiss mathematician Leonhard Euler (1707-1783). The problem asked whether one could,
in a single stroll, cross each of the seven bridges of the city of Königsberg exactly
once and return to the starting point. Below is a sketch of the river with its islands and its seven bridges.

a) Formulate this problem as a graph problem.


b) Does this problem have a solution? If you believe it does, draw such a stroll; if you
believe it does not, explain why and indicate the smallest number of new bridges that
would be required to make such a stroll possible.
Problem 5. Icosian Game. A century after Euler's discovery, another famous puzzle,
this one invented by the renowned Irish mathematician Sir William Hamilton (1805-1865),
was published under the name of the Icosian Game. The game was played on a circular wooden board on which
the following graph was carved:

Find a Hamiltonian circuit, a path that visits all the vertices of the graph exactly
once before returning to the starting vertex, for this graph.
Problem 6. We consider the following map:

[Figure: map divided into six regions labeled a, b, c, d, e, f]

a) Explain how you can use the graph coloring problem to color
the map so that no two adjacent regions have the same color.
b) Use your answer to part (a) to color the map with the smallest number of
colors.
Problem 7. Design an algorithm for the following problem: given a set of n
points in the Cartesian plane, determine whether all of them lie on the same circle.
1.4 The fundamental data structures
Since the vast majority of algorithms of interest operate on data, particular ways
of organizing data play a critical role in the design and analysis of algorithms.
A data structure can be defined as a particular scheme for organizing related
data items. The nature of the data items is dictated by the problem at hand; they
can range from elementary data types to data structures. There are a few
data structures that have proved to be particularly important for computer
algorithms. Since you are probably familiar with most if not all of them,
only a quick review is provided here.

1.4.1 Linear Data Structures

The most important linear data structures are vectors and linked lists. A
vector is a sequence of n elements of the same data type that are stored in consecutive
cells of computer memory and made accessible by specifying a value of the
vector's index (Figure 1.3).
In most cases, the index is an integer between 0 and n-1 or between 1 and n. Some
programming languages allow an index that can vary between two e

Item[0] Item[1] … Item[n-1]

FIGURE 1.3–A vector of elements.


Each and every element of a vector can be accessed in the same constant amount of time regardless of where
in the vector the element in question is located. This feature positively distinguishes vectors from linked
lists. It is also assumed that every element of a vector occupies the same amount of memory.
Vectors are used to implement a variety of other data structures. Prominent among them is the string,
a sequence of characters terminated by a special character indicating the string's end. Strings
composed of zeros and ones are called binary strings or bit strings. Strings are indispensable
for processing textual data, defining programming languages and compiling
programs written in them, and studying abstract computational models. The operations
we usually perform on strings differ from those we typically perform on other
vectors (e.g., vectors of numbers). They include computing the length of a string,
comparing two strings to determine which one precedes the other in lexicographic order, and
concatenating two strings.
A linked list is a sequence of zero or more elements called nodes, each containing two
kinds of information: some data and one or more links called pointers to other nodes of the
linked list. (A special pointer called nil is used to indicate the absence of a node's
successor.) In a singly linked list, each node except the last one contains a single
pointer to the next element of the list (Figure 1.4).

Item0 Item1 … Item n-1

FIGURE 1.4–A simple linked list of n elements.


To access a particular node of a linked list, we start with the list's first node
and traverse the pointer chain until the desired node is reached. Thus, the time
needed to access an element of a singly linked list, unlike that of a vector, depends on
the position of the element in the list. On the positive side, linked lists do not require any
preliminary reservation of memory, and insertions and deletions can be made quite
efficiently in a linked list by reconnecting a few appropriate pointers.
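The two observations above, position-dependent access time and cheap insertion, can be seen in a minimal sketch of a singly linked list. The class and function names are hypothetical, and Python's None plays the role of nil.

```python
class Node:
    """One node of a singly linked list: some data plus a pointer to the successor."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next  # None stands for the nil pointer


def get(head, index):
    """Follow `index` pointers from the head; the cost grows with the position."""
    node = head
    for _ in range(index):
        if node is None:
            break
        node = node.next
    if node is None:
        raise IndexError("index beyond the end of the list")
    return node.data


def insert_after(node, data):
    """Insert a new node after `node` by reconnecting two pointers."""
    node.next = Node(data, node.next)


head = Node(1, Node(2, Node(4)))
insert_after(head.next, 3)  # the list is now 1 -> 2 -> 3 -> 4
print([get(head, i) for i in range(4)])
```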
We can exploit the flexibility of the linked list structure in a variety of ways. For
example, it is often convenient to start a linked list with a special node called the header (or sentinel).
This node often contains information about the list, such as its current
length; it may also contain, in addition to a pointer to the first element, a pointer to the
last element of the list.
An extension is the structure called the doubly linked list, in which each node,
except for the first and last, contains pointers to both its successor and its
predecessor (Figure 1.5).

Item0 Item1 … Item n-1

FIGURE 1.5–A doubly linked list of n elements.

The vector and the linked list are the two principal choices for representing a more
abstract data structure called a linear list or simply a list. A list is a finite sequence of data items,
i.e., a collection of data items arranged in a certain linear order. The basic operations
performed on this data structure are searching for, inserting, and deleting an
element.
Two special types of lists, stacks and queues, are particularly important.

DEFINITION. A stack is a list in which all insertions and deletions
are made only at one end of the list. This end is commonly called the top of the stack.
A stack is usually pictured vertically. When elements are added to (pushed onto)
a stack and removed from (popped off) it, the structure operates in a
"last-in-first-out" (LIFO) fashion. Stacks have
a multitude of applications; in particular, they are indispensable for implementing
recursive algorithms.

The basic operations on a stack are:
Check whether the stack is empty
Push an element onto the stack
Pop an element off the stack

DEFINITION. A queue is a list in which all insertions are made at one end of the list, called the rear, and all
deletions at the other end, called the front. Consequently, a queue operates in a
"first-in-first-out" (FIFO) fashion. Queues
also have many important applications, including several algorithms for graph
problems.

The basic operations on a queue are:
Check whether the queue is empty
Insert an element into the queue
Remove an element from the queue
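The LIFO and FIFO disciplines can be contrasted in a short sketch. It uses a Python list as the stack and collections.deque as the queue; these container choices are assumptions of the illustration, not something the course prescribes.

```python
from collections import deque

# Stack: insertions and deletions both happen at the same end (the top).
stack = []
for item in (1, 2, 3):
    stack.append(item)                       # push
popped = [stack.pop() for _ in range(3)]     # pop: last in, first out

# Queue: insertions at the rear, deletions at the front.
queue = deque()
for item in (1, 2, 3):
    queue.append(item)                       # insert at the rear
served = [queue.popleft() for _ in range(3)]  # remove from the front: first in, first out

print(popped, served)
print(not stack, not queue)  # emptiness checks
```

A deque is used for the queue because removing from the front of a plain Python list takes time proportional to its length.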
Many important applications require selecting an element of the highest priority
among a dynamically changing set of candidates. A data structure that
meets the needs of such applications is called a priority queue. A priority queue
is a collection of data items from a totally ordered set. The principal
operations on a priority queue are finding its largest element, deleting its
largest element, and adding a new element. Of course, a priority queue must be
implemented so that the last two operations yield another priority queue.
Straightforward implementations of this data structure can be based on either a vector or a sorted
vector, but neither option yields the most efficient solution possible. A better
implementation of a priority queue is based on an ingenious data structure called the
heap.
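The three principal operations can be sketched on top of the standard library's heapq module. Since heapq maintains a min-heap, this illustration stores negated keys to obtain a max-priority queue; it therefore assumes numeric items, and the class name MaxPriorityQueue is hypothetical.

```python
import heapq


class MaxPriorityQueue:
    """Max-priority queue of numbers built on heapq's min-heap via negated keys."""

    def __init__(self):
        self._heap = []

    def insert(self, item):
        heapq.heappush(self._heap, -item)

    def find_max(self):
        return -self._heap[0]  # the smallest negated key is the largest item

    def delete_max(self):
        return -heapq.heappop(self._heap)

    def __len__(self):
        return len(self._heap)


pq = MaxPriorityQueue()
for priority in (4, 9, 2, 7):
    pq.insert(priority)
print(pq.find_max(), pq.delete_max(), pq.delete_max(), len(pq))
```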
1.4.2 Graphs
DEFINITION. A graph is a pair G = (V, E), where V is a nonempty finite set of objects
called vertices or nodes and E is a set of pairs of vertices called edges. If these
pairs of vertices are unordered, i.e., the pair of vertices (u, v) is the same as
the pair (v, u), we say that the vertices u and v are adjacent and that they are connected by the undirected edge (u,
v). The vertices u and v are called the endpoints of the edge (u, v), and we say that u and v are incident to
this edge; we also say that the edge (u, v) is incident to its endpoints.
If a pair of vertices (u, v) is not the same as the pair (v, u), we say that the edge (u, v) is directed
from the vertex u, called its tail, to the vertex v, called its head. We also say that the edge (u, v) leaves
u and enters v. A graph in which every edge is directed is called a directed
graph. Directed graphs are also called digraphs.
It is convenient to label the vertices of a graph or a digraph with letters, integers,
or, if the application calls for it, character strings (Figure 1.6). The graph in Figure
1.6a has six vertices and seven edges:
$V = \{a, b, c, d, e, f\}$, $E = \{(a,c), (a,d), (b,c), (b,f), (c,e), (d,e), (e,f)\}$.
The digraph in Figure 1.6b has six vertices and eight directed edges:
$V = \{a, b, c, d, e, f\}$, $E = \{(a,c), (b,c), (b,f), (c,e), (d,a), (d,e), (e,c), (e,f)\}$.

[Figure 1.6: (a) the graph and (b) the digraph described above, each drawn on the vertices a, b, c, d, e, f]

Our definition of a graph does not forbid loops, i.e., edges connecting vertices to themselves.
Unless explicitly stated otherwise, we will consider graphs without loops. Since our definition
disallows multiple edges between the same vertices of an undirected graph, we have the following
inequality for the number of edges $|E|$ possible in an undirected graph with $|V|$ vertices and
no loops:
$0 \le |E| \le |V|(|V| - 1)/2$.
A graph in which every vertex is connected to every other vertex by an edge is said to be complete.
A standard notation for the complete graph with $|V|$ vertices is $K_{|V|}$. A graph with
relatively few possible edges missing is called dense. A graph with few edges relative to
the number of its vertices is called sparse. Whether we are dealing with a dense or a sparse graph
may influence how we choose to represent it and, consequently, the running time of the algorithm being
designed or used.
Graph representation. Graphs for computer algorithms can be
represented in two principal ways: the adjacency matrix and adjacency lists. The
adjacency matrix of a graph with n vertices is an n-by-n binary matrix with one row and one
column for each of the graph's vertices, in which A[i,j] = 1 if there is an edge between the ith and the jth vertices
and A[i,j] = 0 otherwise. For example, the adjacency matrix of the graph in Figure 1.6a is given in
Figure 1.7a. Note that the adjacency matrix of an undirected graph is always symmetric,
i.e., $A[i,j] = A[j,i]$ for every $0 \le i, j \le n - 1$ (why?).

The adjacency lists of a graph or a digraph are a collection of linked lists,
one for each vertex, that contain all the vertices adjacent to the list's vertex (i.e., all the
vertices connected to it by an edge). Usually, such lists start with a header
identifying the vertex for which the list is compiled. For example, Figure 1.7b represents the
graph of Figure 1.6a by its adjacency lists. In other words, adjacency lists
indicate the columns of the adjacency matrix that, for a given vertex, contain ones.

     a  b  c  d  e  f
a    0  0  1  1  0  0        a → c → d
b    0  0  1  0  0  1        b → c → f
c    1  1  0  0  1  0        c → a → b → e
d    1  0  0  0  1  0        d → a → e
e    0  0  1  1  0  1        e → c → d → f
f    0  1  0  0  1  0        f → b → e

          (a)                     (b)

FIGURE 1.7– (a) Adjacency matrix and (b) adjacency lists of the graph in figure 1.6a
If a graph is sparse, the adjacency list representation may use less
space than the corresponding adjacency matrix despite the extra memory
consumed by the pointers of the linked lists; the situation is exactly the opposite
for dense graphs. In general, which of the two representations is more convenient depends on the
nature of the problem, on the algorithm used for solving it, and possibly on the type of input
graph (sparse or dense).
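Both representations can be built from the edge list of the graph in Figure 1.6a, as in the following sketch. Python lists stand in for the linked lists of the text, and all variable names are chosen for this illustration.

```python
vertices = "abcdef"
edges = [("a", "c"), ("a", "d"), ("b", "c"), ("b", "f"),
         ("c", "e"), ("d", "e"), ("e", "f")]

index = {v: i for i, v in enumerate(vertices)}
n = len(vertices)

matrix = [[0] * n for _ in range(n)]         # n * n cells regardless of the edge count
adjacency = {v: [] for v in vertices}        # space proportional to vertices plus edges

for u, v in edges:
    # An undirected edge is recorded in both directions.
    matrix[index[u]][index[v]] = 1
    matrix[index[v]][index[u]] = 1
    adjacency[u].append(v)
    adjacency[v].append(u)

for v in vertices:
    print(v, matrix[index[v]], adjacency[v])
```

The matrix stores 36 entries for 7 edges, while the lists store 14 entries in total, which illustrates the space argument for sparse graphs.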
Weighted graphs. A weighted graph (or weighted digraph) is a graph (or digraph) in which each edge is
assigned a number. These numbers are called weights or costs. Interest in
such graphs is motivated by numerous real-world applications, such as finding the shortest path
between two points in a transportation or communication network or the traveling salesman
problem mentioned earlier.

(a) Edges and weights: a-b 5, a-c 1, b-c 7, b-d 4, c-d 2

(b)     a  b  c  d          (c)  a -> b,5 -> c,1
    a   ∞  5  1  ∞               b -> a,5 -> c,7 -> d,4
    b   5  ∞  7  4               c -> a,1 -> b,7 -> d,2
    c   1  7  ∞  2               d -> b,4 -> c,2
    d   ∞  4  2  ∞

FIGURE 1.8– (a) Weighted graph. (b) Its weight matrix. (c) Its adjacency lists.
The two main representations of graphs can easily be adapted to
accommodate weighted graphs. If a weighted graph is represented by its adjacency
matrix, then the element A[i,j] simply contains the weight of the edge from vertex i to vertex j if
such an edge exists, and a special symbol, for example ∞, otherwise. Such a matrix is
called a weight matrix or a cost matrix. The adjacency lists of a weighted graph
must include in their nodes not only the names of the adjacent vertices but also the weight
of the corresponding edge.
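As a sketch of both adaptations, the following Python fragment builds a weight matrix and weighted adjacency lists for the graph of Figure 1.8. Using `math.inf` as the "no edge" symbol is our assumption (any value that cannot be a legitimate weight would do), and the function names are illustrative.

```python
import math

# Weighted undirected graph of Figure 1.8: (endpoint, endpoint, weight).
VERTICES = ["a", "b", "c", "d"]
WEIGHTED_EDGES = [("a", "b", 5), ("a", "c", 1), ("b", "c", 7),
                  ("b", "d", 4), ("c", "d", 2)]


def weight_matrix(vertices, weighted_edges):
    """Return the weight matrix; math.inf marks a missing edge (assumption)."""
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    matrix = [[math.inf] * n for _ in range(n)]
    for u, v, w in weighted_edges:
        matrix[index[u]][index[v]] = w
        matrix[index[v]][index[u]] = w  # undirected graph: mirror the entry
    return matrix


def weighted_adjacency_lists(vertices, weighted_edges):
    """Map each vertex to a list of (neighbor, weight) pairs."""
    adj = {v: [] for v in vertices}
    for u, v, w in weighted_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj


W = weight_matrix(VERTICES, WEIGHTED_EDGES)
adj = weighted_adjacency_lists(VERTICES, WEIGHTED_EDGES)
```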

Paths and cycles. Among the many interesting properties of graphs, two are important for
a large number of applications: connectivity and acyclicity. Both are based on the notion of
a path. A path from vertex u to vertex v in a graph G can be defined as a sequence of
adjacent vertices (connected by an edge) that starts with u and ends with v. If all the
vertices of a path are distinct, the path is called simple. The length of a path is the
total number of vertices in the sequence defining the path minus one, which is the same as the
number of edges in the path.
A directed path is a sequence of vertices in which every consecutive pair of vertices is connected by
an edge directed from the vertex listed first to the vertex listed next.
A graph is said to be connected if there is a path from u to v for every pair of its vertices u and v.
Informally, this property means that if we build a model of a connected graph by
connecting balls representing the vertices of the graph with strings representing the edges, the
model will be a single piece. If a graph is not connected, such a model will consist of several
connected pieces, called the connected components of the graph. Formally, a connected component is
a maximal (not expandable by the inclusion of another vertex) connected subgraph of a given graph.
For example, the graphs of Figures 1.6a and 1.8a are connected, while the graph of Figure 1.9 has two
connected components with the vertices {a, b, c, d, e} and {f, g, h, i}, respectively.

FIGURE 1.9– A disconnected graph

Graphs with several connected components do occur in real applications. A
graph representing a highway system connecting several countries of the European Union would be
an example (why?).
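The definition of a connected component suggests a simple way to compute the components: start at any unvisited vertex, collect everything reachable from it, and repeat. The Python sketch below does this with a breadth-first traversal. The edge list is hypothetical: it only reuses the vertex sets {a, b, c, d, e} and {f, g, h, i} named above, not the actual edges of Figure 1.9.

```python
from collections import deque


def build_adjacency(vertices, edges):
    """Adjacency lists of an undirected graph as a dict of lists."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def connected_components(adj):
    """Return the connected components as sorted lists of vertices."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        # Breadth-first traversal collects every vertex reachable from start.
        seen.add(start)
        component = []
        queue = deque([start])
        while queue:
            u = queue.popleft()
            component.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        components.append(sorted(component))
    return components


# Hypothetical edges on the two vertex sets mentioned in the text.
VERTICES = list("abcdefghi")
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("d", "e"),
         ("f", "g"), ("f", "h"), ("g", "i")]

components = connected_components(build_adjacency(VERTICES, EDGES))
```

A graph is connected exactly when this function returns a single component.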
For many applications it is important to know whether or not a given graph has
cycles. A cycle is a path of positive length that starts and ends at the
same vertex and does not traverse the same edge more than once. A graph that contains no
cycles is said to be acyclic. We study acyclic graphs in the next section.

1.4.3 Trees
A tree (or, more precisely, a free tree) is a connected acyclic graph (Figure 1.10a).
A graph that has no cycles but is not necessarily connected is called a forest (Figure
1.10b).
Trees have several important properties that other graphs do not have. In particular, the
number of edges in a tree is always one less than the number of its vertices:

|E| = |V| - 1.

As the graph of Figure 1.9 shows, this property is necessary but not sufficient for
a graph to be a tree. For connected graphs, however, it is sufficient and thus provides a
convenient way to determine whether a connected graph has a cycle.
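This criterion translates directly into a test: a graph is a tree if it is connected and satisfies |E| = |V| - 1. The Python sketch below assumes a simple undirected graph (no loops or multiple edges) stored as adjacency lists; the function names and the example graphs are ours, and the last graph is a hypothetical disconnected graph that has the right number of edges without being a tree.

```python
def build_adjacency(vertices, edges):
    """Adjacency lists of a simple undirected graph as a dict of lists."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def is_connected(adj):
    """True if every vertex is reachable from an arbitrary start vertex."""
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(adj)


def edge_count(adj):
    """Each undirected edge appears in two adjacency lists."""
    return sum(len(neighbors) for neighbors in adj.values()) // 2


def is_tree(adj):
    """Connected and |E| = |V| - 1."""
    return is_connected(adj) and edge_count(adj) == len(adj) - 1


path = build_adjacency(list("abc"), [("a", "b"), ("b", "c")])
triangle = build_adjacency(list("abc"), [("a", "b"), ("b", "c"), ("a", "c")])
# Hypothetical disconnected graph: 9 vertices, 8 edges, one cycle (a-b-c).
split = build_adjacency(
    list("abcdefghi"),
    [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("d", "e"),
     ("f", "g"), ("f", "h"), ("g", "i")],
)
```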

FIGURE 1.10– (a) A tree. (b) A forest.

Rooted trees. Another important property of trees is that for every two vertices in
a tree there is exactly one simple path from one of these vertices to the other. This
property makes it possible to choose an arbitrary vertex of a free tree and consider it as the
root. A rooted tree is usually drawn by placing its root at the top (level 0 of
the tree), the vertices adjacent to the root below it (level 1), the vertices two edges away from
the root below that (level 2), and so on. Figure 1.11 presents such a
transformation of a free tree into a rooted tree.

Rooted trees play an important role in computer science, a more important one than
free trees do. In fact, for the sake of brevity, they are often referred to simply as trees. An
obvious application of trees is describing hierarchies, from file
directories to company organizational charts. There are many less obvious applications, such as
implementing dictionaries, storing large data sets efficiently, and
encoding data. Trees are also useful for analyzing recursive algorithms. To finish
this far from complete list of tree applications, we should mention the
state-space trees that underlie two important algorithm design
techniques: backtracking and branch-and-bound.
For each vertex v of a tree T, all the vertices on the simple path from the root to v
are called ancestors of v. The vertex itself is usually considered its
own ancestor; the ancestors of v other than v itself are called
proper ancestors. If (u, v) is the last edge of the simple path from the root to a vertex v
(and u ≠ v), u is called the parent of v and v is called a child of u; vertices that have the same
parent are called siblings. A vertex with no children is called a leaf; a vertex with at
least one child is called parental. All the vertices for which a vertex v is an ancestor are
called descendants of v; the proper descendants exclude the vertex v itself. All the
descendants of a vertex v, with all the edges connecting them, form the subtree of T rooted at
that vertex. Thus, for the tree in Figure 1.11b, the root of the tree is a; the vertices d, g, f, h, and i
are leaves, while the vertices a, b, c, and e are parental; the vertices of the subtree rooted
at b are {b, c, g, h, i}.
The depth of a vertex v is the length of the simple path from the root to v. The
height of a tree is the length of the longest simple path from the root to a leaf. Thus, if
we count the levels of a tree starting with 0 for the root level, the
depth of a vertex is simply its level in the tree, and the height of the tree is the
maximum level of its vertices.
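These definitions are easy to check on a small example. The Python sketch below stores a rooted tree as a dictionary of child lists and computes depths, the height, the leaves, and the ancestors of a vertex. The child lists are an assumption: they are our reading of the tree in Figure 1.11b, chosen to agree with the subtree {b, c, g, h, i} mentioned above, and the function names are illustrative.

```python
# Assumed child lists for the rooted tree of Figure 1.11b (root "a").
CHILDREN = {
    "a": ["b", "d", "e"],
    "b": ["c", "g"],
    "c": ["h", "i"],
    "e": ["f"],
    "d": [], "g": [], "f": [], "h": [], "i": [],
}
ROOT = "a"


def depths(children, root):
    """Depth (level) of every vertex: number of edges on the path from root."""
    depth = {root: 0}
    stack = [root]
    while stack:
        u = stack.pop()
        for child in children[u]:
            depth[child] = depth[u] + 1
            stack.append(child)
    return depth


def height(children, root):
    """Height of the tree: the maximum level of its vertices."""
    return max(depths(children, root).values())


def leaves(children):
    """Vertices with no children, in sorted order."""
    return sorted(v for v, kids in children.items() if not kids)


def ancestors(children, v):
    """Ancestors of v, from v itself up to the root."""
    parent = {c: p for p, kids in children.items() for c in kids}
    path = [v]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path
```

For this tree, `depths(CHILDREN, ROOT)["h"]` and `height(CHILDREN, ROOT)` are both 3, since h lies on a longest root-to-leaf path.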

FIGURE 1.11– (a) Free tree. (b) Its transformation into a rooted tree.

Ordered trees. An ordered tree is a rooted tree in which all the children of each
vertex are ordered. It is convenient to assume that in the diagram of a tree, all the children
are arranged from left to right. A binary tree can be defined as an ordered
tree in which each vertex has no more than two children and each child is designated as either
a left child or a right child of its parent. The subtree rooted at the left
(right) child of a vertex is called the left (right) subtree of that vertex. An example of a
binary tree is given in Figure 1.12a.
In Figure 1.12b, numbers are assigned to the vertices of the binary tree in Figure 1.12a.
Note that the number assigned to each parental vertex is larger than all the numbers in
its left subtree and smaller than all the numbers in its right subtree. Such trees
are called binary search trees. Binary trees and binary search trees
have a wide variety of applications in computer science. In particular, binary search
trees can be generalized to more general kinds of search trees called multiway
search trees, which are essential for the efficient storage of very large files
on disk.
As we will see later, the efficiency of the most important algorithms for binary
search trees and their extensions depends on the height of the tree. Therefore, the
following inequalities for the height h of a binary tree with n nodes are especially
important for the analysis of such algorithms:

⌊log2 n⌋ ≤ h ≤ n - 1.

A binary tree is usually implemented for computational purposes by a collection of
nodes corresponding to the vertices of the tree. Each node contains information associated with the
vertex and two pointers to the nodes representing the left child and the right child of the vertex,
respectively. Figure 1.13 illustrates such an implementation for the binary search tree of
Figure 1.12b.
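A minimal Python sketch of this node-and-two-pointers implementation follows, together with a check of the height inequalities ⌊log2 n⌋ ≤ h ≤ n - 1. The insertion routine is the usual binary search tree insertion, the keys are illustrative, and all names are ours rather than part of the text.

```python
class Node:
    """A binary tree node: a key and pointers to the left and right children."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the binary search tree rooted at root; return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # duplicate keys are ignored


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


def height(root):
    """Height counted in edges; the empty tree has height -1 by convention."""
    if root is None:
        return -1
    return 1 + max(height(root.left), height(root.right))


def satisfies_bounds(n, h):
    """Check floor(log2 n) <= h <= n - 1 for a tree with n >= 1 nodes."""
    return n.bit_length() - 1 <= h <= n - 1


mixed = build([9, 5, 12, 1, 7, 10, 4])   # height 3
full = build([4, 2, 6, 1, 3, 5, 7])      # height 2, the lower bound for n = 7
chain = build([1, 2, 3, 4, 5, 6, 7])     # height 6, the upper bound for n = 7
```

Inserting keys in increasing order yields the degenerate chain, which is why the order of insertions matters for the efficiency of binary search tree algorithms.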

FIGURE 1.12– (a) Binary tree. (b) Binary search tree.

A computer representation of an arbitrary ordered tree can be obtained by simply giving
each parental vertex a number of pointers equal to the number of its children. This
representation may be inconvenient if the number of children varies widely among the nodes. We
can avoid this drawback by using nodes that have just two pointers, as for
binary trees. Here, however, the left pointer will point to the first child of the vertex, while
the right pointer will point to its next sibling. Accordingly, this representation is called the
first child-next sibling representation. Thus, all the
siblings of a vertex are linked, via the right pointers of the nodes, in a singly linked list,
with the first element of the list pointed to by the left pointer of their parent. Figure
1.14a illustrates this representation for the tree in Figure 1.11b.

FIGURE 1.13– Standard implementation of the binary search tree in Figure 1.12b

It is not difficult to see that this representation effectively transforms an ordered tree into
a binary tree, said to be associated with the ordered tree. We obtain this binary tree by rotating
the pointers about 45° clockwise (Figure 1.14b).
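The conversion can be sketched as follows in Python. Each node has exactly two pointers, here named `first_child` and `next_sibling` (our names for the left and right pointers of the text). The child lists are again our assumed reading of the tree in Figure 1.11b.

```python
class Node:
    """Node of the first child-next sibling representation."""

    def __init__(self, name):
        self.name = name
        self.first_child = None    # the "left" pointer
        self.next_sibling = None   # the "right" pointer


def to_first_child_next_sibling(children, root):
    """Build the two-pointer representation from a dict of ordered child lists."""
    node = Node(root)
    previous = None
    for child in children.get(root, []):
        child_node = to_first_child_next_sibling(children, child)
        if previous is None:
            node.first_child = child_node      # first child hangs off the parent
        else:
            previous.next_sibling = child_node  # later children are chained
        previous = child_node
    return node


def children_of(node):
    """Recover the ordered list of children by following sibling pointers."""
    result = []
    child = node.first_child
    while child is not None:
        result.append(child.name)
        child = child.next_sibling
    return result


# Assumed child lists for the rooted tree of Figure 1.11b.
CHILDREN = {"a": ["b", "d", "e"], "b": ["c", "g"], "c": ["h", "i"], "e": ["f"]}
root = to_first_child_next_sibling(CHILDREN, "a")
```

Reading `first_child` as a left pointer and `next_sibling` as a right pointer gives exactly the binary tree associated with the ordered tree.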

FIGURE 1.14– (a) First child-next sibling representation of the tree in Figure 1.11b. (b)
Its representation as a binary tree.

1.4.4 Sets and dictionaries


The concept of a set plays a central role in mathematics. A set can be described as
an unordered collection (possibly empty) of distinct objects called elements of the set.
A set is defined either by listing its elements explicitly or by stating a property that
characterizes them. The main operations on
sets are checking whether an element belongs to a set, computing the union of two
sets, and computing the intersection of two sets.
Sets can be implemented in computer applications in two ways. The
first considers only sets that are subsets of some large set
U, called the universal set. If the set U has n elements, then a subset S of U can
be represented by a bit string of length n, called a bit vector, in which the ith
element is 1 if and only if the ith element of U is included in S. This way of representing
sets allows very fast implementations of the standard set operations, but at
the cost of potentially using a large amount of memory.
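As an illustration, the Python sketch below stores a bit vector in a single integer, so that union and intersection become the bitwise operators `|` and `&`. The universal set and the helper names are assumptions chosen for the example.

```python
# Universal set U; position i of the bit vector corresponds to UNIVERSE[i].
UNIVERSE = [1, 2, 3, 4, 5, 6, 7, 8, 9]
POSITION = {x: i for i, x in enumerate(UNIVERSE)}


def to_bits(subset):
    """Encode a subset of UNIVERSE as an integer used as a bit vector."""
    bits = 0
    for x in subset:
        bits |= 1 << POSITION[x]
    return bits


def from_bits(bits):
    """Decode a bit vector back into a list of elements, in universe order."""
    return [x for x in UNIVERSE if (bits >> POSITION[x]) & 1]


def contains(bits, x):
    """Membership test: is the bit for x set?"""
    return bool((bits >> POSITION[x]) & 1)


s = to_bits({2, 3, 5, 7})
t = to_bits({1, 2, 3})

union = from_bits(s | t)
intersection = from_bits(s & t)
```

Every subset costs n bits regardless of how few elements it has, which is the memory price mentioned above.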
The second, and more common, way to represent a set is to use a list structure
to indicate the elements of the set. Note, however, the main differences between
sets and lists. First, a set cannot contain identical elements; a list
can. This uniqueness requirement is sometimes circumvented by introducing a
multiset, or bag, an unordered collection of elements that are not necessarily
distinct. Second, a set is an unordered collection of objects; therefore,
changing the order of its elements does not change the set. A list, defined as an ordered
collection of elements, is exactly the opposite. This is an important theoretical distinction, but
fortunately it is not important for many applications. It is also worth
mentioning that if a set is represented by a list, then, depending on the application at hand,
it may be worthwhile to keep the list sorted.
In computer science, the operations we most often have to perform on a set or a bag are
searching for a given element, adding a new element, and removing an element
from the collection. A data structure that implements these three operations is called a
dictionary. An efficient implementation of a dictionary must therefore strike a balance
between the efficiency of searching and the efficiency of the other two operations. There are quite
a few ways to implement a dictionary. They range from an unsophisticated use of vectors to
more sophisticated techniques such as hashing and balanced search trees, which we
will study in this course.
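To make the trade-off concrete, here is one possible dictionary, sketched in Python on top of a sorted list. This is only one of the options mentioned above, chosen for illustration: searching uses binary search, while insertion and deletion may have to shift many elements. The class name is ours; `bisect` is the standard library module for binary search in sorted lists.

```python
import bisect


class SortedListDictionary:
    """Dictionary of distinct keys kept in a sorted Python list."""

    def __init__(self):
        self._items = []

    def search(self, key):
        """Binary search: O(log n) comparisons."""
        i = bisect.bisect_left(self._items, key)
        return i < len(self._items) and self._items[i] == key

    def insert(self, key):
        """Find the position by binary search, then shift elements: O(n)."""
        i = bisect.bisect_left(self._items, key)
        if i == len(self._items) or self._items[i] != key:
            self._items.insert(i, key)

    def delete(self, key):
        """Remove key if present; the elements after it are shifted: O(n)."""
        i = bisect.bisect_left(self._items, key)
        if i < len(self._items) and self._items[i] == key:
            del self._items[i]

    def keys(self):
        return list(self._items)


d = SortedListDictionary()
for key in [7, 3, 9, 3, 1]:
    d.insert(key)
d.delete(9)
```

An unsorted list reverses the trade-off: insertion at the end is fast, but every search has to scan the list.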

Many applications require a dynamic partition of a set of n elements into a
collection of disjoint subsets. After being initialized as a collection of n one-element
subsets, the collection is subjected to a sequence of intermixed union and search
operations. This problem is called the set union problem.
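A bare-bones Python sketch of such a structure is given below. It represents each subset as a tree of parent pointers and identifies a subset by the root of its tree; this is one simple choice among several, made here without any of the usual speed-ups, and the class and method names are ours.

```python
class DisjointSets:
    """Partition of a set into disjoint subsets, stored as parent pointers."""

    def __init__(self, elements):
        # Initially every element forms its own one-element subset.
        self.parent = {x: x for x in elements}

    def find(self, x):
        """Return the representative (tree root) of the subset containing x."""
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        """Merge the subsets containing x and y (no effect if already merged)."""
        root_x, root_y = self.find(x), self.find(y)
        if root_x != root_y:
            self.parent[root_y] = root_x


ds = DisjointSets(range(1, 7))   # {1}, {2}, {3}, {4}, {5}, {6}
ds.union(1, 4)                   # {1, 4}, {2}, {3}, {5}, {6}
ds.union(5, 2)                   # {1, 4}, {5, 2}, {3}, {6}
ds.union(4, 5)                   # {1, 4, 5, 2}, {3}, {6}
ds.union(3, 6)                   # {1, 4, 5, 2}, {3, 6}
```

Two elements belong to the same subset exactly when `find` returns the same representative for both.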
You may have noticed that in our review of basic data structures we have
almost always mentioned specific operations that are typically performed on the
structure in question. This close relationship between data and operations has long been
recognized by computer scientists. In particular, it led them to the notion of an abstract
data type: a set of abstract objects representing data elements, together with a set of
operations that can be applied to these objects. Although abstract data types can
be implemented in procedural languages such as Pascal, it is more convenient to do so
in object-oriented languages such as C++ or Java, which support abstract data types
by means of classes.

Exercises 1.4
Problem 1. Describe how each of the following operations can be implemented on a
vector so that the time it takes does not depend on the size n of the vector.
a) Remove the ith element of a vector (1 ≤ i ≤ n).
b) Remove the ith element of a sorted vector (the remaining vector must of course remain
sorted).

Problem 2. If you have to solve the search problem for a list of numbers, how
can you take advantage of the fact that the list is known to be sorted? Give separate answers
for:
a) lists represented as vectors.
b) lists represented as linked lists.
Problem 3. a) Show the content of the stack after each operation in the following sequence
starting with the empty stack:
push(a), push(b), pop, push(c), push(d), pop
b) Show the contents of the queue after each operation in the following sequence,
starting with the empty queue:
enqueue(a), enqueue(b), dequeue, enqueue(c), enqueue(d), dequeue

Problem 4. a) Let A be the adjacency matrix of an undirected graph. Explain what property
of the matrix indicates that:
i. The graph is complete
ii. The graph has a loop, that is, an edge connecting a vertex to itself.
iii. The graph has an isolated vertex, that is to say a vertex that has no incident edge.

b) Answer the same questions for the representation in the form of adjacency lists.

Problem 5. Provide a complete description of an algorithm that transforms a free tree into a
tree rooted at a given vertex of the free tree.

Problem 6. Indicate how the abstract data type priority queue can be implemented as:
a) an (unsorted) vector
b) a sorted vector
c) a binary search tree.

Problem 7. How would you implement a dictionary of a reasonably small size if you
knew that all its elements are distinct? Specify an implementation of each of the
dictionary operations.

Problem 8. Anagram checking. Design an algorithm to check whether two given words
are anagrams, that is, whether one of the words can be obtained by permuting the letters of
the other.
