
Dynamic Programming Overview and Concepts

Dynamic Programming (DP) is an optimization method used to solve complex problems by breaking them down into simpler subproblems, which can be either deterministic or stochastic. Key concepts include stages, states, decision variables, and recurrence functions, with applications in resource allocation, production planning, and route optimization. The principle of optimality, established by Richard Bellman, underpins the method, allowing for efficient solutions through recursive calculations.

Session 7 - Dynamic Programming

Generalities
Dynamic Programming (DP) is an optimization method that can be applied to
different problems, some of which have already been addressed through linear
programming or the graphical method.
Depending on the nature of the parameters of the mathematical model of the problem, we
distinguish:

Deterministic DP: an optimization method used to solve various mathematical
programs. It reaches the optimal solution by dividing the problem into smaller,
more manageable subproblems, which allows for a more efficient solution.

Stochastic DP: an optimization technique used to make sequential decisions
under uncertainty, where one or more parameters of the problem are modeled by
random variables.

DP was developed by the American mathematician Richard Bellman (1920–1984)
during the 1950s, while he was working on the theory of multistage decision
processes.

SOME APPLICATIONS

Distribution and allocation of resources
Production and inventory planning
Machine replacement policy
Determination of the shortest route from one point to another
Motivation

DP takes a problem and produces a solution. Two kinds of problems are addressed:

Combinatorial: How many solutions are there?
Example: In how many ways can one go from the origin city to the destination city?

Optimization: What is the best solution?
Example: What is the route with the shortest distance from the origin city to the destination city?

Dynamic Programming is a method for solving problems that starts from:

1 The solution of subproblems

2 The combination of those solutions


So, instead of proposing a solution to the whole problem, let's consider
reduced scenarios.

Scenario 1: a single stick is left for player B. In that case, player B is the
one who has to lift the last stick, and he loses.

If this scenario occurs, it is favorable for player A: player A always wins.

Scenario 2: two sticks are left for player B. Player B lifts one stick and
leaves one to player A, so in this scenario player A loses.

Player B also wins if there are 3 left: he lifts two and leaves one to player A.

Player B also wins if there are 4 left: he lifts three and leaves one to player A.

In scenarios 2, 3 and 4, player B always wins. These are the scenarios that
player A must avoid in order to win.

Scenario 5: five sticks are left for player B. Whether player B lifts one, two
or three sticks, player A can reply by leaving player B a single stick.
Therefore scenario 5 is one where player A always wins.

The winning strategy of Player A consists of leaving the following quantities
of matchsticks on the table for Player B's turn:

1, 5, 9, 13, 17, 21, 25, 29

If player A starts the game, meaning it is his turn first, he picks up one
matchstick:
30 - 1 = 29
This gives the first number of the winning strategy.

After that, he keeps leaving the indicated quantities of matchsticks on the
table, and that is the winning strategy.

Observation: It is easier to solve simpler subproblems arising from the
original problem.
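
The list of quantities above can be reproduced by solving the small scenarios first and building up to 30. The sketch below assumes the rules implied by the scenarios (each turn a player lifts 1, 2 or 3 sticks, and whoever lifts the last stick loses); the function name `losing_positions` and its parameters are illustrative, not part of the slides.

```python
def losing_positions(total, max_take=3):
    """Return the stick counts at which the player who must move loses.

    Assumed rules: a move lifts 1..max_take sticks, and the player who
    lifts the last stick loses.
    """
    # lose[k] is True when the player to move with k sticks on the table loses
    lose = [False] * (total + 1)
    lose[1] = True  # scenario 1: forced to lift the last stick
    for k in range(2, total + 1):
        # The mover wins if some move leaves the opponent in a losing position.
        # Moves that would lift the last stick (k - t == 0) are never winning.
        can_win = any(lose[k - t] for t in range(1, max_take + 1) if k - t >= 1)
        lose[k] = not can_win
    return [k for k in range(1, total + 1) if lose[k]]


print(losing_positions(30))  # [1, 5, 9, 13, 17, 21, 25, 29]
```

Each value of `lose[k]` is computed once from the three smaller scenarios before it, which is the same reasoning used for scenarios 1 to 5.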

Basic Concepts of Dynamic Programming


Let's remember: It is easier to solve simpler subproblems arising from the
original problem and combine these solutions to solve the initial problem.

Principle of:

Let's look more closely at the characteristics and foundations of this method:

1. Properties of DP problems
2. Basic concepts of DP
3. Solution procedure
4. Summary

Properties of Dynamic Programming Problems

Dynamic Programming is a method for solving problems that starts from the
solution of subproblems and the combination of those solutions.

1 Optimal substructure

Optimal solutions of subproblems can be used to determine the optimal solution
to the problem as a whole.

2 Overlapping subproblems

The same subproblem can be used to solve several larger problems.

Let's illustrate the second property (overlapping subproblems).

Let's think of a very well-known result in mathematics: the Fibonacci series.

The Fibonacci series is an infinite mathematical sequence in which each number
is the sum of the two previous numbers. The sequence starts with the numbers 0
and 1 and, from there, each number is the sum of the two previous numbers. The
listing below is indexed from n = 1, so it begins 1, 1.

Fibonacci series: 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, …
n = 1, 2, 3, 4, 5, 6, 7, …

For an integer n, the Fibonacci number of 1 is 1.
The Fibonacci number of 2 is also 1.
The one of 3 is 2.
The one of 4 is 3.
The one of 5 is 5.
The one of 6 is 8.
The one of 7 is 13, and so on.

What this gives us is two initial conditions, Fibonacci of 1 equal to
Fibonacci of 2 equal to 1; the remaining Fibonacci numbers are obtained by
adding the two previous ones.
We can collect these observations in the following way:

Fibonacci function: Fib(n)
Two initial conditions: Fib(1) = Fib(2) = 1
Recursion formula: Fib(n) = Fib(n - 1) + Fib(n - 2)

For example, if we want to calculate the Fibonacci of 5:

Fib(5)
    Fib(4)
        Fib(3)
            Fib(2)
            Fib(1)
        Fib(2)
    Fib(3)
        Fib(2)
        Fib(1)

What we notice in this example is that Fib(2) is repeated and appears in
several branches. This is the overlapping of subproblems: Fib(2) can be stored
and reused in other instances.
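
The idea of storing a value such as Fib(2) and reusing it can be sketched as follows. The function names `fib_naive` and `fib_memo` and the call counter are illustrative; the recursion and initial conditions are the ones stated above.

```python
def fib_naive(n, counter):
    """Plain recursion; counter[0] counts how many calls are made."""
    counter[0] += 1
    if n <= 2:  # initial conditions Fib(1) = Fib(2) = 1
        return 1
    return fib_naive(n - 1, counter) + fib_naive(n - 2, counter)


def fib_memo(n, memo=None):
    """Same recursion, but each Fib(k) is stored the first time it is computed."""
    if memo is None:
        memo = {1: 1, 2: 1}  # initial conditions
    if n not in memo:
        memo[n] = fib_memo(n - 1, memo) + fib_memo(n - 2, memo)
    return memo[n]


calls = [0]
print(fib_naive(5, calls), calls[0])  # 5 9  (nine calls, one per node of the tree)
print(fib_memo(12))                   # 144
```

The nine calls of the plain version correspond to the nine nodes of the tree for Fib(5); with the stored values, each Fib(k) is computed only once.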

Basic concepts of Dynamic Programming


DP is a method that yields optimal solutions in a process divided into stages.

The following concepts are associated with this method:

Stages
States
Decision variables
Recurrence function
Principle of optimality

The first four make up the mathematical model of the problem.

We are going to describe each of these concepts:



Stages
A stage is the period of time, the place, the phase or the situation in which
a change occurs due to a decision.
Only one decision can be made at each stage, namely the one that moves the
system on to the next stage.
Graphically:

n=1 → n=2 → n=3 (evolution of the system)

This is the concept of stages. At each stage we pose a subproblem, and then we
combine the solutions of the subproblems: the idea of divide and conquer.

States

States show the current situation of the system when it is in a given stage.

Each stage has its state. Writing the state of stage n as s_n, the subscript
refers to the stage: s_2 is the state of stage 2.

Graphically:

n=1: s_1 → n=2: s_2 → n=3: s_3 (evolution of the system)

Each stage therefore has an associated state variable.



Decision variables

A decision variable corresponds to the decision made in a stage that causes a
change in the state of the system.
Each stage has its decision variable. Writing the decision variable of stage n
as x_n, the subscript refers to the stage: x_2 is the decision variable of
stage 2.

Graphically:

n=1: x_1 → n=2: x_2 → n=3: x_3 → … (evolution of the system)

Each stage has its state, and moving from one stage to the next requires a
decision: at each stage a decision variable intervenes, and it is what allows
the change of stage and of state.
Recurrence function
It describes the behavior of the system based on the states and the decision
variables.
Each stage has its own associated recurrence function, which depends on the
state and the decision variable of that stage, written here as f_n(s_n, x_n).
DP problems can be solved from the beginning to the end or from the end to the
beginning, that is, forward or backward. In general, we will work backward
(from the end to the beginning).

With this in mind, a recurrence function is associated with each stage:

n=3: f_3(s_3, x_3) depends only on stage 3.
n=2: f_2(s_2, x_2) depends on the current state and on what has already been
considered in stage 3.
n=1: f_1(s_1, x_1) depends on the current state and on what was already
considered in stages 2 and 3, since it accumulates everything obtained up to
that point.
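
One common way to write such a backward recurrence for a minimization problem is sketched below. The symbols c_n (cost of decision x_n in state s_n), t_n (the state reached in the next stage) and N (the last stage) are assumptions introduced for illustration; the slides do not define them.

```latex
\begin{aligned}
f_n(s_n, x_n) &= c_n(s_n, x_n) + f_{n+1}^{*}(s_{n+1}), \qquad s_{n+1} = t_n(s_n, x_n),\\
f_n^{*}(s_n)  &= \min_{x_n} f_n(s_n, x_n), \qquad n = N, N-1, \dots, 1,\\
f_{N+1}^{*}(s) &= 0 .
\end{aligned}
```

Read from the bottom up, stage N depends only on its own cost, while each earlier stage adds its own cost to the best value already obtained for the following stage.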
Optimality principle (Richard Bellman)
The concepts seen so far (stage, state, decision variables and recurrence
function) are part of the mathematical model. The principle of optimality,
attributed to Richard Bellman, is what sustains the method; it is the essence
of DP theory.

It justifies solving a complex problem, suitably decomposed into stages, by
means of recursive calculations: the recurrence formula lets us move from one
stage to another, using the solution already known for the neighboring stage.

The principle of optimality is stated as follows:

Given an optimal sequence of decisions, any subsequence of it is, in turn,
optimal.

This characterizes both overall optimality and the particular optimality of
each subproblem, that is, of each subsequence.
Solution procedure
NOTE: To solve a DP problem, one can proceed:

1 Backward
The terminal conditions are fixed and the numerical values are computed from
the final stage to the initial stage.

2 Forward
The initial conditions are fixed and the numerical values are computed from
the initial stage to the final stage.

In DP, the backward method is generally used.


STEPS

1 Identify the stages, states, and decision variables.

2 Describe the recurrence equations.

3 Solve:
Analysis (step-by-step solution)
Decision (combination of the stage solutions)
Example - DP (Shortest Path)
We want to travel by road from New York City to Los Angeles, California.
Our starting node is therefore New York and the destination node is Los Angeles.
The trip is organized by days, so the stages of our problem are the days.
On the first day, we expect to reach one of the following three cities: Columbus (Ohio), Louisville
(Kentucky) or Nashville (Tennessee).
On the second day, we expect to reach one of the following three cities: Omaha (Nebraska), Kansas
City (Kansas) or Dallas (Texas).
On the third day, we expect to arrive at one of the following two cities: Denver (Colorado) or San Antonio
(Texas).
On the fourth and last day, we connect with Los Angeles (California), our destination city.
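Stage 1 = Day 1, Stage 2 = Day 2, Stage 3 = Day 3, Stage 4 = Day 4.

Nodes: 1 New York; 2 Columbus; 3 Nashville; 4 Louisville; 5 Kansas City; 6 Omaha; 7 Dallas; 8 Denver; 9 San Antonio; 10 Los Angeles.

Distances by stage (miles):

Stage 1, from New York (1): Columbus 550, Nashville 900, Louisville 770
Stage 2, from Columbus (2): Kansas City 680, Omaha 790, Dallas 1050
Stage 2, from Nashville (3): Kansas City 580, Omaha 760, Dallas 600
Stage 2, from Louisville (4): Kansas City 510, Omaha 700, Dallas 830
Stage 3, from Kansas City (5): Denver 610, San Antonio 790
Stage 3, from Omaha (6): Denver 540, San Antonio 940
Stage 3, from Dallas (7): Denver 790, San Antonio 270
Stage 4, to Los Angeles (10): from Denver (8) 1030, from San Antonio (9) 1390

Note: the network figure labels the New York to Columbus leg 500, while the stage 1 calculation and the final total of 2870 miles use 550; 550 is used here.

In which cities should one spend the night each day so that the distance traveled between NY and LA
is minimal?
Although we could solve this problem with a network model, today we are going to solve it by means of
DP.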
Resolution of the problem - Backward
That is, we start from stage 4, the last leg into Los Angeles. We first
establish the elements that make up the mathematical model of DP.

DAY 4 - Stage 4 (n = 4)

States: Denver (8) and San Antonio (9).

Let f_n(s) be the length of the shortest path from city s, in stage n, to Los
Angeles. This function gathers the optimum, that is, the shortest length, and
it accumulates distances. In stage 4 the only option from each state is to
drive to Los Angeles (10):

f_4(8) = 1030 (Denver → Los Angeles)
f_4(9) = 1390 (San Antonio → Los Angeles)
DAY 3 - Stage 3

States: Kansas City (5), Omaha (6), Dallas (7). Decisions: go to Denver (8) or San Antonio (9).
Here f_n(s) is the shortest distance from node s, in stage n, to Los Angeles, and d(i, j) is the distance from node i to node j.

Kansas City (5):
KC → Denver: d(5,8) + f_4(8) = 610 + 1030 = 1640
KC → San Antonio: d(5,9) + f_4(9) = 790 + 1390 = 2180
f_3(5) = min{1640, 2180} = 1640

Omaha (6):
O → Denver: d(6,8) + f_4(8) = 540 + 1030 = 1570
O → San Antonio: d(6,9) + f_4(9) = 940 + 1390 = 2330
f_3(6) = min{1570, 2330} = 1570

Dallas (7):
D → Denver: d(7,8) + f_4(8) = 790 + 1030 = 1820
D → San Antonio: d(7,9) + f_4(9) = 270 + 1390 = 1660
f_3(7) = min{1820, 1660} = 1660

Note that we have now calculated the optimal solutions for this stage, whether we wake up in Kansas City, Omaha, or Dallas.
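
A single backward step like the one for stage 3 can be sketched in code. The distances and the stage 4 values are the ones from the example; the function name `stage_values` and the dictionary layout are illustrative choices, not part of the slides.

```python
# Stage 4 values from the example: shortest distance (miles) to Los Angeles
f4 = {8: 1030, 9: 1390}  # 8 = Denver, 9 = San Antonio

# Stage 3 distances from the example: d[(i, j)] is the distance from node i to node j
d = {
    (5, 8): 610, (5, 9): 790,  # from Kansas City
    (6, 8): 540, (6, 9): 940,  # from Omaha
    (7, 8): 790, (7, 9): 270,  # from Dallas
}


def stage_values(states, arcs, f_next):
    """One backward step: best value and best next node for each state."""
    f, best = {}, {}
    for s in states:
        # Distance of this leg plus the optimal value already known for the next stage
        options = {j: arcs[(s, j)] + f_next[j] for j in f_next if (s, j) in arcs}
        best[s] = min(options, key=options.get)
        f[s] = options[best[s]]
    return f, best


f3, next3 = stage_values([5, 6, 7], d, f4)
print(f3)     # {5: 1640, 6: 1570, 7: 1660}
print(next3)  # {5: 8, 6: 8, 7: 9}
```

The same function could be applied again to stage 2 with `f3` as `f_next`, which is what makes the calculation recursive.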
DAY 2 - Stage 2

States: Columbus (2), Nashville (3), Louisville (4). Decisions: go to Kansas City (5), Omaha (6) or Dallas (7).
Here f_n(s) is the shortest distance from node s, in stage n, to Los Angeles, and d(i, j) is the distance from node i to node j.

Columbus (2):
f_2(2) = min{ d(2,5) + f_3(5), d(2,6) + f_3(6), d(2,7) + f_3(7) }
f_2(2) = min{ 680 + 1640, 790 + 1570, 1050 + 1660 }
f_2(2) = min{ 2320, 2360, 2710 } = 2320

Nashville (3):
f_2(3) = min{ d(3,5) + f_3(5), d(3,6) + f_3(6), d(3,7) + f_3(7) }
f_2(3) = min{ 580 + 1640, 760 + 1570, 600 + 1660 }
f_2(3) = min{ 2220, 2330, 2260 } = 2220

Louisville (4):
f_2(4) = min{ d(4,5) + f_3(5), d(4,6) + f_3(6), d(4,7) + f_3(7) }
f_2(4) = min{ 510 + 1640, 700 + 1570, 830 + 1660 }
f_2(4) = min{ 2150, 2270, 2490 } = 2150

So at stage 2, whichever of the three cities (C, N or L) we start from, the optimal decision is
to head towards Kansas City.
DAY 1 - Stage 1

State: New York (1). Decisions: go to Columbus (2), Nashville (3) or Louisville (4).
Here f_n(s) is the shortest distance from node s, in stage n, to Los Angeles, and d(i, j) is the distance from node i to node j.

f_1(1) = min{ d(1,2) + f_2(2), d(1,3) + f_2(3), d(1,4) + f_2(4) }
f_1(1) = min{ 550 + 2320, 900 + 2220, 770 + 2150 }
f_1(1) = min{ 2870, 3120, 2920 } = 2870

We have now found the optimal solutions for each stage. Next we combine these solutions, which gives us the
optimal route.
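Optimal route

Starting from New York, we follow at each stage the decision that attains the minimum (f_n(s) is the shortest distance from node s, in stage n, to Los Angeles; d(i, j) is the distance from node i to node j):

Stage 1, New York (1):
d(1,2) + f_2(2) = 550 + 2320 = 2870
d(1,3) + f_2(3) = 900 + 2220 = 3120
d(1,4) + f_2(4) = 770 + 2150 = 2920
f_1(1) = 2870, so NY → C

Stage 2, Columbus (2):
d(2,5) + f_3(5) = 680 + 1640 = 2320
d(2,6) + f_3(6) = 790 + 1570 = 2360
d(2,7) + f_3(7) = 1050 + 1660 = 2710
f_2(2) = 2320, so C → KC

Stage 3, Kansas City (5):
d(5,8) + f_4(8) = 610 + 1030 = 1640
d(5,9) + f_4(9) = 790 + 1390 = 2180
f_3(5) = 1640, so KC → Denver

Stage 4, Denver (8):
f_4(8) = 1030, so Denver → LA

These are the nodes, or cities, that we need to connect in our optimal route: New York (1) → Columbus (2) → Kansas City (5) → Denver (8) → Los Angeles (10).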

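Optimal route on the map: New York → Columbus (550) → Kansas City (680) → Denver (610) → Los Angeles (1030)
z = 550 + 680 + 610 + 1030 = 2870 miles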
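
The whole example can be checked end to end with a short sketch. It uses a memoized recursion, which computes the same values as the backward tables, just on demand. The distances are those of the example, with 550 for New York to Columbus as in the stage 1 calculation; the names `ARCS`, `shortest` and `route` are illustrative.

```python
from functools import lru_cache

NAMES = {1: "New York", 2: "Columbus", 3: "Nashville", 4: "Louisville",
         5: "Kansas City", 6: "Omaha", 7: "Dallas", 8: "Denver",
         9: "San Antonio", 10: "Los Angeles"}

# ARCS[i][j] = distance in miles from node i to node j, as used in the example
ARCS = {
    1: {2: 550, 3: 900, 4: 770},  # 550 is the value used in the calculations
    2: {5: 680, 6: 790, 7: 1050},
    3: {5: 580, 6: 760, 7: 600},
    4: {5: 510, 6: 700, 7: 830},
    5: {8: 610, 9: 790},
    6: {8: 540, 9: 940},
    7: {8: 790, 9: 270},
    8: {10: 1030},
    9: {10: 1390},
}
DEST = 10


@lru_cache(maxsize=None)
def shortest(node):
    """Return (shortest distance from node to DEST, best next node)."""
    if node == DEST:
        return 0, None
    # Leg distance plus the optimal value of the node reached
    return min((miles + shortest(nxt)[0], nxt) for nxt, miles in ARCS[node].items())


def route(start):
    """Follow the stored best decisions from start to DEST."""
    path = [start]
    while path[-1] != DEST:
        path.append(shortest(path[-1])[1])
    return path


total = shortest(1)[0]
path = route(1)
print(total)                                # 2870
print(" -> ".join(NAMES[n] for n in path))  # New York -> Columbus -> Kansas City -> Denver -> Los Angeles
```

Because `shortest` is cached, each city's value is computed once and reused by every city that can reach it, which is the overlapping-subproblems property at work.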
CONCLUSIONS

• Dynamic programming is a very useful technique for making interrelated decisions in
complex situations. It divides the problem into smaller, more manageable subproblems,
which allows for a more efficient solution.

• Dynamic programming is especially useful in situations where uncertainty plays
an important role. It allows for sequential decision-making under uncertainty,
where one or more parameters of the problem are modeled by random variables.

• Dynamic programming has many applications in operations research,
including production planning and control, the management of assets and liabilities of
pension funds, and the scheduling of operations in factories, among others.
It allows for the optimization of inventory levels, supply chain management,
production planning and informed decision-making in general.
