AMS 556, Dynamic Programming
Stochastic and deterministic multistage optimization problems. Stochastic path problems.
Principle of optimality. Recursive and functional equations. Method of successive
approximations and policy iteration. Applications to finance, economics, inventory
control, maintenance, inspection, and replacement problems.
Prerequisite: AMS 507
3 credits, ABCF grading
Text: Lecture Notes available on Brightspace
Fall Semester
Learning Outcomes:
1) Demonstrate an understanding of finite-horizon dynamic programming problems and the
general principles of sequential optimization:
* Formulate dynamic programming problems;
* Define the various classes of policies;
* State the definitions of optimal and nearly optimal policies;
* Apply basic transformation and representation methods, such as the equivalence
of general and memoryless policies, the equivalence of randomized and mixed policies,
and the sufficiency of pure Markov policies;
* State and apply the principle of optimality and the optimality equations (see the
sketch following this list).
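As a concrete reference for the optimality equations, here is the standard finite-horizon
(backward-induction) form; the notation (states x, actions a in A(x), one-step rewards r,
transition probabilities p, horizon N, terminal reward g) is a generic choice and may
differ from the lecture notes:

    V_N(x) = g(x),
    V_n(x) = \max_{a \in A(x)} \Big[ r(x,a) + \sum_{y} p(y \mid x,a)\, V_{n+1}(y) \Big],
             n = N-1, \dots, 0.

When the maxima are attained, the maximizing actions at each stage define an optimal pure
Markov policy, which is the content of the principle of optimality in the finite-horizon
case.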
2) Demonstrate the ability to analyze infinite-horizon problems with total-reward criteria:
* Formulate the properties of positive, negative, discounted, and convergent dynamic
programming models;
* State necessary and sufficient conditions for optimality;
* Solve problems by value iteration and linear programming (see the sketch following
this list);
* Produce numerical solutions with a computer.
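A minimal computational sketch of value iteration for a discounted model is given below in
Python with NumPy. The two-state, two-action data (P, r, discount factor beta) are made up
for illustration only and are not taken from the lecture notes or assignments.

    import numpy as np

    # Made-up discounted MDP: P[a, s, t] = probability of moving from state s to
    # state t under action a; r[s, a] = one-step reward for action a in state s.
    P = np.array([[[0.9, 0.1],
                   [0.2, 0.8]],
                  [[0.5, 0.5],
                   [0.1, 0.9]]])
    r = np.array([[1.0, 2.0],
                  [0.0, 0.5]])
    beta = 0.9      # discount factor
    eps = 1e-8      # target accuracy

    # Value iteration (method of successive approximations):
    #   V_{k+1}(s) = max_a [ r(s, a) + beta * sum_t P(t | s, a) V_k(t) ].
    V = np.zeros(P.shape[1])
    while True:
        V_new = (r + beta * np.einsum("ast,t->sa", P, V)).max(axis=1)
        if np.max(np.abs(V_new - V)) < eps * (1.0 - beta) / (2.0 * beta):
            V = V_new
            break
        V = V_new

    # Greedy (eps-optimal) stationary pure Markov policy for the computed values.
    Q = r + beta * np.einsum("ast,t->sa", P, V)
    print("value function:", V)
    print("greedy policy:", Q.argmax(axis=1))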
3) Demonstrate the ability to analyze infinite-horizon problems with average-reward
criteria:
* Solve problems with finite state and action sets by linear programming and policy
iteration methods (see the sketch following this list);
* Analyze problems with infinite state spaces;
* State the optimality equations and inequalities;
* Solve problems with multiple criteria and constraints by linear programming.
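Below is a minimal sketch, assuming a unichain model, of the standard linear program over
state-action frequencies for the average-reward criterion, solved with
scipy.optimize.linprog. The MDP data are the same made-up two-state, two-action example
used above and are illustrative only.

    import numpy as np
    from scipy.optimize import linprog

    P = np.array([[[0.9, 0.1],
                   [0.2, 0.8]],
                  [[0.5, 0.5],
                   [0.1, 0.9]]])        # P[a, s, t]
    r = np.array([[1.0, 2.0],
                  [0.0, 0.5]])          # r[s, a]
    nA, nS, _ = P.shape

    # Variables x[s, a] >= 0: long-run state-action frequencies.
    # LP (valid for unichain models):
    #   max  sum_{s,a} r(s,a) x(s,a)
    #   s.t. sum_a x(j,a) - sum_{s,a} P(j|s,a) x(s,a) = 0  for every state j,
    #        sum_{s,a} x(s,a) = 1,  x >= 0.
    c = -r.reshape(-1)                        # linprog minimizes, so negate rewards
    A_eq = np.zeros((nS + 1, nS * nA))
    for j in range(nS):
        for s in range(nS):
            for a in range(nA):
                col = s * nA + a
                A_eq[j, col] = (1.0 if s == j else 0.0) - P[a, s, j]
    A_eq[nS, :] = 1.0                         # frequencies sum to one
    b_eq = np.zeros(nS + 1)
    b_eq[nS] = 1.0

    res = linprog(c, A_eq=A_eq, b_eq=b_eq, method="highs")
    x = res.x.reshape(nS, nA)
    print("optimal average reward:", -res.fun)
    print("state-action frequencies:\n", x)
    # A stationary optimal policy chooses, in each recurrent state, an action
    # with x[s, a] > 0.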
4) Demonstrate familiarity with major application areas:
* Modeling and optimization of production and service operations, including inventory
and queueing control;
* Applications to quantitative finance;
* Statistical applications such as multi-armed bandits and optimal selection (see the
sketch following this list);
* Neuro-dynamic programming and other approximation methods.
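As one small illustration of the optimal-selection topic, the sketch below solves the
classical secretary problem by backward induction; it is a generic textbook example, not
material from the lecture notes.

    def secretary(n):
        """Return (optimal win probability, threshold r) for n applicants.

        V[j]: optimal win probability when applicant j is the best seen so far.
        W[j]: optimal win probability after rejecting the first j applicants.
        Recursions: W[j] = (j/(j+1)) W[j+1] + (1/(j+1)) V[j+1],
                    V[j] = max(j/n, W[j]),  with W[n] = 0 and V[n] = 1.
        """
        V = [0.0] * (n + 1)
        W = [0.0] * (n + 1)
        V[n] = 1.0
        for j in range(n - 1, 0, -1):
            W[j] = (j / (j + 1)) * W[j + 1] + (1.0 / (j + 1)) * V[j + 1]
            V[j] = max(j / n, W[j])
        # Optimal rule: reject the first r-1 applicants, then accept the first
        # applicant who is the best seen so far.
        r = next(j for j in range(1, n + 1) if j / n >= W[j])
        return V[1], r

    print(secretary(100))   # win probability close to 1/e, threshold near n/e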