What They Learn
[Figure: agents A1, A2, A3, and A4 and the World, linked by state, action(s), and reward, shown at three modeling levels: 0-Level, 1-Level, and 2-Level.]
Not feasible to implement. Approximate?
Models others as State->Action probability pairs
Reinforcement Learning
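
One way to read "State->Action probability pairs" is as a table of observed action frequencies per state, kept separately for each other agent. The sketch below is only an illustration under that assumption; the class name `ActionModel`, the states `s0`/`s1`, and the actions `left`/`right` are hypothetical and do not come from the slides.

```python
from collections import Counter, defaultdict


class ActionModel:
    """Hypothetical model of one other agent as state -> action probabilities."""

    def __init__(self):
        # counts[state][action] = times the action was observed in that state
        self.counts = defaultdict(Counter)

    def observe(self, state, action):
        """Record that the modeled agent took `action` in `state`."""
        self.counts[state][action] += 1

    def probabilities(self, state):
        """Return estimated action probabilities for `state` (empty if unseen)."""
        seen = self.counts.get(state)
        if not seen:
            return {}
        total = sum(seen.values())
        return {action: n / total for action, n in seen.items()}


# Assumed example: A1 watches A2 act and updates its model of A2.
model_of_a2 = ActionModel()
for state, action in [("s0", "left"), ("s0", "left"), ("s0", "right"), ("s1", "right")]:
    model_of_a2.observe(state, action)

print(model_of_a2.probabilities("s0"))
```

An agent holding one such table per neighbor could use the estimated probabilities to anticipate the others' actions before choosing its own; how those estimates feed into the agent's decision is not specified on the slide.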