policy {pomdp}    R Documentation
Extract the Policy from a Solved POMDP/MDP

Description

Extracts the policy from a solved POMDP/MDP.
Usage

policy(x)
Arguments

x: A solved POMDP object.
Value

A list (one entry per epoch) with the optimal policy. For converged, infinite-horizon problems, the list contains only the single converged solution. Each policy is a data.frame consisting of:

Part 1: The value function with one column per state. For POMDPs these columns are the alpha vectors; for MDPs this is a single column with the state.

Part 2: One column with the optimal action.
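The two parts can be combined to evaluate a converged policy by hand. The following is a minimal sketch using the Tiger problem from the examples below; it assumes that the first two columns hold the alpha-vector entries for Tiger's two states and that the action column is named action. The optimal action for a belief is the action of the alpha vector that maximizes the value at that belief:

data("Tiger")
sol <- solve_POMDP(model = Tiger)

# converged solution: the list holds a single policy data.frame
pol <- policy(sol)[[1]]

# Part 1: alpha vectors (assumed to be the first two columns, one per state)
alpha <- as.matrix(pol[, 1:2])

# value of each alpha vector at a uniform belief; the action of the
# maximizing vector is the optimal action for that belief
belief <- c(0.5, 0.5)
pol$action[which.max(alpha %*% belief)]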
Author(s)

Michael Hahsler
See Also

Other policy: optimal_action(), plot_value_function(), policy_graph(), reward(), solve_POMDP(), solve_SARSOP()
data("Tiger")
# Infinite horizon
sol <- solve_POMDP(model = Tiger)
sol
# The policy contains the value function, the optimal action, and the
# transitions for observations.
policy(sol)
plot_value_function(sol)
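# A sketch of using the solution directly (the belief argument and the
# uniform initial belief are assumptions here): look up the optimal
# action and the expected reward for that belief.
optimal_action(sol, belief = c(0.5, 0.5))
reward(sol, belief = c(0.5, 0.5))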
# Finite horizon (we use incremental pruning because grid does not converge)
sol <- solve_POMDP(model = Tiger, method = "incprune", horizon = 3, discount = 1)
sol
policy(sol)
# Note: We see that it is initially better to listen until we make a decision in the final epoch.
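# A sketch of inspecting individual epochs: the returned list has one
# entry per epoch of the finite horizon.
pol <- policy(sol)
pol[[1]]  # first epoch (listening is optimal initially)
pol[[3]]  # final epoch (the open-door decision is made here)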