policy_graph {pomdp} | R Documentation |
The function creates and plots the POMDP policy graph for a converged POMDP solution and the
policy tree for a finite-horizon solution.
The plot is produced using plot in igraph with appropriate plotting options.
policy_graph(x, belief = NULL, show_belief = TRUE, col = NULL, ...)
plot_policy_graph(
  x,
  belief = NULL,
  show_belief = TRUE,
  legend = TRUE,
  engine = c("igraph", "visNetwork"),
  col = NULL,
  ...
)
estimate_belief_for_nodes(x, epoch = 1, ...)
x |
object of class POMDP containing a solved and converged POMDP problem. |
belief |
the initial belief is used to mark the initial belief state in the
graph of a converged solution and to identify the root node in a policy graph for a finite-horizon solution.
If |
show_belief |
logical; estimate belief proportions? If |
col |
colors used for the states. |
... |
parameters are passed on to |
legend |
logical; display a legend for the colors used for the belief proportions? |
engine |
The plotting engine to be used. For |
epoch |
estimate the belief for nodes in this epoch. Use 1 for converged policies. |
Each policy graph node represents a segment (or part of a hyperplane) of the value function. Each node represents one or more belief states. If available, a pie chart (or the color) in each node represents the central belief of the belief states belonging to the node (i.e., the center of the hyperplane segment). This can help with interpreting the policy graph.
For a converged POMDP solution, a graph is produced; for a finite-horizon solution, a policy tree is produced. The levels of the tree and the first number in the node label represent the epochs. Many algorithms produce unused policy graph nodes, which are filtered out to produce a clean tree structure. Non-converged policies depend on the initial belief, so if an initial belief is specified, different nodes will be filtered and the tree will look different.
First, the policy in the solved POMDP is converted into an igraph object using policy_graph().
Average beliefs for the graph nodes are estimated using estimate_belief_for_nodes(), and then the igraph
object is visualized using the plotting function igraph::plot.igraph() or,
for interactive graphs, visNetwork::visIgraph().
estimate_belief_for_nodes() estimates the central belief for each node/segment of the value function
by generating/sampling a large set of possible belief points, assigning them to the segments and then averaging
the belief over the points assigned to each segment.
Additional parameters like method and the sample size n are passed on to sample_belief_space().
If no belief point is generated for a segment, then a
warning is produced. In this case, the number of sampled points can be increased.
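The sample, assign and average procedure described above can be sketched in base R. This is only an illustration under stated assumptions and not the package's implementation: the alpha vectors in `alpha` are made-up numbers for a hypothetical two-state problem, beliefs are sampled uniformly, and a belief is assigned to the segment whose alpha vector gives it the highest value.

```r
# Hypothetical alpha vectors (rows = segments/nodes, columns = states).
# The numbers are invented for illustration only.
alpha <- rbind(
  c(10, -5),
  c( 4,  4),
  c(-5, 10)
)

set.seed(42)
n <- 1000

# Sample belief points uniformly from the two-state belief simplex.
p <- runif(n)
beliefs <- cbind(s1 = p, s2 = 1 - p)

# Value of every belief under every alpha vector (n x 3 matrix).
values <- beliefs %*% t(alpha)

# Assign each belief to the segment with the largest value.
segment <- max.col(values, ties.method = "first")

# Average the beliefs assigned to each segment to get the central belief.
central <- t(sapply(seq_len(nrow(alpha)), function(i) {
  colMeans(beliefs[segment == i, , drop = FALSE])
}))
central
```

In this sketch, a segment that receives no sampled belief point would get `NaN` entries in `central`, which corresponds to the situation in which the function warns and a larger sample size is advisable.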
policy_graph() returns the policy graph as an igraph object.
plot_policy_graph() returns invisibly what the plotting engine returns.
estimate_belief_for_nodes() returns a matrix with the central belief for each node.
Other policy: optimal_action(), plot_value_function(), policy(), reward(), solve_POMDP(), solve_SARSOP()
data("Tiger")
## policy graphs for converged solutions
sol <- solve_POMDP(model = Tiger)
sol
policy_graph(sol)
## visualization
plot_policy_graph(sol)
## use a different graph layout (circle and manual; needs igraph)
library("igraph")
plot_policy_graph(sol, layout = layout_in_circle)
plot_policy_graph(sol, layout = rbind(c(1,1), c(1,-1), c(0,0), c(-1,-1), c(-1,1)))
## hide labels and legend
plot_policy_graph(sol, edge.label = NA, vertex.label = NA, legend = FALSE)
## add a plot title
plot_policy_graph(sol, main = sol$name)
## custom larger vertex labels (A, B, ...)
plot_policy_graph(sol,
  vertex.label = LETTERS[1:nrow(policy(sol)[[1]])],
  vertex.label.cex = 2,
  vertex.label.color = "white")
## plotting the igraph object directly
## (e.g., using the graph in the layout and to change the edge curvature)
pg <- policy_graph(sol)
plot(pg,
  layout = layout_as_tree(pg, root = 3, mode = "out"),
  edge.curved = curve_multiple(pg, .2))
## change labels
plot(pg,
  edge.label = abbreviate(E(pg)$label),
  vertex.label = V(pg)$label,
  vertex.size = 20)
## plot interactive graphs using the visNetwork library.
## Note: the pie chart representation is not available, but colors are used instead.
plot_policy_graph(sol, engine = "visNetwork")
## add smooth edges and a layout (note, engine can be abbreviated)
plot_policy_graph(sol, engine = "visNetwork", layout = "layout_in_circle", smooth = TRUE)
## estimate the central belief for the graph nodes. We use the default random sampling method with
## a sample size of n = 100.
estimate_belief_for_nodes(sol, n = 100)
## policy trees for finite-horizon solutions
sol <- solve_POMDP(model = Tiger, horizon = 4, method = "incprune")
policy_graph(sol)
plot_policy_graph(sol)
# Note: the first number in the node id is the epoch.
# plot the policy tree for an initial belief of 90% that the tiger is to the left
plot_policy_graph(sol, belief = c(0.9, 0.1))