Now aligns with the default implementation in explicit.MDP: if there is a choice consisting of a self-loop, it produces value 0 for a zero-reward state/choice and infinite value for a positive-reward state/choice (the latter should be caught earlier, in precomputations).

The previous behaviour (self-loops have infinite value) breaks maximal total reward computations with Gauss-Seidel. For example, the following run from the PRISM test suite fails:

    prism functionality/verify/mdps/rewards/total-reward-2.nm functionality/verify/mdps/rewards/total-reward-2.nm.props -explicit -gs -test -prop 3

Also, align comments in MDP.mvMultRewJacMinMaxSingle.
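The self-loop rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the PRISM source: the class `SelfLoopJacobi`, the method `jacobiChoiceValue`, and the array-based representation of a choice are hypothetical names introduced here. The sketch computes the Jacobi-style value of one choice by moving the self-loop probability to the diagonal, and falls back to the 0 / infinity rule when the choice is a pure self-loop.

```java
// Hypothetical sketch of the self-loop handling; not PRISM's actual code.
final class SelfLoopJacobi {

    /**
     * Jacobi-style value of a single choice in state s.
     *
     * succ[i] is the i-th successor state and prob[i] its probability
     * (an assumed, simplified representation of one choice).
     * reward is the combined state/choice reward, vect the current solution vector.
     */
    static double jacobiChoiceValue(int s, int[] succ, double[] prob,
                                    double reward, double[] vect) {
        double diag = 1.0;   // becomes 1 - P(s, s)
        double d = reward;   // reward plus the contribution of non-self successors
        for (int i = 0; i < succ.length; i++) {
            if (succ[i] == s) {
                diag -= prob[i];
            } else {
                d += prob[i] * vect[succ[i]];
            }
        }
        if (diag > 0) {
            // Normal case: solve v = d + P(s, s) * v for v.
            return d / diag;
        }
        // The choice consists only of a self-loop:
        // zero reward gives value 0, positive reward gives infinite value.
        return d > 0 ? Double.POSITIVE_INFINITY : 0.0;
    }

    public static void main(String[] args) {
        double[] vect = {0.0, 3.0};
        // Pure self-loop, zero reward
        System.out.println(jacobiChoiceValue(0, new int[]{0}, new double[]{1.0}, 0.0, vect));
        // Pure self-loop, positive reward
        System.out.println(jacobiChoiceValue(0, new int[]{0}, new double[]{1.0}, 2.0, vect));
        // Partial self-loop: (1 + 0.5 * 3) / (1 - 0.5)
        System.out.println(jacobiChoiceValue(0, new int[]{0, 1}, new double[]{0.5, 0.5}, 1.0, vect));
    }
}
```

Returning 0 rather than infinity for a zero-reward self-loop matters when maximising: an infinite value for such a choice would always win the maximum, even though staying in the loop accumulates no reward.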
Committed by Dave Parker
2 changed files with 15 additions and 4 deletions