Releases: neoplites/pomdp
Shared 2D Matrix maximum finder
The problem: Set up a two-dimensional matrix and find its maximum value.
The algorithm: Two learning agents generate matrix coordinates independently, one along each dimension of the matrix. Each agent associates a probability with each of its possible coordinate choices and updates those probabilities on every iteration. The search has converged once each agent's probability for one particular coordinate approaches 1.0; the matrix entry at that pair of coordinates is reported as the maximum. This approach does not guarantee finding the global maximum, because it can get stuck in a local maximum.
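The description above does not say which update rule the agents use, so the following is only a minimal sketch under stated assumptions, not the repository's actual implementation. It assumes a linear reward-inaction update, a reward scaled by the smallest and largest values seen so far, and hypothetical names and parameters (`find_max`, `rate`, `threshold`, `max_iters`, `seed`).

```python
import random


def find_max(matrix, rate=0.05, threshold=0.99, max_iters=20_000, seed=0):
    """Two independent agents pick a row and a column; both are rewarded
    in proportion to the value found at that cell (assumed update rule)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])

    # Each agent starts with a uniform distribution over its own dimension.
    p_row = [1.0 / n_rows] * n_rows
    p_col = [1.0 / n_cols] * n_cols

    # Running bounds of observed values, used to scale rewards into [0, 1].
    lo_seen, hi_seen = float("inf"), float("-inf")

    def reinforce(probs, chosen, reward):
        # Linear reward-inaction: shift probability mass toward the chosen
        # index; the probabilities still sum to 1 after the update.
        step = rate * reward
        for k in range(len(probs)):
            if k == chosen:
                probs[k] += step * (1.0 - probs[k])
            else:
                probs[k] -= step * probs[k]

    for _ in range(max_iters):
        # The agents choose their coordinates independently of each other.
        i = rng.choices(range(n_rows), weights=p_row)[0]
        j = rng.choices(range(n_cols), weights=p_col)[0]

        value = matrix[i][j]
        lo_seen, hi_seen = min(lo_seen, value), max(hi_seen, value)
        span = hi_seen - lo_seen
        reward = (value - lo_seen) / span if span > 0 else 0.0

        reinforce(p_row, i, reward)
        reinforce(p_col, j, reward)

        # Converged: each agent is almost certain about one coordinate.
        if max(p_row) >= threshold and max(p_col) >= threshold:
            break

    i = p_row.index(max(p_row))
    j = p_col.index(max(p_col))
    return (i, j), matrix[i][j]


if __name__ == "__main__":
    grid = [[1, 2, 3], [4, 9, 5], [6, 7, 8]]
    print(find_max(grid))
```

The returned value is always an entry of the matrix, but it is not necessarily the largest one: depending on the random seed and the learning rate, the two agents may settle on a local maximum. A smaller `rate` generally explores longer before committing, at the cost of more iterations.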