Releases: neoplites/pomdp

Shared 2D Matrix maximum finder

15 Oct 18:58

The problem: Set up a two-dimensional matrix and find its maximum value.
The algorithm: Two learning agents generate matrix coordinates independently, one along each dimension of the matrix. Each agent associates a probability with each of its coordinate choices and updates those probabilities on every iteration. The solution has converged once the probability of one particular choice of coordinates approaches 1.0, and that choice is reported as the maximum. This approach does not guarantee finding the global maximum: it can get stuck in a local maximum.
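
The description above can be sketched roughly as follows. This is a minimal illustration, not the code shipped in this release: the text does not specify the update rule, so the sketch assumes a linear reward-inaction scheme, assumes matrix values are already scaled to the range [0, 1] so that the sampled value can serve directly as the reward, and uses hypothetical names and parameters (`find_max`, `rate`, `threshold`, `max_iters`, `seed`).

```python
import random


def find_max(matrix, rate=0.05, threshold=0.99, max_iters=100_000, seed=0):
    """Two independent agents search a matrix for a large entry.

    Assumptions (not taken from the release notes): entries lie in [0, 1],
    and probabilities are updated with a linear reward-inaction rule.
    Returns (row, col, value) for the most probable coordinates.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])

    # Each agent starts with a uniform distribution over its own dimension.
    p_row = [1.0 / n_rows] * n_rows
    p_col = [1.0 / n_cols] * n_cols

    def reinforce(probs, chosen, reward):
        # Shift probability mass toward the chosen index in proportion to
        # the reward. The total stays at 1 because the gain of the chosen
        # index equals the combined loss of the others.
        for k in range(len(probs)):
            target = 1.0 if k == chosen else 0.0
            probs[k] += rate * reward * (target - probs[k])

    for _ in range(max_iters):
        # The agents pick their coordinates independently of each other.
        i = rng.choices(range(n_rows), weights=p_row)[0]
        j = rng.choices(range(n_cols), weights=p_col)[0]

        # Both agents receive the same shared reward: the sampled value.
        reward = matrix[i][j]
        reinforce(p_row, i, reward)
        reinforce(p_col, j, reward)

        # Converged: each agent is almost certain about its coordinate.
        if max(p_row) >= threshold and max(p_col) >= threshold:
            break

    best_i = p_row.index(max(p_row))
    best_j = p_col.index(max(p_col))
    return best_i, best_j, matrix[best_i][best_j]


if __name__ == "__main__":
    grid = [
        [0.10, 0.20, 0.30],
        [0.40, 0.90, 0.50],
        [0.20, 0.60, 0.80],
    ]
    print(find_max(grid))
```

Because each agent only sees the shared reward and never the other agent's choice, the pair can settle on a cell such as 0.80 in the example grid instead of 0.90, which is the local-maximum behaviour mentioned above. A smaller `rate` slows convergence but generally makes that outcome less likely.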