Open
Description
Currently, fictitious play stores count vectors and recomputes each time the payoff of each move by matrix multiplication before selecting the best move.
-
move = payoff.argmax()
costs$\mathcal{O}(n)$ -
count[move] += 1
costs$\mathcal{O}(1)$ -
payoff = matrix @ count
costs$\mathcal{O}(n * m)$
Instead, one could just update the payoff vector with payoff += matrix[:, move]
in
Even better, the matrix can be stored transposed to be able to write matrix[move]
which is more cache efficient.
Metadata
Metadata
Assignees
Labels
No labels