Hi tocom242242, sorry to interrupt you. I found that you have done really great work on MARL, and I have been following your GitHub recently. I'm interested in this minimax_q_learning repo; may I ask you a quick question about "state"?
It seems like in your code there is only one state, the default state "nonstate", and you use dicts to save the Q, pi, and V matrices separately for each state. It runs correctly when there is one state, but when I tried multiple states, I couldn't tell how the state influences the output. I was wondering whether this "state" is the same as the state in Q-learning, since I guess the state in Q-learning would be something like a combination of the opponent's previous action and my previous action. Based on a state S(a, a'), the Q matrix can tell me that in state S1, if I choose my action a1, the Q value would be xxx, and if I choose action a2, the Q value would be yyy. But when I try to understand the state in your repo, it seems that each state has its own Q matrix, and the state is determined only by the opponent's action.
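To check that I'm reading the data structures correctly, here is a rough sketch of what I think the per-state dicts look like. This is only my own guess, not your actual code: the action counts, the state keys, and the simplified policy/value update are all my assumptions.

```python
import numpy as np
from collections import defaultdict

# Sketch only: every state key gets its own Q matrix (my action x opponent action),
# its own mixed policy pi over my actions, and its own value V, kept in separate dicts.
N_MY_ACTIONS = 2
N_OPP_ACTIONS = 2
ALPHA, GAMMA = 0.1, 0.95

q = defaultdict(lambda: np.zeros((N_MY_ACTIONS, N_OPP_ACTIONS)))     # Q[s][a, o]
pi = defaultdict(lambda: np.full(N_MY_ACTIONS, 1.0 / N_MY_ACTIONS))  # policy per state
v = defaultdict(float)                                               # V[s]

def update(state, my_action, opp_action, reward, next_state):
    """One minimax-Q style update, bootstrapping from V of the next state."""
    q[state][my_action, opp_action] += ALPHA * (
        reward + GAMMA * v[next_state] - q[state][my_action, opp_action]
    )
    # Minimax-Q would solve a small linear program here to pick pi[state] that
    # maximizes the worst-case expected payoff; as a placeholder I only evaluate
    # the current pi against the worst opponent reply.
    v[state] = float((pi[state] @ q[state]).min())

# My assumption: the state is the pair (my previous action, opponent's previous action).
update(state=("a1", "a2'"), my_action=0, opp_action=1, reward=1.0, next_state=("a2", "a1'"))
print(q[("a1", "a2'")], v[("a1", "a2'")])
```

Is this roughly how the states are meant to be keyed, or is the state supposed to depend only on the opponent's action?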
I would be grateful if you could let me know how the state works. I really appreciate it!