Wrong info in Gym tutorial #239

Open
njustesen opened this issue Jun 10, 2022 · 6 comments

Comments

@njustesen
Owner

I don't believe this is true in https://njustesen.github.io/botbowl/gym.html:

"The action space is discrete, the action is an int in the range 0 <= action_idx < len(action_mask)."

@mrbermell
Collaborator

Can you elaborate? From what I can tell it's correct.

@njustesen
Owner Author

njustesen commented Jun 10, 2022

If I can choose between blocking player A or player B, the sentence in the tutorial suggests that my action integer can be 0 or 1. But since blocking is a spatial action, the integer has to be at least the number of non-spatial actions to get past if action_idx < len(self.env_conf.simple_action_types): in _compute_action(self, action_idx: Optional[int], flip: Optional[bool] = None) -> List[Optional[Action]]:.

Instead, the integer is in the range [0, len(action_space)] which is implicit.
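
To illustrate the point, here is a minimal sketch of how a flat action index could be decoded. It assumes a layout in which the non-spatial action types come first and each spatial action type then occupies one block of width * height entries. The function decode_action, the action type names, and the board size are hypothetical and are not taken from botbowl's code.

```python
def decode_action(action_idx, simple_types, spatial_types, width, height):
    """Map a flat index to (action_type, position) under the ASSUMED layout:
    all non-spatial types first, then one width*height block per spatial type."""
    total = len(simple_types) + len(spatial_types) * width * height
    if not 0 <= action_idx < total:
        raise IndexError(f"action_idx {action_idx} outside [0, {total})")

    # Non-spatial actions occupy the lowest indices and carry no position.
    if action_idx < len(simple_types):
        return simple_types[action_idx], None

    # Everything above that is spatial: find the type block, then the cell.
    spatial_idx = action_idx - len(simple_types)
    type_idx, cell = divmod(spatial_idx, width * height)
    y, x = divmod(cell, width)
    return spatial_types[type_idx], (x, y)


# Hypothetical action types on a tiny 4x3 board, for illustration only.
simple = ["END_TURN", "USE_REROLL"]
spatial = ["MOVE", "BLOCK"]

low = decode_action(1, simple, spatial, width=4, height=3)
high = decode_action(19, simple, spatial, width=4, height=3)
print(low)
print(high)
```

Under these assumptions, index 1 is a non-spatial action, while a block on a specific square only appears at an index past the non-spatial ones, regardless of how few actions are currently legal.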

@njustesen
Owner Author

Unless len(action_mask)=len(action_space) but that's just confusing, right?

@mrbermell
Collaborator

Thanks for the clarification, I see your point and agree. We should explain how the action mask works here. I'll see what I can do!

@mrbermell
Collaborator

How about something along these lines?

Action space

In botbowl's core engine, all actions have a type, and some types also require a position. Read more about actions in the scripted bot tutorials. The gym environment unrolls the spatial dimension into a one-dimensional action space (see picture below). This makes it easy to use state-of-the-art algorithms, but note that the resulting action space is orders of magnitude larger than in many standard reinforcement learning benchmarks.

The action passed to the environment is an integer, let's say action_idx = 352. You call env.step(action_idx) to step the environment with your action. But not all actions are legal at all times; this is where the action mask comes in. The action_mask is a vector of booleans that marks the legal actions. To check whether your action action_idx is legal, simply check that action_mask[action_idx] is true.
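
As a rough sketch of how the mask might be used, the snippet below picks a random legal index from a boolean mask and then performs the legality check described above. The helper sample_legal_action and the toy mask are hypothetical stand-ins; in practice the mask would come from the environment, and the chosen index would be passed to env.step(action_idx).

```python
import random


def sample_legal_action(action_mask, rng=random):
    """Return a uniformly random index i for which action_mask[i] is truthy."""
    legal = [i for i, ok in enumerate(action_mask) if ok]
    if not legal:
        raise ValueError("action mask contains no legal actions")
    return rng.choice(legal)


# Toy mask standing in for the one provided by the environment (assumption).
action_mask = [False, True, False, False, True, False]

action_idx = sample_legal_action(action_mask)

# The legality check from the text: the mask entry at action_idx must be true.
assert action_mask[action_idx]
```

Note that the valid range of action_idx is set by the length of the mask, not by the number of currently legal actions: the mask has one entry per index of the unrolled action space.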

image

@njustesen
Owner Author

This is better!

If the scripted bot tutorials contain important info about the action space, I think it should be included here. What are the paragraphs you are thinking of?
