Skip to content

ReplayBuffer - a subtle bug around head pointer boundary #52

Open
@gordicaleksa

Description

@gordicaleksa

Hi!

Looking at the _encode_observation function it seems you have a subtle bug in there.

Namely, you're only handling the start_idx edge case where the buffer is still not full and start_idx is negative.

But even in the case where the buffer is full but start_idx crosses the buffer's head pointer boundary you'll be stacking fresh experience with super old experience (especially in a 1M slot buffer).

The bigger the buffer the less probable this event would be, and even if it happened, since it's a low-frequency event it won't affect the Q-learning function but I still thought flagging it.

Since I'm implementing my own version of DQN here is a snippet of how I handle the start index:

    def _handle_start_index_edge_cases(self, start_index, end_index):
      # Edge case 1:
      if not self._buffer_full() and start_index < 0:
          start_index = 0

      # Edge case 2:
      # Handle the case where start index crosses the buffer head pointer - the data before and after the head pointer
      # belongs to completely different episodes
      if self._buffer_full():
          if 0 < (self.current_free_slot_index - start_index) % self.max_buffer_size < self.num_previous_frames_to_fetch:
              start_index = self.current_free_slot_index

where my num_previous_frames_to_fetch is your frame_history_len.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions