[Some questions about implementation]  #4

Open
@junmokane

Description

Hi, I'm Junmo Cho.

I've read the paper, which was very interesting. Sorry to take your time, but while running the code I ran into a few questions.

  1. Does the minus of the binary_cross_entropy between img and pred_img come from modeling the reward distribution as a Bernoulli? My understanding is that each pixel y in img (the ground-truth target, valued 0 or 1) is scored under Ber(y|pi) = pi^y * (1-pi)^(1-y), where pi is the corresponding pixel of pred_img. Please correct me if this is wrong.
  2. Why do we divide the logprobs and the reward by steps (the length of the GFN generation sequence, 16 here) when computing the TB loss? I thought logprobs is itself the log of the product of P_F(s_i | s_{i-1}) from i=1 to n, as in the paper.
  3. Also, why is there no backward-policy term in the TB loss? Are we assuming the backward policy is uniform and absorbing it into logZ?
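For concreteness, here is a minimal numeric sketch of what I think is going on (function names and shapes are mine, not from this repo): the negative binary cross-entropy summed over pixels is exactly a Bernoulli log-likelihood, and the trajectory balance loss with a uniform backward policy would be:

```python
import numpy as np

def bernoulli_log_reward(img, pred_img, eps=1e-8):
    """Sum over pixels of log Ber(y | pi) = y*log(pi) + (1-y)*log(1-pi).

    This equals minus the (sum-reduced) binary cross-entropy between
    img (ground-truth, 0/1) and pred_img (predicted probabilities).
    """
    return np.sum(img * np.log(pred_img + eps)
                  + (1 - img) * np.log(1 - pred_img + eps))

def tb_loss(log_z, logprobs_f, log_reward, logprobs_b=0.0):
    """Trajectory balance: (logZ + sum_i log P_F - log R - sum_i log P_B)^2.

    If P_B is uniform, sum_i log P_B is a trajectory-independent constant
    (for fixed-length trajectories), so it can be folded into logZ and
    dropped from the loss, which may be why no explicit term appears.
    """
    return (log_z + logprobs_f - log_reward - logprobs_b) ** 2
```

With this reading, dividing both logprobs and log_reward by the same number of steps rescales the loss but is not part of the TB objective as stated in the paper, hence my question 2.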

I would be grateful for any answers. Thanks!
