Hi, I'm Junmo Cho.
I've read the paper, which was pretty interesting. Sorry for taking your time, but while running the code I have some questions.
- Does the minus sign on the `binary_cross_entropy` between `img` and `pred_img` come from modeling the reward distribution as a Bernoulli distribution? I thought each pixel `y` in `img` (the ground-truth target, with value 0 or 1) is scored as Ber(y | pi) = pi^y * (1 - pi)^(1 - y), where pi is the corresponding pixel probability from `pred_img`. Please correct me if my understanding is wrong (a small sanity-check sketch is below this list).
- Another thing: why do we divide `logprobs` and the reward by `steps` (the length of the GFN generation sequence, 16 here) when calculating the TB loss? I thought `logprobs` is itself the log of the product of P_F(s_i | s_{i-1}) from i = 1 to n, as in the paper (see the comparison sketch below the list).
- Also, why is there no backward-policy term in the TB loss? Are we assuming the backward policy is uniform and absorbing it into logZ? (My guess is sketched below.)
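
To make the first question concrete, here is a minimal sanity check of my understanding (the shapes and values are made up, not taken from the repo): the negative of `binary_cross_entropy` with `reduction='none'` is exactly the element-wise Bernoulli log-likelihood.

```python
import torch
import torch.nn.functional as F

img = torch.randint(0, 2, (4, 16)).float()            # ground-truth pixels y in {0, 1}
pred_img = torch.rand(4, 16).clamp(1e-6, 1 - 1e-6)    # predicted Bernoulli means pi

# Bernoulli log-likelihood: log Ber(y | pi) = y*log(pi) + (1-y)*log(1-pi)
log_lik = img * torch.log(pred_img) + (1 - img) * torch.log(1 - pred_img)
# PyTorch's BCE is the negative of that, element-wise
neg_bce = -F.binary_cross_entropy(pred_img, img, reduction="none")

assert torch.allclose(log_lik, neg_bce, atol=1e-6)
```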
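For the second question, this is the discrepancy I mean, written out with hypothetical tensors (`logZ`, `logprobs`, `log_reward` are placeholder names, not necessarily the repo's exact variables):

```python
import torch

steps = 16                           # length of the GFN generation sequence
logZ = torch.zeros(1, requires_grad=True)
logprobs = torch.randn(8)            # sum_{i=1}^{n} log P_F(s_i | s_{i-1}), per sample
log_reward = torch.randn(8)          # log R(x), e.g. the summed negative BCE

# TB loss as I read it in the paper:
tb_paper = (logZ + logprobs - log_reward).pow(2).mean()
# What the code seems to do instead (both terms divided by steps):
tb_code = (logZ + logprobs / steps - log_reward / steps).pow(2).mean()
```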
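And for the third question, here is my guess written out (purely my own reasoning, not something I found in the paper or code):

```python
import math

# If generation follows a fixed order, each state has exactly one parent,
# so P_B(s_{i-1} | s_i) = 1 and the backward term contributes 0.
# If instead P_B is uniform over K parents at each of n steps, the term is
# a constant -n*log(K) that could be folded into logZ.
n, K = 16, 4                         # made-up numbers for illustration
sum_log_pb = -n * math.log(K)        # constant w.r.t. the forward policy's parameters
print(sum_log_pb)                    # -22.18...
```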
I would be grateful for some answers! Thanks.