
Regarding the class att_rnn #6

Open
@infected4098

Description


Hi, this is Yong Joon Lee. I am implementing a LAS model based on your code. I know you might not remember the details, since you wrote it about three years ago, but I think the class att_rnn may have a small mistake in the order of its operations. In the call part of att_rnn, s is assigned twice in a row before c, the attention context, is computed.

Your ordering is as follows:

s       = self.rnn(inputs = inputs, states = states) # s = m_{t}, [m_{t}, c_{t}] #m is memory(hidden) and c is carry(cell)
s       = self.rnn2(inputs=s[0], states = s[1])[1] # s = m_{t+1}, c_{t+1}
c       = self.attention_context([s[0], h])

But shouldn't it be as follows?

s       = self.rnn(inputs = inputs, states = states) # s = m_{t}, [m_{t}, c_{t}]
c       = self.attention_context([s[0], h]) 
s       = self.rnn2(inputs=s[0], states = s[1])[1] # s = m_{t+1}, c_{t+1}

As the original paper suggests, the attention context vector at timestep t is obtained by applying attention to s_t and h, where h is the output of the pBLSTM. With your current ordering, I think the attention context vector is instead derived from s_{t+1} and h. Thank you for your great work.
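
To make the suggestion concrete, here is a rough sketch of a single decoder step with the proposed ordering. The layer names, the simple dot-product attention, and the assumption that the speller state and listener features share the same dimension are all placeholders for illustration, not your actual implementation:

import tensorflow as tf

class AttentionContext(tf.keras.layers.Layer):
    # Simple dot-product attention: score s_t against every listener frame in h.
    def call(self, inputs):
        s, h = inputs                             # s: (B, D), h: (B, T, D); same D assumed
        scores = tf.einsum("bd,btd->bt", s, h)    # (B, T) alignment scores
        alpha = tf.nn.softmax(scores, axis=-1)    # attention weights over listener timesteps
        return tf.einsum("bt,btd->bd", alpha, h)  # context c_t: (B, D)

class AttRNNStep(tf.keras.layers.Layer):
    # One decoder step wired in the order proposed above.
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.rnn = tf.keras.layers.LSTMCell(units)
        self.rnn2 = tf.keras.layers.LSTMCell(units)
        self.attention_context = AttentionContext()

    def call(self, inputs, states, h):
        s = self.rnn(inputs=inputs, states=states)   # s = (m_t, [m_t, c_t])
        c = self.attention_context([s[0], h])        # context comes from s_t and h ...
        s = self.rnn2(inputs=s[0], states=s[1])[1]   # ... before the state advances to [m_{t+1}, c_{t+1}]
        return c, s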
