Ray Xu

9 comments of Ray Xu

Have you found that the decoder outputs have one fewer time step? Strange.

I have implemented beam search in my fork. It operates similarly to this transformer, with some modifications.

My code is currently under adjustment, so I can give you a previous version of the beam search. This version is slow and simple, but it still works.

```python
def beam_search(x, sess, g, batch_size=hp.batch_size):
    inputs...
```
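For reference, here is a minimal, self-contained sketch of the beam-search idea being discussed. This is not the fork's actual code; `step_fn`, `vocab_size`, and the other names are illustrative assumptions, and a real decoder would stop at an end-of-sequence token rather than a fixed length.

```python
import numpy as np

def beam_search(step_fn, vocab_size, max_len, beam_size):
    """Toy beam search.

    step_fn(prefix) returns a log-probability vector over the vocabulary
    for the next token, given the prefix decoded so far.
    """
    beams = [([], 0.0)]  # list of (token sequence, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            log_probs = step_fn(seq)
            for tok in range(vocab_size):
                candidates.append((seq + [tok], score + float(log_probs[tok])))
        # keep only the beam_size highest-scoring prefixes
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams[0][0]  # best-scoring sequence
```

With `beam_size=1` this degenerates to greedy argmax decoding, which is why the repository's code can share one path for training and greedy inference.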

Yes:

```python
if is_training or hp.beam_size == 1:
    self.preds = tf.to_int32(tf.argmax(self.logits, axis=-1))
    self.istarget = tf.to_float(tf.not_equal(self.y, 0))
    self.acc = tf.reduce_sum(
        tf.to_float(tf.equal(self.preds, self.y)) * self.istarget
    ) / tf.reduce_sum(self.istarget)
else:
    print('[WARNING] beam search enabled')
    assert...
```
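The masked accuracy in the snippet above can be sketched in plain NumPy (an illustrative rewrite, assuming `0` is the padding id, as in the TensorFlow code): padding positions are excluded so they neither help nor hurt the score.

```python
import numpy as np

def masked_accuracy(preds, y, pad_id=0):
    """Token accuracy over non-padding positions only."""
    istarget = (y != pad_id).astype(np.float32)   # 1.0 where y is a real token
    correct = (preds == y).astype(np.float32) * istarget
    return correct.sum() / istarget.sum()
```

For example, with one correct token, one wrong token, and one padding position, the accuracy is 0.5, not 2/3.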

The Transformer uses Layer Normalization rather than Batch Normalization. Layer Normalization does not need to consider batch information; see [Layer Normalization](https://arxiv.org/pdf/1607.06450.pdf) at the end of page 2.
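The difference is just which axis the statistics are computed over. A minimal NumPy sketch (without the learned gain/bias parameters that real implementations add):

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # statistics over the feature axis (last): each example is normalized
    # independently, so the result does not depend on the rest of the batch
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)

def batch_norm(x, eps=1e-6):
    # statistics over the batch axis (first): each feature is normalized
    # using the other examples in the batch
    mean = x.mean(axis=0, keepdims=True)
    std = x.std(axis=0, keepdims=True)
    return (x - mean) / (std + eps)
```

Because `layer_norm` looks only at one example at a time, it behaves the same at batch size 1 and needs no running statistics at inference time, which is convenient for sequence models.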

> > The Transformer uses Layer Normalization rather than Batch Normalization. Layer Normalization does not need to consider batch information; see [Layer Normalization](https://arxiv.org/pdf/1607.06450.pdf) at the end of page 2.
>
> However,...

Cool! I love PyTorch.

index-tts 2.0 is coming out soon; I wonder whether it will bring a new, better solution.