Results 1 comments of isEdge

> also i find sometimes used gumbel-softmax(hard=false) will show a better result for train,but it is a bad set for use hard=false? if codebook is limited, will mixed token show...