implicit_chain_of_thought
Why is MIXTURE_SIZE set to 1?
Hello,
Thanks for your nice repo. I noticed that MIXTURE_SIZE is set to 1 in the example command you provided.
self.mixture_components = nn.Embedding(config.mixture_size, hidden_size)
I'm curious why mixture_size isn't set to the vocabulary size?
Sorry, I just noticed this issue... The mixture approach is only used on GSM8K, not on multiplication. Multiplication's CoT is deterministic given the input by design, so the mixture approach isn't necessary there (setting the mixture size to 1 is how we disable it).
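To illustrate the point (this is a minimal sketch, not the repo's actual code; the variable names and sizes below are made up), with mixture_size > 1 the nn.Embedding table holds multiple learned component vectors that can be selected per sample, whereas with mixture_size = 1 only index 0 exists, so every sample gets the same vector and no mixing happens:

```python
import torch
import torch.nn as nn

hidden_size = 16  # hypothetical value for the sketch

# mixture_size > 1: each sampled component index picks a different
# learned vector, giving a mixture over latent reasoning variants.
mixture = nn.Embedding(num_embeddings=4, embedding_dim=hidden_size)
component_ids = torch.randint(0, 4, (8,))   # one component id per sample
mixed = mixture(component_ids)              # (8, hidden_size), varies across samples

# mixture_size == 1: only index 0 exists, so the "mixture" collapses
# to a single shared vector -- effectively the mechanism is disabled.
single = nn.Embedding(num_embeddings=1, embedding_dim=hidden_size)
collapsed = single(torch.zeros(8, dtype=torch.long))
print(torch.allclose(collapsed[0], collapsed[1]))  # True: all rows identical
```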