liddle rain
#### Description
I know that the time-series method could be reversed, but I can't find any way to reverse it. Is there any difficulty?
When I run a multi-output regression task, it says that  but I think the shapes match.
 New edition expression
`The models are trained with a batch size of 32000 tokens on 8 Tesla V100 GPUs.` The paper provides this information. Is it possible to know...
## What do these changes do? Make the code support the Where operation
Existing work has already shown that when we fine-tune on a task, the embedding space becomes clustered. So if we have attention to different information, could we use a supervised task...
Does it still work now? I tried it, but it didn't work.
Where is the dataset?