liddle rain

Results 8 issues of liddle rain

#### Description I know that the time-series method could be reversed, but I can't find any way to reverse it. Is there any difficulty?

When I run a multi-output regression task, it says that ![image](https://user-images.githubusercontent.com/57928993/142996085-7279dd76-af6c-4535-b44a-f0af11ac782e.png) but I think the shapes match.

![image](https://github.com/umitkaanusta/reddit-detective/assets/57928993/f33b082d-5939-4825-89cd-deb0e2950ffe) New edition expression

`The models are trained with a batch size of 32000 tokens on 8 Tesla V100 GPUs.` The paper provides this information. Is it possible to know...

## What do these changes do? Make the code support the Where operation

feature

Some existing work has already shown that when we fine-tune on a task, the embedding space becomes a cluster. So if we have different information attention, could we use supervised task...

Does it still work now? I tried, but it didn't.