Parsifal

Results: 6 comments by Parsifal

I noticed that setting track_running_stats to True records the mean and variance of the new task's data, which are then applied to inference on the old task, thereby...
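The effect described in this comment can be sketched without any framework. The toy class below is an assumption for illustration only: `RunningNorm`, its `momentum` value, and the two "tasks" are hypothetical names and data, and it uses a simplified (biased) variance rather than reproducing any library's exact BatchNorm update. It only shows how running statistics updated on new-task data change the output for an old-task input.

```python
class RunningNorm:
    """Toy 1-D normalizer that keeps BatchNorm-style running statistics."""

    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = 0.0
        self.running_var = 1.0

    def train_step(self, batch):
        # Batch statistics (biased variance, a simplification).
        mean = sum(batch) / len(batch)
        var = sum((x - mean) ** 2 for x in batch) / len(batch)
        # Exponential moving average, overwritten by whatever data is seen last.
        m = self.momentum
        self.running_mean = (1 - m) * self.running_mean + m * mean
        self.running_var = (1 - m) * self.running_var + m * var

    def infer(self, x):
        # Inference uses the stored running statistics, not batch statistics.
        return (x - self.running_mean) / (self.running_var + self.eps) ** 0.5


norm = RunningNorm()

# Hypothetical old task: inputs centered at 0.
for _ in range(50):
    norm.train_step([-1.0, 0.0, 1.0])
old_output_before = norm.infer(0.0)

# Hypothetical new task: inputs centered at 10. Statistics keep updating.
for _ in range(50):
    norm.train_step([9.0, 10.0, 11.0])
old_output_after = norm.infer(0.0)

print(old_output_before, old_output_after)
```

The same old-task input is normalized very differently after the new-task updates, which is the kind of drift the comment points at.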

> I uploaded the models to Google Drive as an alternative: https://drive.google.com/drive/folders/1USEdy_7uvwO4PIqsQJq8kT0sX4H4f7nn?usp=drive_link still down, thanks for sharing

Looking forward to the training code, too!

> It seems to be a multi-task/multi-head model. A slightly more complex approach is to split the model into backbone + num_heads and dynamically select the head module at runtime. Yes, this is a...

Hello, @lix19937! I have illustrated a general convolutional layer and my multi-task convolutional layer, as shown in the figure. It is evident that the conventional convolutional layer on the...

> Ok, after some investigation I found out that, at least in my case, gradient clipping is happening, but the transformers trainer is logging the wrong thing.
>
> In transformers'...
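The quoted comment is cut off, so what exactly gets logged is not shown here. As a hedged illustration of how clipping can be active while a logged norm still looks unclipped, the sketch below implements plain global-norm clipping in pure Python. The function name `clip_grad_norm` and the sample values are hypothetical. Like PyTorch's `torch.nn.utils.clip_grad_norm_`, it returns the total norm measured before scaling, so logging that return value reports the pre-clip norm.

```python
import math


def clip_grad_norm(grads, max_norm, eps=1e-6):
    """Scale grads so their global L2 norm is at most max_norm.

    Returns (clipped_grads, total_norm), where total_norm is the norm
    measured BEFORE clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Only scale down, never up.
    scale = min(1.0, max_norm / (total_norm + eps))
    clipped = [g * scale for g in grads]
    return clipped, total_norm


grads = [3.0, 4.0]  # hypothetical gradient values
clipped, pre_clip_norm = clip_grad_norm(grads, max_norm=1.0)
post_clip_norm = math.sqrt(sum(g * g for g in clipped))

print("pre-clip norm:", pre_clip_norm)
print("post-clip norm:", post_clip_norm)
```

If a training loop logs `pre_clip_norm`, the logged value can exceed `max_norm` even though the gradients actually applied were clipped.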