Marc Masana

17 comments by Marc Masana

@btwardow I forgot to add a test for GDumb; could you add it? I'll check the comment you left and see if I can improve it.

Hi Mengya, **Q1**: I'm not sure I fully understood your question. The `model_old` should always have one head fewer than the current model, since it has no information about...
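As a toy sketch of that bookkeeping (plain lists standing in for the actual torch heads, names hypothetical): the frozen snapshot is taken before the new task's head is appended, so it always trails the current model by one head.

```python
import copy

# Toy sketch (plain tuples stand in for torch classification heads):
# model_old is frozen before the new task's head is appended, so it
# always has one head fewer than the current model.
heads = [("head_task0", 5)]        # heads after finishing task 0
model_old = copy.deepcopy(heads)   # frozen snapshot kept as reference
heads.append(("head_task1", 5))    # current model grows a head for task 1
print(len(model_old), len(heads))  # 1 2
```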

I would need more context to solve this issue. Maybe you have a fixed memory size that is smaller than the total number of classes? That could be the cause.
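A minimal sketch of why that scenario can break things, assuming a fixed exemplar budget split evenly per class (hypothetical helper, not the repository's code): integer division drops the per-class share to zero once classes outnumber memory slots.

```python
# Hypothetical sketch: a fixed exemplar memory split evenly per class yields
# zero exemplars per class once the number of classes exceeds the budget.
def exemplars_per_class(memory_size, num_classes):
    return memory_size // num_classes

print(exemplars_per_class(2000, 100))  # 20 exemplars per class
print(exemplars_per_class(200, 500))   # 0: memory smaller than number of classes
```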

Hi! Could you share the arguments that you used for your setting? That would make it easier to figure out the discrepancy. Maybe you didn't run the experiment with the...

I'm not sure I understand the question, but the accuracy when learning each task is calculated only over samples of the classes belonging to that task. During training, the...
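A minimal sketch of that task-aware evaluation (hypothetical function, pure Python for clarity): the argmax is restricted to the classes owned by the task, so logits from other tasks' heads are ignored.

```python
# Hypothetical sketch: accuracy for a task is computed only over the classes
# belonging to that task, by restricting predictions to the task's own head.
def task_accuracy(logits, targets, task_classes):
    correct = 0
    for row, target in zip(logits, targets):
        # argmax restricted to this task's classes only
        pred = max(task_classes, key=lambda c: row[c])
        correct += int(pred == target)
    return correct / len(targets)

# Two samples, four total classes; the current task owns classes 2 and 3.
logits = [[0.1, 0.9, 0.2, 0.8], [0.0, 0.5, 0.7, 0.1]]
print(task_accuracy(logits, [3, 2], [2, 3]))  # 1.0: class 1's high logit is ignored
```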

Hi @fszatkowski, I remember discussing this approach and the gradients/loss before, so maybe we missed something. A first change that has not been pushed yet into main is the...

I see, the way `.detach()` is called could indeed block the gradients from updating. I'll first try to reproduce what you propose with the `--gamma` parameter to check it out.

You are correct, that loss indeed has no effect: no gradients are updated, so changing the parameter does nothing and brings the...
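A minimal illustration of the mechanism (toy tensors, not the repository's actual code): gradients never flow through a detached tensor, so a loss term built on it contributes to the loss value but not to the update, and scaling it by a weight such as `gamma` changes nothing.

```python
import torch

# Toy sketch: a .detach() call can silence a loss term, because gradients
# do not flow through detached tensors.
w = torch.ones(3, requires_grad=True)
active = (w * 2).sum()            # contributes gradients
blocked = (w.detach() * 2).sum()  # same value, but cut off from the graph

gamma = 10.0
loss = active + gamma * blocked
loss.backward()

# Only the non-detached path produced gradients; changing gamma has no effect.
print(w.grad)  # tensor([2., 2., 2.])
```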

Hi @jmin0530, the joint training is incremental, meaning the network goes through a training session at each task, with access to all data from previous tasks....
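A toy sketch of that schedule (hypothetical helper names): one training session per task, each over the union of all data seen so far.

```python
# Toy sketch of the incremental joint baseline: one training session per
# task, each with access to all data from previous tasks as well.
def joint_incremental(tasks, train_fn):
    seen = []
    for task_data in tasks:
        seen.extend(task_data)  # accumulate previous tasks' data
        train_fn(list(seen))    # retrain over everything seen so far
    return seen

sessions = []
joint_incremental([[1, 2], [3], [4, 5]], sessions.append)
print(sessions)  # [[1, 2], [1, 2, 3], [1, 2, 3, 4, 5]]
```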