FAMO
Question on intuition behind choice of -21 for min_losses
Hello,
In the toy example here, you set self.min_losses = torch.Tensor([-21 * scale, -21]), where scale = 1e-1 by default. If I understand correctly, this is meant to replicate a scenario in which the task losses have a scale imbalance during training, by always multiplying the first task's loss by the scale parameter.
However, I was wondering why min_losses was set to [-21 * scale, -21]. Why -21, as opposed to a positive number or just 1? Is it because of the toy problem's loss function, which, from glancing at the code,
f1_sq = ((-x1 + 7).pow(2) + 0.1 * (-x2 - 8).pow(2)) / 10 - 20
potentially calculates a negative loss value, or is there a deeper reason for it?
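To make the question concrete, here is a minimal sketch of what I mean. It assumes the quoted term is evaluated in isolation (the full toy loss in the repo may combine it with other terms I have not reproduced here), and it rewrites the expression in plain Python with `**` instead of `.pow(2)`, so the helper name `f1_sq` as a function and the variables `at_min` and `away` are my own, not from the repo.

```python
# Plain-Python rewrite of the quoted term only; not the repo's full toy loss.
def f1_sq(x1: float, x2: float) -> float:
    return ((-x1 + 7) ** 2 + 0.1 * (-x2 - 8) ** 2) / 10 - 20


# Both squared terms vanish at (x1, x2) = (7, -8), so only the "- 20" offset remains.
at_min = f1_sq(7.0, -8.0)

# Anywhere else the squared terms add a positive amount on top of -20.
away = f1_sq(0.0, 0.0)

# Values from the toy example, with the default scale.
scale = 1e-1
min_losses = [-21 * scale, -21]

print(at_min)                   # lower bound of this term
print(away > at_min)            # other points sit above the lower bound
print(min_losses[1] < at_min)   # -21 lies just below that lower bound
```

If this reading is right, this term alone is bounded below by -20, so -21 would sit just under the smallest value it can take. Whether that is the actual reason for the choice is what I am asking.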
Thank you!