avalanche
Bug in stream-level metrics with periodic eval
Periodic eval is broken in some subtle cases. Instead of calling eval once on the entire stream, it calls eval separately for each experience, which breaks stream-level metrics.
Reproduce:
from avalanche.benchmarks import SplitMNIST
from avalanche.training import Naive
from avalanche.models import SimpleMLP
if __name__ == '__main__':
    benchmark = SplitMNIST(5)
    model = SimpleMLP()
    strat = Naive(model, None, eval_mb_size=512, eval_every=1)
    stream = benchmark.test_stream
    strat.train(stream[0], eval_streams=list(stream[:2]))
Output:
-- >> Start of training phase << --
-- >> Start of eval phase << --
-- Starting eval on experience 0 (Task 0) from test stream --
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:05<00:00, 1.08s/it]
> Eval on experience 0 (Task 0) from test stream ended.
Loss_Exp/eval_phase/test_stream/Task000/Exp000 = 2.3881
Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp000 = 0.0033
-- >> End of eval phase << --
Loss_Stream/eval_phase/test_stream/Task000 = 2.3881
Top1_Acc_Stream/eval_phase/test_stream/Task000 = 0.0033
-- >> Start of eval phase << --
-- Starting eval on experience 1 (Task 0) from test stream --
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00, 2.12it/s]
> Eval on experience 1 (Task 0) from test stream ended.
Loss_Exp/eval_phase/test_stream/Task000/Exp001 = 2.3041
Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp001 = 0.2129
-- >> End of eval phase << --
Loss_Stream/eval_phase/test_stream/Task000 = 2.3041
Top1_Acc_Stream/eval_phase/test_stream/Task000 = 0.2129
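This means eval is being called twice, once per experience, instead of once on the entire stream. As a result, each reported stream-level value (Loss_Stream, Top1_Acc_Stream) is identical to the metric of the single experience evaluated in that call, rather than an aggregate over both experiences.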
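To illustrate why one eval call per experience breaks stream-level metrics, here is a minimal toy sketch. It does not use Avalanche's code. `ToyStreamAccuracy` and `toy_eval` are hypothetical names, the (correct, total) counts are made up rather than taken from the log above, and the sketch assumes a stream-level metric resets at the start of each eval call and accumulates until the call ends.

```python
from typing import List, Tuple

# Hypothetical (correct, total) counts per experience; not taken from the log.
Experience = Tuple[int, int]


class ToyStreamAccuracy:
    """Toy stream-level accuracy, assumed to reset at the start of each eval call."""

    def __init__(self) -> None:
        self.correct = 0
        self.total = 0

    def reset(self) -> None:
        self.correct = 0
        self.total = 0

    def update(self, correct: int, total: int) -> None:
        self.correct += correct
        self.total += total

    def result(self) -> float:
        return self.correct / self.total if self.total else 0.0


def toy_eval(metric: ToyStreamAccuracy, stream: List[Experience]) -> float:
    """One eval call: reset, accumulate over every experience given, report."""
    metric.reset()
    for correct, total in stream:
        metric.update(correct, total)
    return metric.result()


stream = [(10, 100), (30, 50)]
metric = ToyStreamAccuracy()

# Expected behaviour: a single eval call over the whole stream.
whole_stream = toy_eval(metric, stream)

# Behaviour seen in the report: one eval call per experience.
per_experience = [toy_eval(metric, [exp]) for exp in stream]

print("single call on the stream:", whole_stream)
print("one call per experience:  ", per_experience)
```

In the second case the "stream" value is just each experience's own accuracy, which matches the pattern in the log where Loss_Stream and Top1_Acc_Stream repeat the Exp values.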