
evaluator.py (any pytorch version?)

forever208 opened this issue 10 months ago · 8 comments

Does anyone know if there is a PyTorch version of evaluator.py?

Alternatively, are there other code resources to compute IS, precision and recall?
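For context, the precision and recall reported by evaluator.py are the k-NN "improved precision/recall" metrics of Kynkäänniemi et al. (2019), computed on Inception-v3 pool features. A minimal PyTorch sketch of that computation is below; the function names (`pairwise_distances`, `manifold_radii`, `coverage`, `precision_recall`) are illustrative and not taken from the original evaluator.

```python
import torch


def pairwise_distances(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # Euclidean distance between every row of a and every row of b.
    return torch.cdist(a, b)


def manifold_radii(feats: torch.Tensor, k: int = 3) -> torch.Tensor:
    # Radius of each point = distance to its k-th nearest neighbour
    # within the same set; the zero self-distance is skipped by taking
    # the (k+1)-th smallest value. The reference implementation
    # typically uses k = 3.
    d = pairwise_distances(feats, feats)
    return d.kthvalue(k + 1, dim=1).values


def coverage(candidates: torch.Tensor, reference: torch.Tensor,
             ref_radii: torch.Tensor) -> float:
    # Fraction of candidate points that fall inside at least one
    # reference hypersphere.
    d = pairwise_distances(candidates, reference)        # [Nc, Nr]
    inside = (d <= ref_radii.unsqueeze(0)).any(dim=1)    # [Nc]
    return inside.float().mean().item()


def precision_recall(real_feats: torch.Tensor, fake_feats: torch.Tensor, k: int = 3):
    # Precision: generated samples covered by the real manifold.
    # Recall: real samples covered by the generated manifold.
    # Note: for tens of thousands of samples the full distance matrix is
    # large; a practical implementation computes it in chunks.
    precision = coverage(fake_feats, real_feats, manifold_radii(real_feats, k))
    recall = coverage(real_feats, fake_feats, manifold_radii(fake_feats, k))
    return precision, recall
```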

forever208 · Mar 28 '25

Hello, I am converting the evaluation metric code for ADM into PyTorch format.

lavinal712/ADM-evaluation-suite-pytorch

$ python evaluator.py VIRTUAL_imagenet256_labeled.npz admnet_guided_upsampled_imagenet256.npz
...
computing reference batch activations...
computing/reading reference batch statistics...
computing sample batch activations...
computing/reading sample batch statistics...
Computing evaluations...
Inception Score: 1.100562334060669
FID: 3.943349109375731
sFID: 299.1543291933718
Precision: 0.82588
Recall: 0.5282

As you can see, there are issues with the sFID and Inception Score in these results. I am not very familiar with the internals of TensorFlow and Inception v3 myself. If you are interested, you are welcome to review this code and suggest fixes.
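For context, an Inception Score of about 1.1 on ImageNet-scale samples is essentially the degenerate value, which usually means the class probabilities fed into the IS formula are off (for example, raw logits used where softmax outputs are expected). Below is a minimal sketch of the textbook IS computation from per-image softmax outputs; it is not the code from the port above, and the name `inception_score` is illustrative.

```python
import numpy as np


def inception_score(probs: np.ndarray, splits: int = 10) -> float:
    # probs: Inception-v3 softmax outputs for N generated images, shape [N, 1000].
    # IS = exp( E_x [ KL( p(y|x) || p(y) ) ] ), averaged over `splits` chunks.
    scores = []
    for chunk in np.array_split(probs, splits):
        p_y = chunk.mean(axis=0, keepdims=True)                      # marginal p(y)
        kl = chunk * (np.log(chunk + 1e-10) - np.log(p_y + 1e-10))   # KL(p(y|x) || p(y))
        scores.append(np.exp(kl.sum(axis=1).mean()))                 # exp(E_x[KL])
    return float(np.mean(scores))
```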

lavinal712 · Apr 12 '25

I have the same issue as you: with the VAR checkpoint, the rFID = 2.70.

Inception Score: 56.86065673828125
FID: 2.70789722116308
sFID: 4.6903826389984715
Precision: 0.74194
Recall: 0.6662

sunset-clouds · May 03 '25

Upon inspection, I successfully fixed the IS calculation. However, for sFID, I dug deeper into the model and found that the features output by PyTorch and TensorFlow differ significantly. You can verify this by examining my code. @sunset-clouds
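One likely source of such a gap is the backbone itself: torchvision's ImageNet-trained inception_v3 is not the same network or checkpoint as the TF pool_3 graph the original evaluator uses. A hedged sanity check is sketched below, assuming the third-party pytorch-fid package is installed (it ships Inception weights converted from the TF FID graph); the variable names are illustrative.

```python
import torch
from pytorch_fid.inception import InceptionV3

# Final-pool (2048-d) features, extracted with weights converted from
# the TF FID graph rather than torchvision's ImageNet checkpoint.
block_idx = InceptionV3.BLOCK_INDEX_BY_DIM[2048]
model = InceptionV3([block_idx]).eval()

# x: a batch of images in [0, 1], shape [N, 3, H, W]; by default the
# model resizes to 299x299 and rescales to [-1, 1] internally.
x = torch.rand(4, 3, 256, 256)
with torch.no_grad():
    feats = model(x)[0].squeeze(-1).squeeze(-1)    # [N, 2048]

# Running the same .npz batch through this extractor and through the
# TF evaluator's pool_3, then comparing the activations directly,
# isolates whether the mismatch is in the features or in the metric code.
```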

lavinal712 · May 03 '25

@lavinal712 I think we can leave out sFID if necessary, since it is not often used in the literature.

forever208 · May 03 '25

@forever208 I agree. Before encountering sFID, I had a slight tendency towards OCD, but afterwards I learned to accept imperfection.

lavinal712 · May 03 '25

Now the rFID of VAR is okay, as follows:

Inception Score: 56.86065673828125
FID: 0.9192902702932315
sFID: 4.043749574913136
Precision: 0.99572
Recall: 0.9992

sunset-clouds · May 08 '25

> Now the rFID of VAR is okay, as follows: Inception Score: 56.86065673828125, FID: 0.9192902702932315, sFID: 4.043749574913136, Precision: 0.99572, Recall: 0.9992

Are you still using ADM for evaluation now?

lavinal712 · May 09 '25

> Hello, I am converting the evaluation metric code for ADM into PyTorch format.
>
> lavinal712/ADM-evaluation-suite-pytorch
>
> [...] As you can see, there are issues with the sFID and Inception Score in these results.

The bug has been fixed, and this code can now fully replace the original version.

lavinal712 · Jul 12 '25