
evaluator.py (any pytorch version?)

forever208 opened this issue 10 months ago · 8 comments

Does anyone know if there is a PyTorch version of evaluator.py?

Alternatively, are there other code resources to compute IS, precision and recall?
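For context, the precision and recall reported by evaluator.py are the k-NN "improved precision/recall" metrics of Kynkäänniemi et al. (2019), computed on Inception-v3 pool features. A minimal PyTorch sketch of that computation is below; the function names (`pairwise_distances`, `manifold_radii`, `coverage`, `precision_recall`) are illustrative and not taken from the original evaluator.

```python
import torch


def pairwise_distances(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # Euclidean distance between every row of a and every row of b.
    return torch.cdist(a, b)


def manifold_radii(feats: torch.Tensor, k: int = 3) -> torch.Tensor:
    # Radius of each point = distance to its k-th nearest neighbour
    # within the same set; the zero self-distance is skipped by taking
    # the (k+1)-th smallest value. The reference implementation
    # typically uses k = 3.
    d = pairwise_distances(feats, feats)
    return d.kthvalue(k + 1, dim=1).values


def coverage(candidates: torch.Tensor, reference: torch.Tensor,
             ref_radii: torch.Tensor) -> float:
    # Fraction of candidate points that fall inside at least one
    # reference hypersphere.
    d = pairwise_distances(candidates, reference)        # [Nc, Nr]
    inside = (d <= ref_radii.unsqueeze(0)).any(dim=1)    # [Nc]
    return inside.float().mean().item()


def precision_recall(real_feats: torch.Tensor, fake_feats: torch.Tensor, k: int = 3):
    # Precision: generated samples covered by the real manifold.
    # Recall: real samples covered by the generated manifold.
    # Note: for tens of thousands of samples the full distance matrix is
    # large; a practical implementation computes it in chunks.
    precision = coverage(fake_feats, real_feats, manifold_radii(real_feats, k))
    recall = coverage(real_feats, fake_feats, manifold_radii(fake_feats, k))
    return precision, recall
```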

forever208 · Mar 28 '25

Hello, I am converting the evaluation metric code for ADM into PyTorch format.

lavinal712/ADM-evaluation-suite-pytorch

$ python evaluator.py VIRTUAL_imagenet256_labeled.npz admnet_guided_upsampled_imagenet256.npz
...
computing reference batch activations...
computing/reading reference batch statistics...
computing sample batch activations...
computing/reading sample batch statistics...
Computing evaluations...
Inception Score: 1.100562334060669
FID: 3.943349109375731
sFID: 299.1543291933718
Precision: 0.82588
Recall: 0.5282

As you can see, there are issues with the sFID and Inception Score in these results. I am not very familiar with the internals of TensorFlow and Inception v3 myself. If you are interested, you are welcome to review this code and suggest fixes.
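For context, an Inception Score of about 1.1 on ImageNet-scale samples is essentially the degenerate value, which usually means the class probabilities fed into the IS formula are off (for example, raw logits used where softmax outputs are expected). Below is a minimal sketch of the textbook IS computation from per-image softmax outputs; it is not the code from the port above, and the name `inception_score` is illustrative.

```python
import numpy as np


def inception_score(probs: np.ndarray, splits: int = 10) -> float:
    # probs: Inception-v3 softmax outputs for N generated images, shape [N, 1000].
    # IS = exp( E_x [ KL( p(y|x) || p(y) ) ] ), averaged over `splits` chunks.
    scores = []
    for chunk in np.array_split(probs, splits):
        p_y = chunk.mean(axis=0, keepdims=True)                      # marginal p(y)
        kl = chunk * (np.log(chunk + 1e-10) - np.log(p_y + 1e-10))   # KL(p(y|x) || p(y))
        scores.append(np.exp(kl.sum(axis=1).mean()))                 # exp(E_x[KL])
    return float(np.mean(scores))
```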

lavinal712 · Apr 12 '25

I have the same issue as you: with the VAR checkpoint, the rFID = 2.70.

Inception Score: 56.86065673828125
FID: 2.70789722116308
sFID: 4.6903826389984715
Precision: 0.74194
Recall: 0.6662

sunset-clouds · May 03 '25

Upon inspection, I successfully fixed the IS calculation. However, for sFID, I dug deeper into the model and found that the features output by PyTorch and TensorFlow differ significantly. You can verify this by examining my code. @sunset-clouds
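One likely source of such a gap is the backbone itself: torchvision's ImageNet-trained inception_v3 is not the same network or checkpoint as the TF pool_3 graph the original evaluator uses. A hedged sanity check is sketched below, assuming the third-party pytorch-fid package is installed (it ships Inception weights converted from the TF FID graph); the variable names are illustrative.

```python
import torch
from pytorch_fid.inception import InceptionV3

# Final-pool (2048-d) features, extracted with weights converted from
# the TF FID graph rather than torchvision's ImageNet checkpoint.
block_idx = InceptionV3.BLOCK_INDEX_BY_DIM[2048]
model = InceptionV3([block_idx]).eval()

# x: a batch of images in [0, 1], shape [N, 3, H, W]; by default the
# model resizes to 299x299 and rescales to [-1, 1] internally.
x = torch.rand(4, 3, 256, 256)
with torch.no_grad():
    feats = model(x)[0].squeeze(-1).squeeze(-1)    # [N, 2048]

# Running the same .npz batch through this extractor and through the
# TF evaluator's pool_3, then comparing the activations directly,
# isolates whether the mismatch is in the features or in the metric code.
```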

lavinal712 · May 03 '25

@lavinal712 I think we can leave out sFID if necessary, since it is not often used in the literature.

forever208 · May 03 '25

@forever208 I agree. Before encountering sFID, I had a slight tendency towards OCD, but afterwards I learned to accept imperfection.

lavinal712 · May 03 '25

Now the rFID of VAR is okay, as follows:

Inception Score: 56.86065673828125
FID: 0.9192902702932315
sFID: 4.043749574913136
Precision: 0.99572
Recall: 0.9992

sunset-clouds · May 08 '25

> Now the rFID of VAR is okay, as follows: Inception Score: 56.86065673828125, FID: 0.9192902702932315, sFID: 4.043749574913136, Precision: 0.99572, Recall: 0.9992

Are you still using ADM for evaluation now?

lavinal712 · May 09 '25

> Hello, I am converting the evaluation metric code for ADM into PyTorch format.
>
> lavinal712/ADM-evaluation-suite-pytorch
>
> [...] As you can see, there are issues with the sFID and Inception Score in these results.

The bug has been fixed, and this code can now fully replace the original version.

lavinal712 · Jul 12 '25