
updates on benchmark

Open johnnychen94 opened this issue 6 years ago • 12 comments

It's a lovely benchmark, thanks for the information!

I noticed that this benchmark is 7 months old. It may be quite outdated, since both PyTorch and Flux have been upgraded a lot.

It would be great if you could run this benchmark again, though it might not be entirely fair to Flux until Tracker is replaced by Zygote.

Best

johnnychen94 avatar May 12 '19 15:05 johnnychen94

I have been meaning to update these benchmarks, but it would be ideal for Flux to have migrated to Zygote before running them. That being said, I can definitely run it on one of the development branches for comparison's sake and post those results.

avik-pal avatar May 12 '19 15:05 avik-pal

Can you include memory usage somehow? I want to see how good or bad switching to Flux would be for memory usage on the GPU.

I'm wondering if it would be worth porting this project https://github.com/jwyang/faster-rcnn.pytorch to Flux, in order to fit on my crappy GPU.

EMCP avatar May 16 '19 21:05 EMCP

I am not sure how to get the GPU memory usage automatically. But if you have any pointers on how to do that, feel free to submit a PR.

avik-pal avatar May 17 '19 04:05 avik-pal

I generally just take a reading at the beginning of training, and run

$ watch nvidia-smi

After 30 seconds or so, the memory has stabilized at some value, and then I am sure that it will not crash from running out of headroom. At least this is what I do in PyTorch.

I can look around for automated solutions to this, or maybe make a simple one that just parses nvidia-smi.
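
A minimal sketch of what such a parser could look like in Python, assuming `nvidia-smi` is on the PATH and reports a plain number for used memory. The helper names (`parse_memory_used`, `query_gpu_memory_used`, `peak_memory_used`) are made up for illustration and are not part of the benchmark scripts. It uses the `--query-gpu` CSV output of `nvidia-smi`, so it should also work on a headless machine:

```python
import subprocess
import time


def parse_memory_used(csv_text):
    """Parse `nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits`
    output (one number per GPU, in MiB) into a list of ints.

    Assumption: every non-empty line is a plain integer. Some GPUs may report
    a non-numeric placeholder instead, which would raise ValueError here.
    """
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]


def query_gpu_memory_used():
    """Run nvidia-smi once and return used memory per GPU in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_memory_used(result.stdout)


def peak_memory_used(samples):
    """Given a list of per-GPU readings, return the per-GPU maximum."""
    return [max(values) for values in zip(*samples)]


def sample_peak(duration_s=30.0, interval_s=1.0):
    """Poll nvidia-smi for roughly duration_s seconds and return the peak
    used memory per GPU in MiB (a scripted version of `watch nvidia-smi`)."""
    samples = []
    deadline = time.monotonic() + duration_s
    while True:
        samples.append(query_gpu_memory_used())
        if time.monotonic() >= deadline:
            break
        time.sleep(interval_s)
    return peak_memory_used(samples)
```

Calling `sample_peak()` in a second terminal while training runs would give a rough peak figure. Note that it reports memory used on the whole device, not only by the benchmarked process.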

EMCP avatar May 17 '19 05:05 EMCP

@avik-pal @EMCP nvidia-settings -q useddedicatedgpumemory might be easier

tbenst avatar May 17 '19 05:05 tbenst

Here's another vote for ResNet timings with Zygote. :D I really hope there's been progress!

staticfloat avatar Jun 13 '19 18:06 staticfloat

I am just waiting for FluxML/Zygote.jl#198 to be fixed. I have the scripts ready for the PyTorch 1.0 and Flux VGG models.

avik-pal avatar Jun 13 '19 18:06 avik-pal

@tbenst, I'm not sure what I need to do, but the command you sent did not work:

(base) e@e:~$ nvidia-settings -q useddedicatedgpumemory
Unable to init server: Could not connect: Connection refused
ERROR: The control display is undefined; please run `nvidia-settings --help` for usage information.
(base) e@e:~$ 

EMCP avatar Jul 17 '19 19:07 EMCP

@EMCP hm, looks like it needs an X server to be running, so it's not a solution for headless machines, sorry.

tbenst avatar Jul 18 '19 00:07 tbenst

It would be good to get an update on this. Also, trying out ResNet with https://github.com/dhairyagandhi96/Torch.jl might be useful.

DhairyaLGandhi avatar Feb 25 '20 06:02 DhairyaLGandhi

Interesting. I will give these a shot on the weekend.

avik-pal avatar Feb 25 '20 17:02 avik-pal

@dhairyagandhi96 I put together a quick benchmark suite for the layers in the update branch. The BSON files contain the timings.

avik-pal avatar Feb 26 '20 05:02 avik-pal