lbann
Single-node CPU-only performance is not good.
There have been reports of slow performance with default spack builds of LBANN in CPU-only mode. @denfromufa reported it in #1443 and I was able to reproduce it with a build on my local workstation. For me, running the model_lenet_mnist.prototext model took about 75 s/epoch with 1 OMP thread and 150 s/epoch with 3 OMP threads, so adding threads made it slower rather than faster. These are CUDA-less builds with Aluminum.
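To put numbers on the thread scaling, here is a quick sketch of the usual speedup and parallel-efficiency calculation. It uses only the two timings quoted above; the function names are illustrative and not part of LBANN.

```python
def speedup(t_base: float, t_parallel: float) -> float:
    """Speedup relative to the baseline run (values below 1 mean a slowdown)."""
    return t_base / t_parallel


def efficiency(t_base: float, t_parallel: float, n_threads: int) -> float:
    """Parallel efficiency: speedup divided by the number of threads used."""
    return speedup(t_base, t_parallel) / n_threads


# Timings from the workstation runs reported above (seconds per epoch).
t_1_thread = 75.0
t_3_threads = 150.0

s = speedup(t_1_thread, t_3_threads)
e = efficiency(t_1_thread, t_3_threads, 3)

print(f"speedup with 3 threads: {s:.2f}x")
print(f"parallel efficiency:    {e:.1%}")
```

A speedup of 0.5x at 3 threads is a 2x slowdown, which points at something like thread oversubscription or contention rather than just unoptimized kernels, though that is a guess until it is profiled.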
I suspect this is just poor optimization on our side. We haven't put much effort into CNNs on CPUs.
That said, our LeNet integration test takes 5 sec/epoch on 2 Catalyst nodes. I wonder what's causing the difference.