Right now, the evaluation runs sequentially, but it should run in parallel. This needs to be implemented similarly to the file linked below:
run.py
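
For discussion, here is a minimal sketch of what parallel evaluation could look like. It is not taken from run.py, whose actual approach is not reproduced here. The names `evaluate_parallel`, `evaluate_one`, `items`, and `max_workers` are hypothetical placeholders, and the sketch assumes each item can be evaluated independently of the others.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def evaluate_parallel(
    evaluate_one: Callable[[T], R],  # hypothetical per-item evaluation function
    items: Iterable[T],              # hypothetical collection of things to evaluate
    max_workers: int = 4,            # assumed default; tune for the real workload
) -> List[R]:
    """Evaluate every item concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs,
        # regardless of which worker finishes first.
        return list(pool.map(evaluate_one, items))


if __name__ == "__main__":
    # Stand-in for a real evaluation step: square each number.
    print(evaluate_parallel(lambda x: x * x, [1, 2, 3]))
```

Threads fit best if each evaluation is I/O-bound (for example, waiting on a model server or on disk). If the evaluation is CPU-bound, `ProcessPoolExecutor` from the same module can be swapped in, provided `evaluate_one` and the items are picklable (a lambda like the one in the demo would not be).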