Bruno Casella

35 comments by Bruno Casella

Yes, I will run the experiment for 1000 epochs. In the meantime, I am rerunning the same 200-epoch experiment to check whether it was a...

Hello everyone. I have just completed the 3 runs for 1000 epochs. Besides time, I have also collected current and peak memory for each epoch using `tracemalloc`. I followed this...
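For reference, the per-epoch current/peak measurement with `tracemalloc` can be sketched as below. This is a minimal sketch, not the actual code used: `train_one_epoch` is a placeholder for the real PyTorch training step in mnist.py.

```python
import tracemalloc

def train_one_epoch():
    # Placeholder for the real training step in mnist.py;
    # allocate something so the traced numbers are non-zero.
    data = [0.0] * 100_000
    return sum(data)

tracemalloc.start()
for epoch in range(3):
    tracemalloc.reset_peak()  # restrict the peak to this epoch only (Python 3.9+)
    train_one_epoch()
    current, peak = tracemalloc.get_traced_memory()
    print(f"epoch {epoch}: current={current / 1e6:.2f} MB, peak={peak / 1e6:.2f} MB")
tracemalloc.stop()
```

Note that `tracemalloc` only traces allocations made through Python's allocator, so memory allocated by native libraries (e.g. PyTorch tensors backed by C++ buffers) may not show up, which could explain discrepancies between runs.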

> Well, looks like `tracemalloc` shows incorrect data for Gramine. Or maybe it shows incorrect data for PyTorch in general? Please check what is shown for normal training... Sure, I...

Here are the current and peak memory for normal training:

Yes @dimakuv, I already started this experiment this morning: `sudo perf record --call-graph dwarf -F 50 -e cpu-clock gramine-direct ./pytorch mnist.py`. I will update you when it...

> Also, one thing which seems suspicious to me: why the running time is so noisy without Gramine but then gets very stable with it, both direct and SGX? It...

@monavij Tomorrow I will run the experiment with `sgx.preheat_enclave = true`. However, according to the documentation, `Using this option makes sense only if the whole enclave memory fits into [EPC](https://gramine.readthedocs.io/en/stable/sgx-intro.html#term-epc)...

In my manifest there are these options:

```
loader.pal_internal_mem_size = "128M"
sgx.enclave_size = "4G"
sgx.max_threads = 32
sgx.edmm_enable = {{ 'true' if env.get('EDMM', '0') == '1' else 'false' }}
```

...

@dimakuv I was running the 1000-epoch experiment with `gramine-direct` while recording metrics with perf. However, the generated `perf.data` file grew past 120 GB... and training stopped because there was no...
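As a sanity check on why `perf.data` grows so large, here is a back-of-the-envelope estimate. The 8 KiB per-sample stack dump is perf's default for `--call-graph dwarf`, and `-F 50` gives 50 samples per second; the CPU count and run length below are assumptions for illustration, not measurements from the actual run.

```python
# Rough estimate of perf.data growth with DWARF call graphs.
sample_bytes = 8 * 1024        # default user-stack dump size for --call-graph dwarf
freq_hz = 50                   # from -F 50
cpus = 8                       # assumed number of sampled CPUs
run_seconds = 2 * 24 * 3600    # assumed ~2-day run for 1000 epochs

total_bytes = sample_bytes * freq_hz * cpus * run_seconds
print(f"~{total_bytes / 1e9:.0f} GB of raw samples")
```

Even with conservative inputs the raw stack samples reach hundreds of gigabytes, so a multi-day run easily exceeds 120 GB; lowering `-F` or shrinking the dump size with `--call-graph dwarf,<size>` would reduce the file.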

Ok @dimakuv

> 1. The performance plots for the experiment with 1000 epochs (no perf enabled)

This is already posted.

> 2. The performance analysis for the experiment with e.g....