Petr
I have a solution for independent `3 + 3 + 2 + 3` logits. I will prepare a pull request when I have time.
I'd like to say that there is more than `constexpr`. `explicit` and `noexcept` are also redefined. ```C++ #ifndef constexpr #define constexpr static const #endif #ifndef explicit #define explicit #endif #ifndef...
Hi @awjuliani Yes, I run this binary. I also figured out that running in headless mode (realtime_mode=False) makes seeds work properly according to UnitySDK.log.
I have exactly the same problem while trying to enable "Collect run-time types information for code insight". The PyCharm version is: ``` PyCharm 2016.3.3 Build #PY-163.15188.4, built on March 10,...
For block sizes, maybe we should look into `cudaOccupancyMaxPotentialBlockSize`. https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__HIGHLEVEL.html#group__CUDART__HIGHLEVEL_1gee5334618ed4bb0871e4559a77643fc1 It does the occupancy calculation for a given kernel function and returns the block size that maximizes occupancy.
@rosslwheeler yes, I had exactly the same issue. Seems like the kernel uses too many registers? Reducing block size to 512 in debug mode makes the code work.
1. Moved the stack collapse script into `dev/tools`
2. Removed `pandas` from `requirements.txt`
I thought OpenMPI supports Slurm: https://docs.open-mpi.org/en/main/launching-apps/slurm.html Can you give some insight on why it didn't work for you?
@chinthysl thank you! Something else we can do, instead of relying on MPI for single-node runs, is to just spawn processes via `fork()`. Then, for Slurm we could ask...
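To make the `fork()` idea concrete, here is a minimal sketch of spawning one process per rank on a single node and waiting for all of them. The names `run_worker` and `spawn_workers` are hypothetical and not from the codebase, and the worker body is a placeholder for the real per-rank work. It assumes a POSIX system.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstdio>
#include <vector>

// Hypothetical per-rank entry point; a stand-in for the real per-process work.
static int run_worker(int rank, int world_size) {
    std::printf("worker %d of %d (pid %d)\n", rank, world_size,
                static_cast<int>(getpid()));
    std::fflush(stdout);
    return 0;  // exit status reported back to the parent
}

// Forks `world_size` children, runs run_worker() in each, and returns the
// number of children that exited normally with status 0.
int spawn_workers(int world_size) {
    std::vector<pid_t> children;
    for (int rank = 0; rank < world_size; ++rank) {
        // Flush stdio buffers so pending output is not duplicated in the child.
        std::fflush(nullptr);
        pid_t pid = fork();
        if (pid < 0) {
            std::perror("fork");
            break;  // stop spawning; still wait for children already started
        }
        if (pid == 0) {
            // Child: do the work and leave without running parent cleanup code.
            _exit(run_worker(rank, world_size));
        }
        children.push_back(pid);  // parent: remember the child to wait on
    }

    int succeeded = 0;
    for (pid_t pid : children) {
        int status = 0;
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
            WEXITSTATUS(status) == 0) {
            ++succeeded;
        }
    }
    return succeeded;
}
```

Calling `spawn_workers(4)` from `main` would start four workers and return 4 if all of them exit cleanly. One caveat if this route is taken: the fork has to happen before any CUDA initialization in the parent, since a CUDA context is not usable in a forked child.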
Fixed some comments. For more fixes, I'd like more input. Waiting for the socket to send all bytes is useful for pretty much all strategies. - `split`, `tlsfrag`: makes it...