Petr comments

Results: 13 comments by Petr

Would you like to see a MultiCategorical projection network?

I have a solution for independent `3 + 3 + 2 + 3` logits. I will prepare a pull-request when I have time.
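As a rough illustration of what "independent `3 + 3 + 2 + 3` logits" could mean, here is a minimal pure-Python sketch. It is not the solution mentioned in the comment: the names `GROUP_SIZES`, `split_logits`, `softmax`, and `sample_multi_categorical` are hypothetical, and it assumes the network emits one flat vector of 11 logits that is cut into four groups, each sampled as its own categorical distribution.

```python
import math
import random

# Assumption taken from the comment: four independent categorical heads
# with 3, 3, 2 and 3 choices, i.e. 11 logits in total.
GROUP_SIZES = (3, 3, 2, 3)


def split_logits(flat_logits, group_sizes=GROUP_SIZES):
    """Cut one flat logit vector into per-head groups."""
    if len(flat_logits) != sum(group_sizes):
        raise ValueError("expected %d logits, got %d"
                         % (sum(group_sizes), len(flat_logits)))
    groups, start = [], 0
    for size in group_sizes:
        groups.append(list(flat_logits[start:start + size]))
        start += size
    return groups


def softmax(logits):
    """Numerically stable softmax over a single group."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_multi_categorical(flat_logits, rng=random, group_sizes=GROUP_SIZES):
    """Draw one index per head; the heads are sampled independently."""
    return [
        rng.choices(range(len(group)), weights=softmax(group))[0]
        for group in split_logits(flat_logits, group_sizes)
    ]
```

Calling `sample_multi_categorical([0.0] * 11)` returns a list of four indices, one per head, each within that head's range.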

Constexpr redeclaration

I'd like to point out that it is not only `constexpr`: `explicit` and `noexcept` are also redefined.

```C++
#ifndef constexpr
#define constexpr static const
#endif
#ifndef explicit
#define explicit
#endif
#ifndef...
```

Environment resets to different seeds after calling env.seed(0)

Hi @awjuliani. Yes, I am running this binary. I also found that running in headless mode (`realtime_mode=False`) makes seeding work properly, according to `UnitySDK.log`.

trace.modname does not exist since Python 3.2

I have exactly the same problem while trying to enable "Collect run-time types information for code insight". The PyCharm version is:

```
PyCharm 2016.3.3
Build #PY-163.15188.4, built on March 10,...
```

Hardcoded block_size in kernels

For block sizes, maybe we should look into `cudaOccupancyMaxPotentialBlockSize`: https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__HIGHLEVEL.html#group__CUDART__HIGHLEVEL_1gee5334618ed4bb0871e4559a77643fc1

It runs the occupancy calculation for a given kernel function and suggests a block size that maximizes potential occupancy.

Hardcoded block_size in kernels

@rosslwheeler yes, I had exactly the same issue. Seems like the kernel uses too many registers? Reducing block size to 512 in debug mode makes the code work.

Added FlameGraphs for nsys reports and some nsys documentation

1. Moved the stack collapse script into `dev/tools`
2. Removed `pandas` from `requirements.txt`

NCCL only multi-gpu multi-node training without MPI

I thought OpenMPI supports Slurm: https://docs.open-mpi.org/en/main/launching-apps/slurm.html

Can you give some insight into why it didn't work for you?

NCCL only multi-gpu multi-node training without MPI

@chinthysl thank you! Something else we can do, instead of relying on MPI for single-node runs, is to just spawn processes via `fork()`. Then, for Slurm we could ask...
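A minimal sketch of the `fork()` idea for the single-node case, written in Python for brevity and not in the project's own language. The names `spawn_workers`, `worker`, and `rank` are hypothetical, and the sketch assumes a POSIX system. It only shows forking one process per rank and reaping the children, and it does not touch NCCL setup.

```python
import os
import sys


def spawn_workers(world_size, worker):
    """Fork one child per rank, run worker(rank) in it, and return the
    children's exit codes ordered by rank (0 means success)."""
    pids = []
    for rank in range(world_size):
        # Flush buffered output so the child does not inherit and re-emit it.
        sys.stdout.flush()
        sys.stderr.flush()
        pid = os.fork()
        if pid == 0:
            # Child process: run the worker, then exit without returning
            # into the parent's code path.
            status = 1
            try:
                worker(rank)
                status = 0
            finally:
                os._exit(status)
        pids.append(pid)  # Parent process: remember the child and continue.

    exit_codes = []
    for pid in pids:
        _, wait_status = os.waitpid(pid, 0)
        if os.WIFEXITED(wait_status):
            exit_codes.append(os.WEXITSTATUS(wait_status))
        else:
            exit_codes.append(-1)  # Child was terminated by a signal.
    return exit_codes
```

In this sketch the parent learns each rank's outcome only through its exit code, so any real data exchange between ranks would still need a separate channel.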

Allow for a connection to check if it's sending bytes

Fixed some comments. For more fixes, I'd like more input. Waiting for the socket to send all bytes is useful for pretty much all strategies.

- `split`, `tlsfrag`: makes it...