plopresti
**System information**
- OS Platform and Distribution (e.g., Linux Ubuntu 20.04): Alma Linux 9.3
- TensorFlow version and how it was installed (source or binary): r2.16 branch from source
-...
### Describe the bug
I am not sure this is a UCX bug. Hopefully someone can give me ideas about next steps. I have two identical servers with BCM57414 hardware....
I am running datacenter-gpu-manager-2.4.6 from https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/ My O/S is AlmaLinux 8.6, CUDA 11.7.1. GPUs are 2x A100. When I start the nvidia-dcgm service and run "dcgmi diag -r 2", the...
Fixes https://forums.developer.nvidia.com/t/290774
Remove ambiguous inherited constructor in default_quant_params.cc. GCC complains about this (https://stackoverflow.com/q/79553477/). Fix is trivial and harmless. Fixes #84977.
https://apptainer.org/docs/user/main/cgroups.html#cpu-limits says:

> --cpuset-mems specifies a list of memory nodes the container can use. It should generally be set to the same value as --cpuset-cpus.

This is not correct. --cpuset-cpus...
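As background to the excerpt above (not a reconstruction of the truncated text): in Linux cpusets, CPU lists and memory-node lists are numbered independently, so the same string generally means different things for each option. The sketch below maps a CPU list to the NUMA nodes that own those CPUs. The function names and the two-node topology are hypothetical; on a real host the per-node CPU lists would be read from `/sys/devices/system/node/node<N>/cpulist`.

```python
def parse_id_list(spec: str) -> set[int]:
    """Parse a kernel-style ID list such as "0-3,8-11" into a set of integers."""
    ids: set[int] = set()
    for part in spec.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        # A bare number like "8" has no upper bound, so reuse the lower one.
        ids.update(range(int(lo), int(hi or lo) + 1))
    return ids


def nodes_for_cpus(cpus: str, node_cpulists: dict[int, str]) -> list[int]:
    """Return the memory-node IDs whose CPU lists overlap the given CPU list."""
    wanted = parse_id_list(cpus)
    return sorted(
        node
        for node, cpulist in node_cpulists.items()
        if wanted & parse_id_list(cpulist)
    )


# Hypothetical two-socket layout (assumption, for illustration only).
EXAMPLE_TOPOLOGY = {0: "0-15", 1: "16-31"}
```

Under this assumed layout, `nodes_for_cpus("16-19", EXAMPLE_TOPOLOGY)` yields node 1, so a CPU list of `16-19` would pair with memory node `1`, not with the string `16-19`.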
```
$ ./contrib/rpm/build.sh
make: go: No such file or directory
make: go: No such file or directory
make: go: No such file or directory
```
[etc.]
This is somewhat related to #2830 and #3186. Apptainer 1.4.3, non-suid. All commands run under Rocky 8 as a non-root user. First attempt, definition file `test1.def`: ``` Bootstrap: scratch %arguments...
(Note: This is closely related to, but distinct from, #2830) Apptainer 1.4.3, non-suid. Step 1: `sudo apptainer build rocky8.sif docker://rockylinux:8` As described in #2830, running as root is currently the...