Alexey Gerasev
Hi, thank you for your interest in this crate! Honestly, performance wasn't the main goal of the crate, and I haven't focused on it heavily. I created that...
Thank you for pointing out this possible issue. To be honest, I hadn't considered it. But is the current implementation really subject to false sharing? Let...
@crsaracco, @mgeier, thank you for your interest! And sorry for such a late answer. I agree that false sharing may be one of the issues. I've tried to fix...
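For readers unfamiliar with the false sharing discussed in the two comments above, here is a minimal sketch of the usual mitigation: placing the producer and consumer indices on separate cache lines. It is not the crate's actual code. `CachePadded`, `Indices`, and `occupied` are hypothetical names introduced for illustration, and the 64-byte line size is an assumption that holds on most x86_64 CPUs but not everywhere.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper that aligns its contents to 64 bytes
/// (an assumed cache-line size).
#[repr(align(64))]
pub struct CachePadded<T>(pub T);

/// Hypothetical index pair for a single-producer single-consumer ring buffer.
/// Each index sits on its own cache line, so a write to `tail` by the producer
/// does not invalidate the line that holds `head` for the consumer.
pub struct Indices {
    pub head: CachePadded<AtomicUsize>,
    pub tail: CachePadded<AtomicUsize>,
}

impl Indices {
    pub fn new() -> Self {
        Indices {
            head: CachePadded(AtomicUsize::new(0)),
            tail: CachePadded(AtomicUsize::new(0)),
        }
    }

    /// Number of items between `head` and `tail`, assuming free-running
    /// counters that only ever increase (with wrap-around).
    pub fn occupied(&self) -> usize {
        let tail = self.tail.0.load(Ordering::Acquire);
        let head = self.head.0.load(Ordering::Acquire);
        tail.wrapping_sub(head)
    }
}
```

The trade-off is memory: each padded index occupies a full 64 bytes instead of the size of one `usize`.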
This issue seems to have been fixed in #46 in exactly this way: > One solution would be to directly implement `give` and `take`, per the FreeRTOS API, which does...
Hi! Thank you for finding and reporting such a subtle issue! I've made a [fast fix](https://github.com/agerasev/ringbuf/commit/dc694c2fa8bbda295f8407da01250dee3703be90) by initializing destination memory before calling `read`. But this could result in a huge...
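The idea of initializing destination memory before calling `read` can be sketched as follows. This is an illustration under assumptions, not the code from the linked commit: `read_into_uninit` is a hypothetical helper name, and the destination is modeled as a plain `&mut [MaybeUninit<u8>]` slice rather than the ring buffer's internal storage.

```rust
use std::io::{self, Read};
use std::mem::MaybeUninit;

/// Hypothetical helper: zeroes `dst` first, so that `Read::read` is never
/// handed uninitialized bytes, then reads into it.
pub fn read_into_uninit<R: Read>(
    reader: &mut R,
    dst: &mut [MaybeUninit<u8>],
) -> io::Result<usize> {
    // Initialize every slot. This pass touches each byte of the destination.
    for slot in dst.iter_mut() {
        slot.write(0);
    }
    // SAFETY: every element was initialized by the loop above, and
    // `MaybeUninit<u8>` has the same size and alignment as `u8`.
    let init: &mut [u8] =
        unsafe { std::slice::from_raw_parts_mut(dst.as_mut_ptr() as *mut u8, dst.len()) };
    reader.read(init)
}
```

Only the first `n` bytes, where `n` is the returned count, hold data from the reader; the rest remain zero.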
Hi! Unfortunately, I haven't created such aliases for `LocalRb`, as it is rarely used, in my opinion. But you can write out the full type of the local producer yourself: + For...
My system: ``` $ uname -a Linux 6.1.0-13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1 (2023-09-29) x86_64 GNU/Linux ``` ``` $ nvcc --version nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2022...
> Do you know where in candle it's coming from? It can occur in many places in [candle_core::cuda_backend](https://github.com/huggingface/candle/blob/main/candle-core/src/cuda_backend.rs) where `alloc` or `htod_copy` is called. There are no checks for zero length...
> Can you print out the CudaDevice in your example? ```rust CudaDevice { cu_device: 0, cu_primary_ctx: 0x000055759b945ec0, stream: 0x0000000000000000, event: 0x000055759bc8d4f0, modules: RwLock { data: {}, poisoned: false, .. },...
I'm trying to implement some kind of [checkpointing](https://pytorch.org/docs/stable/checkpoint.html) by splitting the whole network into segments and running the forward and backward passes separately for each segment to avoid storing all activations...