Dan Fu

103 comments by Dan Fu

Just popping in quickly here, Flux has head dimension 128 :) On Fri, Aug 23, 2024 at 10:49 PM Philip Turner ***@***.***> wrote: > Head dimensions are hard-coded. Just like...

Ah, this is because it’s breaking the actual indexing of GPU memory in the kernel - it probably breaks at 131k * 4 * 4k = 2^31. It should be...
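To make the arithmetic in that comment concrete, here is a minimal sketch of why a product of exactly 2^31 would break indexing. It assumes "131k" means 131072 and "4k" means 4096, and that the kernel computes offsets with signed 32-bit integers; the comment does not state either of these, and the helper `to_int32` is hypothetical.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit index can hold


def to_int32(n):
    """Wrap n the way two's-complement signed 32-bit arithmetic would."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n >= 2**31 else n


# Assumed reading of "131k * 4 * 4k" from the comment above.
num_elements = 131072 * 4 * 4096

print(num_elements == 2**31)       # the product is exactly 2^31
print(num_elements > INT32_MAX)    # one past the largest signed 32-bit value
print(to_int32(num_elements))      # the offset wraps around to a negative number
```

Under those assumptions, the last valid 32-bit offset is `num_elements - 1`, so the first size that fails is exactly the one quoted.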

The initialization size sets the size of the FFT. If the size is the same as the input, it will compute a circular convolution, see this, the “wrap it around”...
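A small sketch of the wrap-around effect described above, using a naive DFT from the standard library rather than the repository's kernel. The names `dft` and `fft_conv` and the parameter `fft_size` are hypothetical and only illustrate the idea: with an FFT the same length as the input, the tail of the convolution wraps onto the start, while zero-padding to twice the length gives the ordinary (non-circular) result.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; fine for a tiny illustration."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def fft_conv(u, k, fft_size):
    """Convolve u with k through a DFT of length fft_size; keep the first len(u) outputs."""
    def pad(v):
        return list(v) + [0.0] * (fft_size - len(v))

    U, K = dft(pad(u)), dft(pad(k))
    y = dft([a * b for a, b in zip(U, K)], inverse=True)
    return [round(v.real, 6) for v in y[:len(u)]]


u = [1.0, 2.0, 3.0, 4.0]   # input sequence
k = [1.0, 1.0, 0.0, 0.0]   # filter: y[t] = u[t] + u[t-1]

circular = fft_conv(u, k, fft_size=4)  # FFT size equal to the input length
linear = fft_conv(u, k, fft_size=8)    # FFT size twice the input length

print(circular)  # first output also picks up u[3], wrapped around from the end
print(linear)    # first output is just u[0]
```

The two results differ only in the first position, which is where the last input sample leaks back in when the FFT is not padded.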

These are correct. We are cleaning up the code/algorithms for the fast block FFT and the three-pass algorithm and consolidating them into a single package; this repository is focused on the architecture...

Hopefully soon! I’ve been traveling for a bit, but have some time to code again soon. On Tue, Mar 14, 2023 at 8:39 AM Lee Seung Yul ***@***.***> wrote: >...

Thanks for this PR! We'll take a look and see if it makes sense to merge in.

We have a new kernel coming out very soon that obviates that algorithm and scales up to 4M - will post here when we release it!

Ack’ing that I’ve seen this, will try to get it in this week or next! On Fri, Nov 24, 2023 at 3:07 AM FeelingFatigued ***@***.***> wrote: > I noticed the...

Looks like a bug - feel free to look through the outputs and file a PR to fix it if you have the chance. We are (slowly) working to rewrite...

That looks like a bug. That code is only used for LRA, so it might affect some of those results. I don’t believe it’s used anywhere else. On Thu, Sep...