Mutian He

Results: 10 comments of Mutian He

Maybe add an explanation of the arguments for running the examples? E.g., the format and specifications of train_file, word_file, and eval_file, and the meaning of num_classes. Thanks.

Try using Keras 1.2.2 rather than Keras 2.

(Simply use Keras 1.2; the problem is due to a version incompatibility.)
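A minimal sketch of the suggested environment pin, assuming pip is used to manage packages (the exact install command is not from the original comment):

```shell
# Downgrade to the last Keras 1.x release; code written against the
# Keras 1.x API generally breaks under Keras 2's renamed layers/arguments.
pip install "keras==1.2.2"
```

Pinning the exact version in a requirements file is usually preferable to an ad-hoc install, so collaborators reproduce the same environment.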

(Or do you have any number for the performance of DUAL with the view only on each segment?)

Thank you very much! Let me have a look at them.

Thank you very much! Actually, I am looking for the HuBERT units for each segment (e.g., context-0_0_1, context-0_0_2, ...), while it seems that the provided units above and in...

Ah, it is simply the standard GPT2 tokenizer on Huggingface transformers.

I might have also encountered this problem. From my tests, it happens at the 2nd transformer layer during decoding with H100 GPUs, sequence length >= 256, and negative `seqlen_offset` values...

I'm actually using the NSA kernel at the moment and hence working on fixing this... I can try to get this done in a few days. BTW @Espere-1119-Song, from what I understand it...

BTW, there are also some places outside the kernels that involve this issue, for example https://github.com/fla-org/flash-linear-attention/blob/364c199e65b3247efec2eb4b10067152bb3a8f1a/fla/layers/utils.py#L150-L161