Liqiang NIU comments

Results: 12 comments of Liqiang NIU

Pretraining dataset and code request

Same. Looking forward to the open-source training code and details.

how to do text-image generation

Same question!

Why is the scale used in Attention 8 (while dim_head is 64)? If dim or dim_head is changed, should scale be changed automatically?

In attend.py, line #123: sim = einsum("b h i d, b h j d -> b h i j", q, k) * self.scale
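For reference on the question above, here is a minimal sketch of the conventional scaled dot-product similarity, where the scale is derived from dim_head as dim_head ** -0.5 and therefore follows any change to dim_head automatically. It is not the library's code: the function names attention_scores and default_scale are hypothetical, and NumPy stands in for the framework's einsum. Under this convention dim_head = 64 gives 1/8 = 0.125 rather than 8; a fixed multiplier is sometimes paired with normalized queries and keys, but the thread does not confirm whether that applies here.

```python
import numpy as np


def default_scale(dim_head):
    """Conventional attention scale: 1 / sqrt(dim_head)."""
    return dim_head ** -0.5


def attention_scores(q, k, scale=None):
    """Similarity logits for q, k of shape (batch, heads, seq, dim_head).

    If no scale is passed, it is derived from the last dimension of q,
    so changing dim_head changes the scale automatically.
    """
    if scale is None:
        scale = default_scale(q.shape[-1])
    # Same contraction as the einsum quoted from attend.py.
    return np.einsum("bhid,bhjd->bhij", q, k) * scale


rng = np.random.default_rng(0)
q = rng.standard_normal((1, 2, 4, 64))  # 4 query positions, dim_head = 64
k = rng.standard_normal((1, 2, 5, 64))  # 5 key positions, dim_head = 64

sim = attention_scores(q, k)
print(sim.shape)          # (1, 2, 4, 5)
print(default_scale(64))  # 0.125
```

Passing scale=8.0 explicitly reproduces the hard-coded behaviour asked about, which makes it easy to compare the two settings side by side.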

How to disable flash_attention

@OmkarThawakar Thanks, it worked! But I got another error like this: "attn_output = torch.nn.functional.scaled_dot_product_attention( RuntimeError: The size of tensor a (10) must match the size of tensor b (19)...

Always got nan loss while token_label is False.

@EasonXiao-888 When token_label is set to False, the loss is always nan. Later I downloaded the token label datasets, but the training is still unstable (at the middle step of...

AttributeError: module 'triton.language' has no attribute 'cumsum'

Same issue with triton==2.1.0, torch==2.0.1, CUDA 11.6:

File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/mamba_ssm/modules/mamba2.py", line 176, in forward
    out = mamba_split_conv1d_scan_combined(
File "/opt/conda/lib/python3.10/site-packages/mamba_ssm/ops/triton/ssd_combined.py", line 908, in...
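A quick way to confirm this kind of report before running a model is to check whether the installed module actually exposes the attribute named in the error. The sketch below is an illustration only: module_has_attr is a hypothetical helper, not part of triton or mamba_ssm, and it makes no claim about which triton release provides cumsum.

```python
import importlib


def module_has_attr(module_name, attr):
    """Return True/False if the module imports, or None if it is not installed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers ModuleNotFoundError when the package is absent.
        return None
    return hasattr(module, attr)


if __name__ == "__main__":
    status = module_has_attr("triton.language", "cumsum")
    if status is None:
        print("triton is not installed in this environment")
    elif status:
        print("triton.language.cumsum is available")
    else:
        print("triton.language has no attribute 'cumsum'; try a different triton version")
```

Running the check in the same environment as the failing job separates a version mismatch from other causes such as an incompatible CUDA build.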

AttributeError: module 'triton.language' has no attribute 'cumsum'

With torch 2.0.1, CUDA 11.6, and triton 2.3.0, I get: Triton Error [CUDA]: device kernel image is invalid

Triviaqa metrics wrong!

llm_foundry version is 0.10.0

Liqiang NIU

Pretraining dataset and code request

LFQ loss is negative?

LFQ loss is negative?

how to do text-image generation

Why is the scale used in Attention 8 (while dim_head is 64)? If dim or dim_head is changed, should scale be changed automatically?

How to disable flash_attention

Always got nan loss while token_label is False.

AttributeError: module 'triton.language' has no attribute 'cumsum'

AttributeError: module 'triton.language' has no attribute 'cumsum'

Triviaqa metrics wrong!