Franz Louis Cesista
typos
Hello! There are typos at lines 59 and 114. I'm not sure if I'm right about the latter, though. I don't know how attention mechanisms work -- I just followed...
This speeds up the `attention_forward_kernel2` kernel by replacing the implementation with a minimal Flash Attention 2 kernel, as found in https://github.com/leloykun/flash-hyperbolic-attention-minimal/blob/main/flash_attention_2.cu. Benchmark results on an A100 (80GB): Attention...
The utility repeater is stuck here: ``` Repeater Tool for Russian AI Cup By Russian AI Cup Team [Mon Dec 31 12:58:32 PHT 2018]: Repeater has been started [token=e1fab727820078842783c6eb76f15c9670b389b2_1] [....]...
Pydantic's `.model_json_schema()` and `get_schema_from_signature` don't actually mark optional fields/arguments as optional in the JSON schema. This forces the model to output the keys even when the values are just `null`, slowing down...
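A minimal sketch of the behavior, assuming Pydantic v2 and a toy model (names are illustrative): a nullable field without a default still lands under `"required"` in the generated schema, so a constrained decoder has to emit the key with an explicit `null`.

```python
from typing import Optional

from pydantic import BaseModel


class User(BaseModel):
    name: str
    # Nullable, but with no default it is still a *required* key
    # in the schema Pydantic generates.
    nickname: Optional[str]


schema = User.model_json_schema()
print(schema["required"])
# ['name', 'nickname'] -- so constrained generation must still emit
# '"nickname": null' even when there is nothing to say.
```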
This PR auto-applies chat templates by default when using instruct/chat models. It doesn't support LlamaCPP for now, though (see the sketch below for what "applying the chat template" looks like). --- ### Why? Instruct/chat models tend to be annoyingly template-dependent (i.e., they...
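A minimal sketch of the idea, using Hugging Face `transformers`' `apply_chat_template` (the model checkpoint is only illustrative, and this is not necessarily the exact code path the PR touches):

```python
from transformers import AutoTokenizer

# Illustrative instruct model; any chat-templated checkpoint works.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

messages = [{"role": "user", "content": "Give me a haiku about GPUs."}]

# Instead of hand-crafting "[INST] ... [/INST]"-style strings, let the
# tokenizer render the model's own template before generation.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```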
### Feature request This is a tracker issue for work on _interleaved_ in-and-out image-text generation. There are now at least 4 open-source models that can do _interleaved_ image-text generation, and many more...
# What does this PR do? Fix regression on `Processor.save_pretrained` caused by https://github.com/huggingface/transformers/pull/31691 tl;dr: a month ago, we made a change that removed `"chat_template"` from `processor_dict` when saving a processor....
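A minimal sketch of the round-trip this fix is about, assuming a multimodal processor that ships a chat template (the checkpoint name is only illustrative):

```python
from transformers import AutoProcessor

# Illustrative checkpoint; any processor that carries a chat_template works.
processor = AutoProcessor.from_pretrained("llava-hf/llava-1.5-7b-hf")

processor.save_pretrained("./saved_processor")

# With the regression, the reloaded processor lost its chat_template;
# after this fix the template round-trips.
reloaded = AutoProcessor.from_pretrained("./saved_processor")
assert reloaded.chat_template is not None
```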
# What does this PR do? - Uniformizes kwargs for processors of audio-text models. - An extension of https://github.com/huggingface/transformers/issues/31911 - NOTE: don't review or merge until this PR is complete:...
# Description This PR implements a minimal backward pass for flash attention. I got these results on my RTX 2060 ``` === profiling manual attention (backward pass) === ... Self...
## ChangeLog * **Added a UNet-style connectivity structure to the value embeddings**. This allowed us to reduce the number of value embeddings from 12 to 6 and the total...
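A rough sketch of the idea, not the actual training code (all names here are illustrative): with UNet-style connectivity, layer `i` and its mirror layer `n_layers - 1 - i` reuse the same value-embedding table, so 12 layers only need 6 tables.

```python
import torch
import torch.nn as nn


class MirroredValueEmbeddings(nn.Module):
    """Illustrative sketch: share value-embedding tables between mirrored
    layers (layer i and layer n_layers - 1 - i), UNet style."""

    def __init__(self, n_layers: int, vocab_size: int, dim: int):
        super().__init__()
        assert n_layers % 2 == 0
        # Only n_layers // 2 distinct tables instead of one per layer.
        self.tables = nn.ModuleList(
            nn.Embedding(vocab_size, dim) for _ in range(n_layers // 2)
        )
        self.n_layers = n_layers

    def forward(self, token_ids: torch.Tensor, layer_idx: int) -> torch.Tensor:
        # Mirror: layers 0 and n-1 share table 0, layers 1 and n-2 share table 1, ...
        mirrored = min(layer_idx, self.n_layers - 1 - layer_idx)
        return self.tables[mirrored](token_ids)


# 12 transformer layers, but only 6 value-embedding tables are allocated.
ve = MirroredValueEmbeddings(n_layers=12, vocab_size=50304, dim=768)
print(len(ve.tables))  # 6
```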