Karim Foda
Hey @Narsil. I've managed to get this working for greedy decoding and multinomial sampling. For beam search, what would be the best approach to dealing with a `stop_sequence`? I've assumed that...
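For concreteness, here's roughly the shape of what I have working for the greedy/sampling case (a minimal sketch only; the class name `StopSequenceCriteria` and the decode-and-compare logic are illustrative, not the final code):

```python
import torch
from transformers import StoppingCriteria


class StopSequenceCriteria(StoppingCriteria):
    """Illustrative sketch: halt generation once the decoded text ends
    with the stop sequence. Note it only looks at the first sequence in
    the batch, which is exactly what breaks down for beam search."""

    def __init__(self, stop_sequence: str, tokenizer):
        self.stop_sequence = stop_sequence
        self.tokenizer = tokenizer

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        decoded = self.tokenizer.decode(input_ids[0], skip_special_tokens=True)
        return decoded.endswith(self.stop_sequence)
```

The `input_ids[0]` assumption is the part I'm unsure how to generalise when multiple beams finish at different times.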
Thanks @Narsil @gante. Okay, so for the sake of deploying iteratively, I've removed the `eos_token_id` from the `StoppingCriteria` and will add it back in a separate PR. I've added a test...
> We should implement `stop_sequence` only once (probably in `generate`) but we could have 2 tests if you want to test the full pipeline too. (Probably in `tests/pipelines/test_pipelines_text_generation.py` for instance.)...
No problem, I've just moved the `stop_sequence` logic back to the pipeline function and added the tests you requested in the `tests/pipelines/test_pipelines_text_generation.py` file. This should make this PR ready for review...
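For reference, the pipeline-level behaviour is roughly along these lines (a sketch under the assumption that the stop sequence is tokenized and its first token forwarded to `generate` as `eos_token_id`; the model choice and warning wording are illustrative):

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Sketch: reduce a stop sequence to a single eos_token_id for generate().
stop_sequence = "\n"
stop_ids = generator.tokenizer.encode(stop_sequence, add_special_tokens=False)
if len(stop_ids) > 1:
    # generate() takes a single eos_token_id here, so we can only honour
    # the first token of a multi-token stop sequence.
    print("Warning: stop_sequence has more than one token; using the first.")

output = generator("Hello, my name is", eos_token_id=stop_ids[0])
print(output)
```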
Hey @sanchit-gandhi. Sorry this is taking so long. Adding your changes was relatively easy, but I'm a bit stuck trying to get a few failing tests to pass, which I believe are...
Of course, that makes sense. Apologies for the misunderstanding. I'll work on the gradient checkpointing part using your suggestions and remove the `key_value_states` changes for this PR.
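For anyone following along, the user-facing switch is the standard one; this PR only touches the internals behind it (shown here with the public `google/long-t5-local-base` checkpoint as an example):

```python
from transformers import LongT5ForConditionalGeneration

model = LongT5ForConditionalGeneration.from_pretrained("google/long-t5-local-base")

# Trade compute for memory: recompute each block's forward pass during
# backprop instead of storing all intermediate activations.
model.gradient_checkpointing_enable()
```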
Hey @sanchit-gandhi. I believe this is now ready for review. The PR passes all the tests except the ones related to inconsistencies between `t5` and `long_t5`. If you're happy with this...
Thanks @sanchit-gandhi for all the helpful comments. I've addressed them all and run `make fix-copies`, so hopefully these changes are now reflected properly for `LongT5` as well.
Amazing, will keep you posted. Thanks for all the help getting this merged!
Hi @gante. Apologies, I don't think I properly clarified the use case I think this could solve. I unfortunately can't share my model or my dataset here (but happy to...