transformers `SequenceBiasLogitsProcessor` parameter cannot be specified in the the json config file

System Info

General issue

Who can help?

@zucchini-nlp @gante

[x] The official example scripts
[ ] My own modified scripts

Tasks

[X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
[ ] My own task or dataset (give details below)

Reproduction

SequenceBiasLogitsProcessor's input argument parameter sequence_bias (Dict[Tuple[int], float]) can not be specified in the generation_config.json.

This is because the dictionary with tuple as keys are not supported in json (not json serializable).

Expected behavior

We should be able to specify the sequence_bias parameter in the config file Iike any other generation config parameters.

One way to support this is by changing the input argument type form Dict[Tuple[int], float] to List[Tuple[int], float]

Aug 04 '24 23:08 atawari

@zucchini-nlp @gante please let me know if I am wrong. But if this is indeed an issue, I am happy to help fix it by changing the input argument type form Dict[Tuple[int], float] to List[[Tuple[int], float]]

Aug 05 '24 20:08 atawari

Hey @atawari !

Sorry, missed this issue yesterday! We're currently doing major refactoring of generate() (track https://github.com/huggingface/transformers/issues/30810), so we're trying to limit the number of generate-related PRs/new features. In general, I'm okey with this change, also there are some other kwargs in the config that are not serializable. But let's wait for @gante to confirm if we want a fix for that now or after the first step of refactoring is done :)

Aug 06 '24 05:08 zucchini-nlp

Hi @atawari 👋

Your proposed change makes sense :) Being able to serialize all options will become increasingly important in transformers, so this needs to be changed

A few requests if you open a PR:

make sure the old input format (Dict[Tuple[int], float]) is still accepted by the processor, to avoid breaking backwards compatibility
Add a test case to confirm we a) can serialize the new format b) we can initialize an instance of SequenceBiasLogitsProcessor using that same config, in this file
If you're feeling extra bold, in a subsequent PR, adding a test like the one I described in 2 but for all logits processors (one test per processor) would be greatly appreciated 💛

Aug 06 '24 14:08 gante

Hello! Is anybody working on this bug? If not, then I could try to fix it

Aug 26 '24 16:08 VladOS95-cyber

Since not PR is linked, feel free to try!

Aug 27 '24 13:08 ArthurZucker

@gante @ArthurZucker Hello, I am not quite sure that i understood the idea of this problem and required fix. I tried to reproduce this bug. Yes, we definetely cannot specify format Dict[Tuple[int], float] in .json file, only if we have string instead of tuple. And actually when we use GenerationConfig.save_pretrained() and GenerationConfig.from_pretrained() it converts ( {(1,): 100.0, (4,): 100.0} as example) into {"(1,)":100.0, "(4,)": 100.0 } and back successfully. I just don't see the difference with using List[[Tuple[int], float]], we still cannot specify tuple inside the list in json file, can't we? It does not matter if it is wrapped into dict or list. I am a little bit confused by that or I did not fully understand this issue.

Aug 28 '24 17:08 VladOS95-cyber

@VladOS95-cyber indeed there is no tuple serialization in json. List[List[List[int], float]] would probably be a better format! Using strings is usually not a good idea, as seemingly innocuous characters like a space can break the logic.

In other words, a list where each member is a list of integers and a float. E.g. [[[45, 67], -0.6], [[89], 1.2]] to apply a bias of -0.6 to the sequence [45, 67] and a bias of 1.2 to the token 89.

Sep 06 '24 09:09 gante

@VladOS95-cyber indeed there is no tuple serialization in json. List[List[List[int], float]] would probably be a better format! Using strings is usually not a good idea, as seemingly innocuous characters like a space can break the logic.

In other words, a list where each member is a list of integers and a float. E.g. [[[45, 67], -0.6], [[89], 1.2]] to apply a bias of -0.6 to the sequence [45, 67] and a bias of 1.2 to the token 89.

Hi @gante, thanks for the explanation! I'll see what I could do about it.

Sep 06 '24 12:09 VladOS95-cyber

was fixed by https://github.com/huggingface/transformers/pull/33375, closing

Nov 29 '24 10:11 zucchini-nlp