
Confused about the usage of BLOCK_FLAG

Open wzsxxa opened this issue 3 years ago • 1 comment

I'm trying to read the code, and 'BLOCK_FLAG' really confuses me. I think it is used to calculate the length of the prompt, but I'm not sure whether the values in 'BLOCK_FLAG' are right. Take BoolQPVP as an example: apart from 'passage', 'question', and 'self.mask', which are not part of the template, I think the remaining words in PATTERN should be counted. So I think BLOCK_FLAG should be

BLOCK_FLAG = [0, 1, 1, 1, 0, 1, 0, 1]

instead of

BLOCK_FLAG = [0, 0, 1, 0, 0, 0, 0, 0]

Is there something I misunderstood?

class BoolQPVP(PVP):

    VERBALIZER = {
        "False": ["No"],
        "True": ["Yes"]
    }
    """
    VERBALIZER_B = {
        "False": ["false"],
        "True": ["true"]
    }
    """

    PATTERN = ['passage', '.', 'the', ' Question: ',
               'question', '? Answer: ', 'self.mask', '.']

    BLOCK_FLAG = [0, 0, 1, 0, 0, 0, 0, 0]

    def get_parts(self, example: InputExample) -> FilledPattern:
        passage = self.shortenable(example.text_a)
        question = self.shortenable(example.text_b)

        # searched patterns in fully-supervised learning
        # string_list_a = [passage, '.', 'the', 'Question:', question, '?', 'the', 'Answer:', self.mask]
        # string_list_a = [passage, '.', 'the', question, '?', 'the', self.mask]
        # string_list_a = [passage, 'the', question, '?', 'the', self.mask]

        # few-shot
        if self.pattern_id == 1:

            string_list_a = [passage, '.', 'the', ' Question: ',
                             question, '? Answer: ', self.mask, '.']
            string_list_b = []
            block_flag_a = self.BLOCK_FLAG
            block_flag_b = []
            assert len(string_list_a) == len(block_flag_a)
            assert len(string_list_b) == len(block_flag_b)
            return string_list_a, string_list_b, block_flag_a, block_flag_b

        else:
            raise ValueError("unknown pattern_id.")

wzsxxa, Sep 23 '22 07:09

Thanks for your attention! Here block_flag == 1 marks replaceable tokens and 0 marks unchanged tokens. In the method from our paper, as well as the P-Tuning method integrated in our code, we replace the tokens flagged '1' with special token representations and keep the rest of the input unchanged.
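
To make the flag semantics above concrete, here is a minimal sketch that only lists which PATTERN entries each BLOCK_FLAG marks as replaceable. The helper `split_by_block_flag` is hypothetical and not part of the repository, and the sketch does not reproduce the actual replacement with special token representations; it only compares the flags from the code with the ones proposed in the question.

```python
from typing import List, Tuple

# Pattern and flags copied from BoolQPVP as quoted in this issue.
PATTERN = ['passage', '.', 'the', ' Question: ',
           'question', '? Answer: ', 'self.mask', '.']
BLOCK_FLAG = [0, 0, 1, 0, 0, 0, 0, 0]

# Flags proposed in the question, kept here only for comparison.
PROPOSED_FLAG = [0, 1, 1, 1, 0, 1, 0, 1]


def split_by_block_flag(parts: List[str],
                        flags: List[int]) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: split parts into (replaceable, unchanged).

    A flag of 1 marks a replaceable part, 0 marks a part kept as-is.
    """
    if len(parts) != len(flags):
        raise ValueError("parts and flags must have the same length")
    replaceable = [p for p, f in zip(parts, flags) if f == 1]
    unchanged = [p for p, f in zip(parts, flags) if f == 0]
    return replaceable, unchanged


replaceable, unchanged = split_by_block_flag(PATTERN, BLOCK_FLAG)
print(replaceable)  # ['the']

proposed_replaceable, _ = split_by_block_flag(PATTERN, PROPOSED_FLAG)
print(proposed_replaceable)  # ['.', 'the', ' Question: ', '? Answer: ', '.']
```

Under this reading, the original BLOCK_FLAG flags only 'the' as replaceable, whereas the proposed flags would also mark the punctuation and the ' Question: ' and '? Answer: ' markers as replaceable.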

Riroaki, Sep 23 '22 11:09