
Zero attributions for BERT-based sentence similarity model

anferico opened this issue 3 years ago · 1 comment

I have a BERT-based sentence similarity model, namely bert-base-nli-mean-tokens, which takes a <query, response> pair and outputs a cosine similarity score indicating their semantic similarity. My goal is to compute attributions for the tokens in the response part, so I followed the tutorial notebook for question answering on SQuAD and adapted it to my use case.

The problem is that when I try to compute attributions for the response tokens, all I get is zeros. For reference:

  • The baseline input is identical to the response input, except that non-special tokens are replaced with [PAD] tokens (see the sketch after this list)
  • The attention mask used for the baseline input is the same as the one used for the response input
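
A minimal sketch of how such a baseline can be built with the Hugging Face tokenizer (the construction is illustrative; variable names mirror the ones used later in this thread):

import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "sentence-transformers/bert-base-nli-mean-tokens")

encoded = tokenizer("an example response", return_tensors="pt")
response_input_ids = encoded["input_ids"]
attention_mask_for_responses = encoded["attention_mask"]

# Keep special tokens ([CLS], [SEP]) in place and replace everything
# else with [PAD], as described in the first bullet above.
special = torch.tensor(tokenizer.get_special_tokens_mask(
    response_input_ids[0].tolist(), already_has_special_tokens=True)).bool()
baseline_input_ids = torch.where(
    special, response_input_ids[0],
    torch.tensor(tokenizer.pad_token_id)).unsqueeze(0)

# Per the second bullet, the baseline reuses attention_mask_for_responses.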

Does anybody have an idea of why this might happen? Here's the notebook I wrote: https://colab.research.google.com/drive/1XB37s3hC8Ugr9ZeaFkMZ3L2AJIlQuTF1?usp=sharing

anferico avatar Feb 22 '22 09:02 anferico

@anferico, the problem is that in your case (inputs - baselines) is zero at the word embedding layer. This is probably because the same additional_forward_args are used for both baselines and inputs.
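
This matters because integrated gradients scales the gradient integral by the input-baseline difference, so a zero difference forces zero attributions no matter what the gradients are:

$$\mathrm{IG}_i(x) = (x_i - x'_i) \int_0^1 \frac{\partial F\big(x' + \alpha\,(x - x')\big)}{\partial x_i}\, d\alpha$$

where x is the layer activation for the input and x' the activation for the baseline.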

Captum doesn't offer a way to pass different additional_forward_args for baselines and inputs. As a workaround, you can probably handle the baseline specially inside your predict function, effectively giving it its own additional_forward_args.
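
A minimal sketch of that workaround, assuming predict is the forward wrapper passed to Captum and my_model accepts the arguments in this order (adapt to the notebook's actual signature); detecting the baseline via pad_token_id is an illustrative choice, not Captum API:

def predict(response_input_ids, query_input_ids, pooling_mode,
            attention_mask_for_responses, attention_mask_for_queries):
    # Illustrative: zero out attention on [PAD] positions so the baseline
    # forward pass uses its own attention mask, distinct from the
    # response's.
    response_mask = attention_mask_for_responses * \
        (response_input_ids != tokenizer.pad_token_id).long()
    return my_model(response_input_ids, query_input_ids, pooling_mode,
                    response_mask, attention_mask_for_queries)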

You can easily debug it with:

from captum.attr import LayerActivation

# Compare the word-embedding activations produced for the real response
# and for the baseline.
la = LayerActivation(predict, my_model.embeddings.word_embeddings)
la_inp = la.attribute(response_input_ids, additional_forward_args=(
        query_input_ids,
        pooling_mode,
        attention_mask_for_responses,
        attention_mask_for_queries
    ))
la_baseline = la.attribute(baseline_input_ids, additional_forward_args=(
        query_input_ids,
        pooling_mode,
        attention_mask_for_responses,
        attention_mask_for_queries
    ))

You'll see that la_inp == la_baseline.
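
A quick way to confirm this (assuming torch is in scope):

print(torch.allclose(la_inp, la_baseline))  # True => (inputs - baselines) is zero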

You can also avoid the multiplication by (inputs - baselines) altogether by setting the flag multiply_by_inputs=False:

from captum.attr import LayerIntegratedGradients

lig = LayerIntegratedGradients(predict, my_model.embeddings.word_embeddings,
                               multiply_by_inputs=False)
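
With multiply_by_inputs=False, the returned attributions are the gradient integrals without the (inputs - baselines) scaling, so they no longer collapse to zero. A sketch of the corresponding attribute call, reusing the argument names from the snippet above:

attributions = lig.attribute(
    response_input_ids,
    baselines=baseline_input_ids,
    additional_forward_args=(
        query_input_ids,
        pooling_mode,
        attention_mask_for_responses,
        attention_mask_for_queries
    ))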

NarineK avatar Mar 09 '22 18:03 NarineK