LTP
[KDD'22] Learned Token Pruning for Transformers
Hi, I found I cannot run learned token pruning whether or not I install `transformers`. If I install `transformers`, it raises the error `ValueError: Some specified...`
when I run `python run.py --arch ltp-base --task SST2 --restore pretrained/bert-base-uncased-SST-2 --lr 2e-5 --temperature 2e-3 --lambda_threshold 0.1 --weight_decay 0 --bs 64 --masking_mode soft --epoch 10 --save_step 100 --no_load` Some specified...
Hey there. After reading the code, I am confused about how the computation cost can be reduced by masking more tokens. Did I miss anything? P.S. I see the FLOPs...
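For intuition on why dropping tokens cuts compute: self-attention FLOPs scale roughly quadratically with sequence length while the projections and feed-forward layers scale linearly, so keeping fewer tokens shrinks every term. A rough per-layer estimate (a hypothetical helper, not code from this repo, counting only the dominant matmuls) could look like:

```python
def layer_flops(seq_len, hidden=768, ffn=3072, heads=12):
    """Rough FLOPs for one BERT-base encoder layer.

    Counts only the dominant matrix multiplies; a real profiler
    would also include bias adds, softmax, and layer norms.
    """
    qkv = 3 * seq_len * hidden * hidden                        # Q, K, V projections
    attn = 2 * heads * seq_len * seq_len * (hidden // heads)   # QK^T and AV (quadratic in seq_len)
    proj = seq_len * hidden * hidden                           # attention output projection
    ff = 2 * seq_len * hidden * ffn                            # two feed-forward matmuls
    return 2 * (qkv + attn + proj + ff)                        # 1 multiply-add = 2 FLOPs

full = layer_flops(128)
pruned = layer_flops(64)        # half the tokens kept after pruning
print(pruned / full)            # slightly below 0.5, thanks to the quadratic attention term
```

The linear terms halve exactly when half the tokens are kept, while the attention term quarters, so the overall ratio dips below 0.5; the effect grows with sequence length.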
During hard-pruning inference, tokens below the threshold are discarded and do not enter the feed-forward layer's computation, but when entering the feed-forward layer after normalization and...
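A minimal, dependency-free sketch of the hard-masking step being described (the names `hidden_states`, `scores`, and `threshold` are illustrative, not the repo's exact variables): tokens whose importance score falls below the threshold are gathered out, so the feed-forward layer sees a shorter sequence:

```python
def hard_prune(hidden_states, scores, threshold):
    """Keep only tokens whose importance score is at or above the threshold.

    hidden_states: list of per-token vectors (length = seq_len)
    scores:        list of per-token importance scores
    Returns the pruned sequence and the surviving token indices.
    """
    keep = [i for i, s in enumerate(scores) if s >= threshold]
    return [hidden_states[i] for i in keep], keep

tokens = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
scores = [0.9, 0.05, 0.4, 0.02]
pruned, kept = hard_prune(tokens, scores, threshold=0.1)
print(kept)  # [0, 2] -- only two of the four tokens enter the feed-forward layer
```

During soft-mask training, by contrast, all tokens stay in the sequence and are merely down-weighted, which is why the FLOPs savings only materialize at hard-pruning inference time.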
Can you show how you evaluate the model performance with `attention_mask`? According to this line: https://github.com/kssteven418/LTP/blob/8ab31a623fb71c5f4f8208e878097f214484e848/src/transformers/models/ltp/modeling_ltp.py#L305C27-L305C27 the `attention_mask` is never used outside the for loop. So, I think...
https://github.com/kssteven418/LTP/blob/f1d5ec88aba913de5e2b4aa502af9cf0ab7bb13f/src/transformers/models/ltp/modeling_ltp.py#L247

```python
if self.training and not self.hard_masking:
    if pruner_outputs is not None:
        threshold, pruning_scores = pruner_outputs['threshold'], pruner_outputs['scores']
        self.mask = torch.sigmoid((pruning_scores - threshold) / self.temperature)
        layer_output = layer_output * self.mask.unsqueeze(-1)
```
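The soft mask in that snippet is a temperature-scaled sigmoid of the score-threshold gap. A dependency-free sketch of the same computation (plain Python standing in for torch):

```python
import math

def soft_mask(scores, threshold, temperature):
    """Sigmoid soft mask: near 1 for scores well above the threshold,
    near 0 for scores well below it; temperature controls how sharp
    the transition is (smaller = closer to a hard 0/1 mask)."""
    return [1.0 / (1.0 + math.exp(-(s - threshold) / temperature))
            for s in scores]

masks = soft_mask([0.9, 0.1, 0.5], threshold=0.5, temperature=0.1)
# a score far above the threshold gets a mask near 1, far below
# near 0, and a score exactly at the threshold gets 0.5
```

Because the sigmoid is differentiable, gradients flow to both the pruning scores and the learned threshold during training, which a hard 0/1 mask would not allow.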
I am trying to train an LTP model for long documents, but where can I get a pretrained model with a max-seq-length over 512? As far as I know, pretrained models...
# 🖥 Benchmarking `transformers` Hi there, when I run one of the examples in the text classification folder and pass `max_seq_length=1024` to the model, I get the following warning,...
FLOPs
Since it is a dynamic transformer, the GFLOPs differ for each input instance. How should the FLOPs of the entire model be calculated? Take the average FLOPs over all validation...
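One common convention for dynamic models is exactly what the question suggests: measure per-instance FLOPs on the validation set and report the mean. A sketch, where the per-example counts are hypothetical placeholders for whatever a FLOPs counter would produce:

```python
def average_flops(flops_per_example):
    """Mean FLOPs over a validation set. For a dynamic model each
    example may keep a different number of tokens, so the
    per-example counts differ and only the average is reported."""
    return sum(flops_per_example) / len(flops_per_example)

# hypothetical per-example counts (in GFLOPs) from a validation run
per_example = [1.0, 0.5, 1.5, 1.0]
print(average_flops(per_example))  # 1.0
```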
It seems that the model tends to keep more tokens when the threshold is fixed (the hard training step), because the L1 loss can no longer be taken into account. How to solve...