TracIn
Q: Applicability to sequence tagging
Hi @frederick0329, for sequence tagging (e.g. NER), the model has to predict a label for each token in the sequence of every test sample. In this case the loss is averaged across tokens, and the gradients of the last FFN layer can still be computed. I have three questions:
- Do you think the TracIn computations would require any significant change in this case?
- Would approximate nearest neighbor search be usable in this case?
- Would the fast random approximation (Appendix F) be usable as well?
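
To make the setup concrete, here is a minimal sketch of what I have in mind. It is only my assumption of how this could look, not code from the repo: the last layer is taken to be a bias-free softmax layer with weights `W`, the token hidden states are treated as fixed across checkpoints for simplicity, and the names `last_layer_grad` and `tracin_score` are hypothetical. The gradient of the token-averaged cross-entropy with respect to `W` is the mean over tokens of the outer product of the hidden state and the prediction error, and the score sums the learning-rate-weighted dot products of train and test gradients over checkpoints.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def last_layer_grad(hidden, labels, W):
    """Gradient of the token-averaged cross-entropy w.r.t. W.

    hidden: (T, d) token representations fed to the last layer
    labels: (T,)   integer tag ids, one per token
    W:      (d, C) last-layer weights (assumed bias-free)
    """
    probs = softmax(hidden @ W)                # (T, C)
    onehot = np.eye(W.shape[1])[labels]        # (T, C)
    err = probs - onehot                       # per-token d(loss)/d(logits)
    return hidden.T @ err / hidden.shape[0]    # (d, C), averaged over tokens

def tracin_score(train, test, checkpoints, lrs):
    """Sum over checkpoints of lr * <grad(train), grad(test)>.

    train, test: (hidden, labels) pairs for one sequence each.
    Assumption: hidden states are reused for every checkpoint.
    """
    score = 0.0
    for W, lr in zip(checkpoints, lrs):
        g_train = last_layer_grad(train[0], train[1], W)
        g_test = last_layer_grad(test[0], test[1], W)
        score += lr * np.sum(g_train * g_test)
    return float(score)
```

Since each sequence still maps to a single flattened gradient vector of size `d * C`, my guess is that nearest neighbor lookup and random projection would apply to that vector just as in the classification case, but I would appreciate confirmation.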