SparseTransformer
What does VarLengthMultiheadSA do to a SparseTrTensor input?
Hi, this project is very helpful, thank you. I'd like to know what operations VarLengthMultiheadSA performs and how to use this module to build a network. Could you provide an example? The two projects you mentioned do not use this module to build a transformer-based network. Thank you again.