Medusa
Medusa copied to clipboard
About the Tree Sparsity
I noticed that the Medusa model uses a pre-defined parse tree structure for inference, and different models have their corresponding tree structures. What is the detailed procedure for defining this mask? Moreover, if I enhanced the decoding capability of the medusa heads, how could I adaptively adjust the tree structure? Is there any possible algorithms or formulations?