Embedding layer
Thanks for adapting this layer @kumarutkarsh1248! I have a bunch of comments---let me know what you think. As implemented, does this work in the mlpack-onnx networks you have put it into?
Actually, I couldn’t test this layer on onnx-mlpack because I need one more layer: positional encoding. There is another way to test this embedding layer on onnx-mlpack, but I’d rather adapt the positional encoding layer first and then test both together on onnx-mlpack.
Sounds good. Let's just get the gradient test working here, and then we can merge it. If there are other issues later when you use it with onnx-mlpack, we can address those then. :+1:
I handled the review comments. @kumarutkarsh1248 do you have time to take a look at this and make sure that you agree with how I changed things? Especially the output shape of the Embedding layer---I tried to make sure it was the same shape as the multihead attention layer requires.
Sorry for the slow response, and thanks for doing my part of the work.
The output shape of the Embedding layer looks correct; this is exactly the shape that MultiheadAttention expects as input.
I want to test this with onnx-mlpack but don't have enough time right now. I'll test it next month, and I think it's better to have some test cases for the Embedding and MultiheadAttention layers combined, since the Embedding layer is almost always used right before MultiheadAttention.
I’ll handle that part next month.
Not a bad idea. I added a simple test case including MultiheadAttention in 9f6a13a906; @kumarutkarsh1248, let me know what you think.
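For reference (the commit itself isn't quoted here), a rough idea of what such a combined Embedding + MultiheadAttention smoke test could look like is sketched below. This is a minimal sketch assuming mlpack-4-style layer names; the constructor arguments for `Embedding` and `MultiheadAttention` are assumptions for illustration, not the actual test added in 9f6a13a906.

```cpp
// Minimal sketch: compose an Embedding layer with MultiheadAttention in an
// FFN and run a short training call to exercise forward and backward passes.
// Layer constructor signatures below are assumed, not taken from the PR.
#include <mlpack.hpp>

using namespace mlpack;

int main()
{
  const size_t vocabSize = 50;   // number of distinct token indices
  const size_t embedDim  = 8;    // embedding dimension
  const size_t seqLen    = 10;   // sequence length
  const size_t numHeads  = 2;    // attention heads
  const size_t batchSize = 4;

  // Each column is one sequence of integer token indices.
  arma::mat input = arma::randi<arma::mat>(seqLen, batchSize,
      arma::distr_param(0, (int) vocabSize - 1));

  // Dummy regression target with the same shape as the attention output
  // (embedDim * seqLen rows per column), just to drive the backward pass.
  arma::mat target = arma::randn(embedDim * seqLen, batchSize);

  FFN<MeanSquaredError> model;
  model.Add<Embedding>(vocabSize, embedDim);        // assumed signature
  model.Add<MultiheadAttention>(seqLen, numHeads);  // assumed signature

  // A single short training run verifies that the two layers compose and
  // that gradients propagate through the embedding lookup.
  model.Train(input, target);

  return 0;
}
```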
That's perfect.