rfcs: graph api: support swish operation
@mgouicem Thanks for the review. I have incorporated the comments into the RFC and left the conclusion section open for discussion.
I would actually advocate for using a composition of simpler ops (option 1) ...
Thanks. I should have mentioned that we already have this composition for swish in our code (link). It has indeed worked for some requests from frameworks, but we also see some issues, as I mentioned in the cons of option 1.
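To make the two options concrete, here is a minimal sketch in plain Python. It is not oneDNN code: `sigmoid`, `multiply`, `swish_composed`, and `swish_single_op` are hypothetical stand-ins for the operations discussed here, and the `alpha` parameter is an assumption about what a dedicated op might expose.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def multiply(a: float, b: float) -> float:
    """Stand-in for an elementwise multiply operation."""
    return a * b


def swish_composed(x: float) -> float:
    """Option 1: swish built from two simpler ops (sigmoid, then multiply)."""
    return multiply(x, sigmoid(x))


def swish_single_op(x: float, alpha: float = 1.0) -> float:
    """Hypothetical dedicated op: x * sigmoid(alpha * x) in one call."""
    return x * sigmoid(alpha * x)


if __name__ == "__main__":
    for v in (-2.0, 0.0, 1.0, 3.5):
        print(v, swish_composed(v), swish_single_op(v))
```

With `alpha` left at 1, both functions return the same values; the difference is only in how many operations the caller has to put into the graph.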
As an example, just search swish in HuggingFace github.
Thanks for the link. I also copied it into the RFC. :) For these cases, if framework developers want to optimize with oneDNN, they will have to detect the pattern, rewrite it with PyTorch's SiLU operation, and then call oneDNN to optimize the SiLU operation, leaving aside further fusion into other operations like matmul or conv. So the question remains whether we want to use two or more operations to optimize SiLU or a single dedicated swish operation.
There are often "equivalent" formulas that are used in the wild.
Yes, this is a really good point. Using a composition of smaller operations gives us the flexibility to support more variants without breaking or adding API. I also added this to the proposals.
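For illustration, a few algebraically equivalent ways of writing swish that a pattern matcher would have to recognize. This is only a sketch: the function names are hypothetical, and the three forms are examples chosen here, not a list taken from any particular framework.

```python
import math


def swish_mul_sigmoid(x: float) -> float:
    """x * sigmoid(x), with sigmoid written out explicitly."""
    return x * (1.0 / (1.0 + math.exp(-x)))


def swish_divide(x: float) -> float:
    """x / (1 + exp(-x)): the same function, with no explicit sigmoid."""
    return x / (1.0 + math.exp(-x))


def swish_tanh(x: float) -> float:
    """0.5 * x * (1 + tanh(x / 2)), using sigmoid(x) = 0.5 * (1 + tanh(x / 2))."""
    return 0.5 * x * (1.0 + math.tanh(x / 2.0))


if __name__ == "__main__":
    # Inputs are kept moderate: math.exp(-x) overflows for very negative x
    # in the first two forms.
    for v in (-5.0, -1.0, 0.0, 0.5, 3.0):
        a, b, c = swish_mul_sigmoid(v), swish_divide(v), swish_tanh(v)
        assert math.isclose(a, b, rel_tol=1e-12, abs_tol=1e-12)
        assert math.isclose(a, c, rel_tol=1e-12, abs_tol=1e-12)
    print("all three forms agree on the sample inputs")
```

Each form produces a different graph of primitive operations, which is why a composition-based approach can cover them without new API, while a single dedicated op needs each variant to be detected and rewritten first.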
@mgouicem Could you please take another look at this? The conclusion part has been updated in the last commit (I will squash the commits once this is ready for merge). Thank you!