How to report a security issue responsibly?
Have you searched existing issues? 🔎
- [X] I have searched and found no existing issues
Describe the bug
I have just found a potential security issue in the repo and want to know how I can report it to your team privately, thanks!
Reproduction
No response
BERTopic Version
Latest
Hi! I'm good either way, but please feel free to send it to me personally and I'll make sure to take it into account (you can find more info on contact here).
Note that if this relates to serialization, that is exactly why safetensors was implemented: loading pickle or PyTorch files carries an inherent risk of code execution. If it's something else, I would love to know more about it and see what we can do to resolve it 😄
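For readers unfamiliar with the code-execution risk mentioned here, the following is a minimal, harmless sketch using only the standard library. The `Payload` class is hypothetical and is not taken from BERTopic or from the reporter's finding. It shows that `pickle.loads` calls whatever callable the pickled object names in `__reduce__`, so the loader never gets back the object it expected.

```python
import pickle


class Payload:
    """Hypothetical class whose pickle names a callable to run at load time."""

    def __reduce__(self):
        # pickle stores "call int('42')" instead of the object's state.
        # An attacker could name any importable callable here; int is harmless.
        return (int, ("42",))


blob = pickle.dumps(Payload())

# Loading the bytes runs the stored call and returns its result.
loaded = pickle.loads(blob)
print(type(loaded).__name__, loaded)  # prints "int 42"
```

Formats such as safetensors avoid this class of problem because they store only tensor data and metadata, with no instructions to call anything on load.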
Yes, this is related to a PyTorch deserialization issue that can be exploited remotely through HuggingFace's demo code, which lets attackers abuse BERTopic to phish via HuggingFace repos. But if you plan to remove the PyTorch code paths and replace them with safetensors in future releases, it may not be necessary to send the details. I look forward to seeing the updates, thanks :-)
Thanks! I'm still a bit undecided about what to keep and what to drop. Currently, you can use pickle, pytorch, and safetensors to save the files (listed from least to most secure).
Although I personally only use safetensors, dropping pytorch would mean that models users have already saved in that format are no longer compatible with new releases, which would be a shame.
Oh, you may not need to drop pickle and pytorch. My finding only concerns the unsafe loading of the PyTorch model file. Attackers can abuse it to phish via HuggingFace repos, so victims who follow HuggingFace's demo code to fetch and load a BERTopic repo can be hit with RCE. I think you can fix it with a minimal code change rather than dropping support for pickle and pytorch. Let me send you the details via e-mail :-)
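The actual fix was shared privately and is not shown in this thread. As a general illustration of how a small change on the load path can block this kind of attack while keeping the format, here is a sketch of an allow-list unpickler using only the standard library. `RestrictedUnpickler`, `restricted_loads`, `SAFE_BUILTINS`, and `Malicious` are hypothetical names, and this is not BERTopic's code. For PyTorch files specifically, `torch.load` accepts a `weights_only=True` argument that serves a similar purpose.

```python
import builtins
import io
import pickle

# Hypothetical allow-list: the only globals a pickle is allowed to reference.
SAFE_BUILTINS = {"dict", "list", "set", "frozenset", "tuple", "range", "complex"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global (class/function).
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads, but refuses any global outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Malicious:
    def __reduce__(self):
        # Harmless stand-in for an attacker-chosen call.
        return (eval, ("1 + 1",))


# Plain data contains no globals, so it loads normally.
plain = restricted_loads(pickle.dumps({"topics": [0, 1, 2]}))

# The crafted pickle references builtins.eval, which is not allow-listed.
try:
    restricted_loads(pickle.dumps(Malicious()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

The design choice is to reject by default: anything not explicitly listed is refused before it can be called, so existing files that contain only plain data keep loading.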