
How to report a security issue responsibly?

Open zpbrent opened this issue 1 year ago • 4 comments

Have you searched existing issues? 🔎

  • [X] I have searched and found no existing issues

Describe the bug

I have just found a potential security issue in the repo and want to know how I can report it to your team privately, thanks!

Reproduction

No response

BERTopic Version

Latest

zpbrent avatar Jan 12 '25 01:01 zpbrent

Hi! I'm good either way, but please feel free to send it to me personally and I'll make sure to take it into account (you can find more info on contact here).

Note that if this relates to serialization, that is exactly why safetensors was implemented: pickle and PyTorch serialization have inherent issues with code execution. If it's something else, I would love to know more about it and see what we can do to resolve it 😄

MaartenGr avatar Jan 17 '25 11:01 MaartenGr
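To illustrate the code-execution issue with pickle mentioned above, here is a minimal sketch using only the Python standard library. It is not BERTopic's code and not the reporter's finding; `Payload`, `AllowlistUnpickler`, and `safe_loads` are hypothetical names chosen for this example, and the harmless `eval("6 * 7")` stands in for whatever an attacker would actually run. The allowlisting approach follows the "restricting globals" pattern from the `pickle` documentation.

```python
import io
import pickle
from collections import OrderedDict


class Payload:
    """Hypothetical malicious object: unpickling it calls eval()."""

    def __reduce__(self):
        # pickle stores "call eval('6 * 7')" instead of the object itself.
        return (eval, ("6 * 7",))


class AllowlistUnpickler(pickle.Unpickler):
    """Resolves only explicitly allowlisted globals; rejects everything else."""

    ALLOWED = {("collections", "OrderedDict")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def safe_loads(data):
    """Like pickle.loads, but restricted to the allowlist above."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


malicious = pickle.dumps(Payload())

# A plain load executes the embedded call and returns its result.
result = pickle.loads(malicious)

# The restricted loader refuses to resolve builtins.eval.
try:
    safe_loads(malicious)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

# Plain data built from allowlisted types still loads normally.
benign = safe_loads(pickle.dumps(OrderedDict(topics=[0, 1, 2])))
```

Here `result` is the value computed by `eval` rather than a `Payload` instance, which shows that loading an untrusted pickle runs code chosen by whoever produced the file. Formats that store only tensor data and metadata, such as safetensors, avoid this class of problem by design.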

Yes, this is related to a PyTorch deserialization issue that can be exploited remotely through HuggingFace's demo code, enabling attackers to abuse BERTopic to phish via HuggingFace repos. But if you plan to remove the PyTorch format and replace it with safetensors in future releases, it does not seem necessary to send the details. I would like to see the new updates, thanks :-)

zpbrent avatar Jan 17 '25 12:01 zpbrent

Thanks! I'm still a bit undecided as to what to keep/drop. Currently, you can use pickle, pytorch, and safetensors to save the files (listed from least to most secure).

Although I personally only use safetensors, dropping pytorch would mean that users' models saved in that format are no longer compatible with any new releases, which would be a shame.

MaartenGr avatar Jan 22 '25 11:01 MaartenGr

Oh, you may not need to drop pickle and pytorch. My finding only concerns the unsafe loading of the PyTorch model file, which attackers can abuse to phish via HuggingFace repos: victims who follow HuggingFace's demo code to fetch and load a BERTopic repo can be hit with remote code execution (RCE). I think you can fix it with a minimal code change rather than dropping support for pickle and pytorch. Let me send you the details via e-mail :-)

zpbrent avatar Jan 22 '25 12:01 zpbrent