
Guarantee that images containing a Stable Diffusion watermark will not be included in training data

Open · 12joan opened this issue 3 years ago · 1 comment

Background

As per my understanding, the purpose of including an invisible watermark in generated images is to ensure that these images will not be included in the training data of future iterations of Stable Diffusion or other models, since this would be detrimental to the training process.

Recently, a project called NoAI has sprung up which allows artists to add a Stable Diffusion watermark to their artworks despite those artworks not being generated by Stable Diffusion. The intention of this is to prevent these artworks from being included in future Stable Diffusion training data because the training scripts will incorrectly conclude that the image is AI-generated.

As such, use of NoAI on an artwork constitutes an intentional and explicit opt-out of that artwork being used to train models like Stable Diffusion. Regardless of your thoughts on the ethicality of opt-in-by-default training data sets, I hope we can all agree that intentionally ignoring an explicit opt-out request, and using an artwork against the artist's expressed wishes, is an injustice against that artist. Honouring opt-out requests such as these would, I believe, be a reasonable compromise in exchange for artists becoming more accepting of AI image generators in general, as per Scanlon's theory of contractualism.

Question / Feature request

It's not clear to me from looking at the Stable Diffusion source code whether images bearing a Stable Diffusion watermark are already being excluded from training data. Can someone familiar with the training process weigh in on whether this is currently happening?

If watermarked images are not presently being excluded, I would strongly encourage the Stable Diffusion team to implement this behaviour, not only for the reasons for which the invisible watermark was originally conceived, but also for the reasons I have outlined above.
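For concreteness, such a filtering step could look roughly like the sketch below. This is only an illustration, not code from the Stable Diffusion repository: the real decoder would come from the invisible-watermark package (`imwatermark.WatermarkDecoder`), which the inference scripts use to embed their payload, and the dict-based image records and the `decode_watermark` stub here are assumptions made so the sketch is self-contained.

```python
# Sketch of a dataset-filtering pass that drops any image carrying the
# Stable Diffusion watermark (including images deliberately tagged by
# tools like NoAI, which is exactly the opt-out behaviour requested here).
# The decoder is stubbed out; a real pipeline would run the DWT-DCT
# decoder from the invisible-watermark package over the pixel data.

# Assumed payload string; treat as an illustration, not a verified constant.
SD_WATERMARK = b"StableDiffusionV1"

def decode_watermark(image: dict) -> bytes:
    """Stub standing in for imwatermark's WatermarkDecoder.decode().
    Here each 'image' is just a dict carrying a precomputed payload."""
    return image.get("watermark", b"")

def exclude_watermarked(images: list[dict]) -> list[dict]:
    """Keep only images whose decoded payload is not the SD watermark."""
    return [img for img in images if decode_watermark(img) != SD_WATERMARK]

dataset = [
    {"path": "photo.png", "watermark": b""},
    {"path": "sd_output.png", "watermark": SD_WATERMARK},   # AI-generated
    {"path": "noai_art.png", "watermark": SD_WATERMARK},    # NoAI opt-out
]

clean = exclude_watermarked(dataset)
print([img["path"] for img in clean])  # -> ['photo.png']
```

The key point is that the filter cannot (and should not try to) distinguish "genuinely AI-generated" from "NoAI-tagged" images: the watermark is treated as an unconditional exclusion signal either way.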

12joan avatar Jan 12 '23 10:01 12joan

I'm very unfamiliar with Stable Diffusion and this community, but you said that it should not train on its own generated outputs. What about training the model what NOT to do, like adding tokens for "bad hands", "deformed body", "extra limbs", "missing limbs", "extremely cursed"? I think it's a good idea to use the watermark to avoid reinforcing wrong biases in the model, but as far as I can see, a lot of dreambooth models are trained on "what not to do" almost as much as they're trained on what to do.

figloalds avatar Feb 23 '23 01:02 figloalds