In-Line Obfuscation and Content Moderation Cannot Log Economically - Constraints on Reporting
In designing the training and inference pipelines available to users, protection against content liability can be implemented on the input side (preventing disallowed training data and prompts from reaching the preprocessor) or on the output side (screening assembled responses before they are returned).
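As a rough illustration (none of these function names exist in the codebase; they are placeholders for whatever classifier or rule set we end up with), the two hook points might look like:

```python
# Sketch only: is_disallowed(), moderate_input() and moderate_output() are
# hypothetical placeholders, not existing Open Assistant APIs.
from typing import Optional

BLOCKLIST = {"example-disallowed-term"}  # placeholder policy

def is_disallowed(text: str) -> bool:
    """Stand-in for a real classifier or rules engine."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)

def moderate_input(prompt: str) -> Optional[str]:
    """Input-side hook: stop disallowed prompts/training data before the preprocessor."""
    return None if is_disallowed(prompt) else prompt

def moderate_output(response: str) -> Optional[str]:
    """Output-side hook: screen the assembled response before it is returned."""
    return None if is_disallowed(response) else response
```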
Within the context of implementation, maintenance, and improvement, these processes have the potential to expose a lot of sensitive user-generated information (someone's searches and responses can be very private, and so can the data used for fine-tuning). For that reason, centralized logging may not be a good solution; a better approach may be a 'build option' or similar that includes a logging platform only when a flag is set (for example Elasticsearch/Kibana, stored locally or attached to a paid cluster).
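One way to picture that 'build option' is a logger that stays a no-op unless the operator explicitly opts in; the `OA_MODERATION_LOG` variable and the local file target below are assumptions for this sketch, and a real build could attach an Elasticsearch/Kibana handler behind the same switch:

```python
# Sketch of opt-in moderation logging. The OA_MODERATION_LOG environment
# variable and the local log path are assumptions, not existing flags.
import logging
import os

def build_moderation_logger() -> logging.Logger:
    logger = logging.getLogger("moderation")
    logger.setLevel(logging.INFO)
    if os.environ.get("OA_MODERATION_LOG") == "local":
        # Explicit opt-in: write only to a local file the operator controls.
        logger.addHandler(logging.FileHandler("moderation.log"))
    else:
        # Default: no-op sink, nothing is persisted anywhere.
        logger.addHandler(logging.NullHandler())
    return logger
```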
Further discussion may be warranted to address the following:
- System logging architecture
- Pluggable build system
- Content moderation opportunities/hooks (input/output/negative-reinforcement)
I would like to focus this discussion on solutions external to the nets themselves. What can we do in the training pipeline? The inference pipeline? For data at rest? And how do we allow users to report moderation failures while keeping a transparent no-log default, with explicit opt-in logging of searches/responses and training data/metadata?
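For the reporting side, one possible shape (all field names here are made up, and how or whether a report ever leaves the machine is out of scope) is a report object that carries only metadata by default and includes the search/response content only when the user explicitly opts in:

```python
# Sketch of a user-initiated moderation-failure report. Field names are
# hypothetical; nothing here is persisted or transmitted on its own.
from dataclasses import dataclass, field
from typing import Optional
import time

@dataclass
class ModerationReport:
    category: str                   # e.g. "filter-missed-content", "filter-false-positive"
    timestamp: float = field(default_factory=time.time)
    prompt: Optional[str] = None    # populated only with explicit user consent
    response: Optional[str] = None  # populated only with explicit user consent

def build_report(category: str, prompt: str, response: str,
                 include_content: bool = False) -> ModerationReport:
    """Metadata-only by default; content is attached only when the user opts in."""
    if include_content:
        return ModerationReport(category, prompt=prompt, response=response)
    return ModerationReport(category)
```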
Notes: RE: "I feel uncomfortable having to monitor Open Assistant loaded on people's machines. Not because I think CSAM prevention is unimportant, but because surveillance tech could be abused."
We don't have to monitor in order to filter. There's no need to log findings; we just prevent them, to some extent, from getting into the training/inference stream. Even if they were logged, where would they log to? OpenAssistant could be deployed to someone's home server or PC, so who's gonna pay for the terabytes of logs that would generate?
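To make the "filter without logging" point concrete, here is a minimal sketch over a training/inference stream, assuming some `is_disallowed` check is supplied (any classifier or rule set would do); flagged items are dropped in memory and nothing about them is written anywhere:

```python
# Sketch: filter a stream without recording what was dropped.
from typing import Callable, Iterable, Iterator

def filtered_stream(examples: Iterable[str],
                    is_disallowed: Callable[[str], bool]) -> Iterator[str]:
    """Yield only examples that pass the check; drop the rest silently."""
    for example in examples:
        if not is_disallowed(example):
            yield example  # flagged items are skipped; no record, no disk write
```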
Thank you for creating this. Let's discuss and see how we can have this as an option if people want to use it :)