embabel-agent
Guardrail mechanism for `UserContent`
`UserContent` represents text coming into the system from users. That text may be malicious or toxic, so we should be able to apply guardrails to it consistently.
Hi! I would like to work on this if it's okay. :) I think the best approach would be to create a chain of pluggable guardrails, ranging from simple static checks to more complex model-driven ones, that can be applied to user input.
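To make the chain idea concrete, here is a rough sketch of what I have in mind. All of the names (`Guardrail`, `GuardrailResult`, `MaxLengthGuardrail`, `BlocklistGuardrail`, `GuardrailChain`) are hypothetical and not part of the existing embabel-agent API, and it works on a plain `String` rather than the real `UserContent` type. It assumes Java 17.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical types for illustration only; not embabel-agent API.

/** Outcome of a single guardrail check. */
record GuardrailResult(boolean allowed, String reason) {
    static GuardrailResult allow() {
        return new GuardrailResult(true, "");
    }

    static GuardrailResult block(String reason) {
        return new GuardrailResult(false, reason);
    }
}

/** One pluggable check over raw user text. A model-driven check could implement this too. */
interface Guardrail {
    GuardrailResult check(String userText);
}

/** Static guardrail: rejects input longer than a fixed number of characters. */
final class MaxLengthGuardrail implements Guardrail {
    private final int maxChars;

    MaxLengthGuardrail(int maxChars) {
        this.maxChars = maxChars;
    }

    @Override
    public GuardrailResult check(String userText) {
        return userText.length() <= maxChars
                ? GuardrailResult.allow()
                : GuardrailResult.block("input exceeds " + maxChars + " characters");
    }
}

/** Static guardrail: rejects input containing any blocked phrase, ignoring case. */
final class BlocklistGuardrail implements Guardrail {
    private final List<String> phrases;

    BlocklistGuardrail(List<String> phrases) {
        this.phrases = phrases.stream().map(p -> p.toLowerCase(Locale.ROOT)).toList();
    }

    @Override
    public GuardrailResult check(String userText) {
        String lower = userText.toLowerCase(Locale.ROOT);
        for (String phrase : phrases) {
            if (lower.contains(phrase)) {
                return GuardrailResult.block("input contains blocked phrase: " + phrase);
            }
        }
        return GuardrailResult.allow();
    }
}

/** Runs guardrails in order and stops at the first one that blocks. */
final class GuardrailChain implements Guardrail {
    private final List<Guardrail> guardrails;

    GuardrailChain(List<Guardrail> guardrails) {
        this.guardrails = List.copyOf(guardrails);
    }

    @Override
    public GuardrailResult check(String userText) {
        for (Guardrail guardrail : guardrails) {
            GuardrailResult result = guardrail.check(userText);
            if (!result.allowed()) {
                return result;
            }
        }
        return GuardrailResult.allow();
    }
}
```

Because the chain is itself a `Guardrail`, chains can be nested, and putting the cheap static checks first means the expensive ones only run on input that has already passed them.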
Also, I think this would introduce some latency, especially with LLM-based guardrails if we decide to include them, so I would suggest a caching mechanism as well.
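For the caching part, one possible shape is a decorator that wraps any slow guardrail and remembers verdicts by input text. This is only a sketch: `Guardrail`, `GuardrailResult`, and `CachingGuardrail` are hypothetical names, the cache is unbounded with no expiry, and a real version would probably want a size limit and a TTL. It assumes Java 17.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical types for illustration only; not embabel-agent API.

/** Outcome of a single guardrail check. */
record GuardrailResult(boolean allowed, String reason) {}

/** One pluggable check over raw user text. */
interface Guardrail {
    GuardrailResult check(String userText);
}

/** Decorator that caches the delegate's verdict per exact input text. */
final class CachingGuardrail implements Guardrail {
    private final Guardrail delegate;
    // Unbounded in this sketch; entries are never evicted.
    private final Map<String, GuardrailResult> cache = new ConcurrentHashMap<>();

    CachingGuardrail(Guardrail delegate) {
        this.delegate = delegate;
    }

    @Override
    public GuardrailResult check(String userText) {
        GuardrailResult cached = cache.get(userText);
        if (cached != null) {
            return cached;
        }
        // The slow call happens outside any map lock. Two threads may both
        // evaluate the same new text once; the first stored verdict wins.
        GuardrailResult fresh = delegate.check(userText);
        GuardrailResult previous = cache.putIfAbsent(userText, fresh);
        return previous != null ? previous : fresh;
    }
}
```

Caching by exact text only helps when identical inputs repeat, and it assumes the wrapped guardrail is deterministic enough that reusing an earlier verdict is acceptable.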