
Guardrail mechanism for `UserContent`

Open johnsonr opened this issue 3 months ago • 2 comments

UserContent represents text coming into the system from users. This may be malicious or toxic. We should have the ability to apply consistent guardrails here.

johnsonr, Oct 25 '25 16:10

Hi! I would like to work on this if it's okay. :) I think the best approach would be a chain of pluggable guardrails applied to user input, ranging from simple static checks to more complex model-driven ones.
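To make the chain idea concrete, here is a minimal sketch in Java. Every type name in it (`Guardrail`, `GuardrailResult`, `MaxLengthGuardrail`, `DenyListGuardrail`, `GuardrailChain`) is hypothetical and not part of embabel-agent, and it takes a plain `String` because this thread does not describe the actual shape of `UserContent`.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical types for illustration only; none of these exist in embabel-agent.

/** Outcome of a single guardrail check. */
record GuardrailResult(boolean allowed, String reason) {
    static GuardrailResult allow() { return new GuardrailResult(true, ""); }
    static GuardrailResult block(String reason) { return new GuardrailResult(false, reason); }
}

/** A pluggable check applied to raw user text. */
interface Guardrail {
    GuardrailResult check(String userText);
}

/** Static guardrail: rejects input longer than a fixed limit. */
record MaxLengthGuardrail(int maxChars) implements Guardrail {
    public GuardrailResult check(String userText) {
        return userText.length() <= maxChars
                ? GuardrailResult.allow()
                : GuardrailResult.block("input exceeds " + maxChars + " characters");
    }
}

/** Static guardrail: rejects input containing any denied phrase (case-insensitive). */
record DenyListGuardrail(List<String> deniedPhrases) implements Guardrail {
    public GuardrailResult check(String userText) {
        String lower = userText.toLowerCase(Locale.ROOT);
        for (String phrase : deniedPhrases) {
            if (lower.contains(phrase.toLowerCase(Locale.ROOT))) {
                return GuardrailResult.block("denied phrase: " + phrase);
            }
        }
        return GuardrailResult.allow();
    }
}

/** Runs guardrails in order and stops at the first one that blocks. */
record GuardrailChain(List<Guardrail> guardrails) implements Guardrail {
    public GuardrailResult check(String userText) {
        for (Guardrail guardrail : guardrails) {
            GuardrailResult result = guardrail.check(userText);
            if (!result.allowed()) {
                return result;
            }
        }
        return GuardrailResult.allow();
    }
}
```

Because the chain is itself a `Guardrail`, chains can be nested, and cheap static checks can be listed first so that a model-driven guardrail further down only runs on input that passed them.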

This would also introduce some latency, especially if we decide to incorporate LLM-based guardrails, so I would suggest a caching mechanism as well.
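One way the caching could look, as a rough sketch: a decorator that remembers the verdict of an expensive check for each exact input string. `CachingCheck` is a hypothetical name, and the expensive check is modelled as a plain `Predicate<String>` (true meaning "allowed") rather than any real embabel-agent or LLM API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

/** Hypothetical decorator: memoizes the verdict of an expensive check per exact input text. */
final class CachingCheck implements Predicate<String> {
    private final Predicate<String> delegate;
    private final Map<String, Boolean> verdicts = new ConcurrentHashMap<>();

    CachingCheck(Predicate<String> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean test(String userText) {
        // computeIfAbsent invokes the delegate only on a cache miss.
        return verdicts.computeIfAbsent(userText, delegate::test);
    }
}
```

This only helps when the same text is seen more than once, and the map grows without bound, so a real implementation would probably want size limits and expiry, and possibly normalization of the input before lookup.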

harinda05, Nov 18 '25 12:11