The message "yes" is blocked by the input rail.
The current implementation can block any message that contains "yes": if the LLM decides to repeat the message in its answer and responds with something like "No, the user message 'Yes' should not be blocked", the response itself contains "yes" and the message gets blocked.
The self check logic needs to be improved to:
- Generate a smaller number of tokens.
- Ignore the rest of the response if it starts with yes or no.
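
To illustrate the second point, here is a minimal sketch of a check that only looks at the first word of the LLM response. The function name `parse_yes_no` and its `default` parameter are hypothetical and not part of the existing code base; falling back to blocking when the answer is unclear is also just an assumption. Limiting the number of generated tokens is not shown, since that depends on how the LLM call is configured.

```python
import re


def parse_yes_no(response: str, default: bool = True) -> bool:
    """Return True if the response starts with "yes", False if it starts with "no".

    Hypothetical helper: anything after the first word is ignored.
    `default` is returned when the response starts with neither word
    (assumption: fail closed, i.e. treat unclear answers as "block").
    """
    # Skip leading whitespace, quotes, or punctuation, then require a whole
    # word "yes" or "no" (so "nothing" or "yesterday" do not match).
    match = re.match(r"\W*(yes|no)\b", response.strip().lower())
    if match is None:
        return default
    return match.group(1) == "yes"


# The response from the report above is no longer treated as "blocked":
should_block = parse_yes_no("No, the user message 'Yes' should not be blocked")
```

With this approach a quoted "Yes" later in the response has no effect, because only the leading word decides the result.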
Hi @drazvan, I would like to contribute to this issue. Do you think this could be a good first issue to tackle? Do you have any further relevant information that would help me look into it? If this is not a good first issue, can you point me to something you believe I can contribute to?
Hi @ajanitshimanga! I think we have this one already in progress. Can you try to pick up this one instead: https://github.com/NVIDIA/NeMo-Guardrails/issues/277? Thanks!
Fixed by #674