NeMo-Guardrails

docs: Add link to eval blog

Open mikemckiernan opened this issue 11 months ago • 3 comments

Description

Aditi's blog supplements the eval information we have in the docs for input and output rails.

Related Issue(s)

Checklist

  • [ ] I've read the CONTRIBUTING guidelines.
  • [ ] I've updated the documentation if applicable.
  • [ ] I've added tests if applicable.
  • [ ] @mentions of the person or team responsible for reviewing proposed changes.

mikemckiernan, May 08 '25 13:05

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 68.24%. Comparing base (b65cf0e) to head (3dfbbc4).

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #1180   +/-   ##
========================================
  Coverage    68.24%   68.24%           
========================================
  Files          161      161           
  Lines        15938    15938           
========================================
  Hits         10877    10877           
  Misses        5061     5061           
Flag Coverage Δ
python 68.24% <ø> (ø)

codecov-commenter, May 08 '25 13:05

Hi @mikemckiernan,

Actually, the evaluation documentation probably needs some changes. Right now we have two different types of evaluation tools, each with its own documentation:

  1. An eval tool and docs intended mainly for researchers, showing how to evaluate individual rails (e.g. dialogue, content moderation, fact-checking). This is the documentation you are changing in the current commit.
  2. An end-to-end evaluation for guardrail configs that support different rails, also called policy-based evaluation. The docs for this eval tool are here: docs/user-guides/eval/methodology.md

The blog post should be mentioned in the docs for #2, no? Any ideas for restructuring these two eval tools and their documentation would also be useful.

Thanks!

trebedea, May 08 '25 15:05