leave comments for eval performance

Open sameelarif opened this issue 1 year ago • 4 comments

why

We want an easy way to view eval category performance

what changed

Added CI comments on the PR that log the performance of each eval category
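As a rough illustration of the idea, here is a minimal sketch of how per-category scores could be turned into a Markdown comment body. The `CategoryScores` type and the `formatEvalComment` function are hypothetical names chosen for this example, and the assumption that scores arrive as percentages keyed by category name is not taken from this PR; the actual CI workflow may format and post its comments differently.

```typescript
// Hypothetical shape (an assumption, not from the PR):
// eval category name -> score as a percentage between 0 and 100.
type CategoryScores = Record<string, number>;

// Build a Markdown comment body with one table row per eval category.
// Categories are sorted by name so the comment is stable across runs.
function formatEvalComment(scores: CategoryScores): string {
  const rows = Object.entries(scores)
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([category, score]) => `| ${category} | ${Math.round(score)}% |`);

  return [
    "## Eval Results",
    "",
    "| Category | Score |",
    "| --- | --- |",
    ...rows,
  ].join("\n");
}
```

The returned string could then be posted as a PR comment by whatever step the CI job already uses for commenting; that step is left out here because its details are not described in this PR.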

test plan

Run the existing evals and check whether CI leaves comments on the PR

sameelarif avatar Jan 07 '25 02:01 sameelarif

⚠️ No Changeset found

Latest commit: 331190f39606cf51a3c0d0c9847a6314c0a70d10

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

changeset-bot[bot] avatar Jan 07 '25 02:01 changeset-bot[bot]

🧪 E2E Test Results

✅ All E2E tests passed successfully

github-actions[bot] avatar Jan 07 '25 02:01 github-actions[bot]

🔄 Combination Eval Results

Score: 78% View detailed results

github-actions[bot] avatar Jan 07 '25 02:01 github-actions[bot]

🧪 E2E Test Results

✅ All E2E tests passed successfully

github-actions[bot] avatar Jan 08 '25 18:01 github-actions[bot]

closing, don't need rn

kamath avatar Jan 20 '25 01:01 kamath