Add submissions for BERT_IdentifyIA_HF and RoBERTa_IdentifyIA_HF detectors
Hi @liamdugan 👋
This PR includes two new submissions for evaluation:
- BERT_IdentifyIA_HF
- RoBERTa_IdentifyIA_HF
Both detectors were fine-tuned on a Kaggle dataset (https://www.kaggle.com/datasets/shanegerami/ai-vs-human-text) and include predictions.json and metadata.json files following the leaderboard structure.
Apologies for the earlier submission issues; the folder structure has been fixed according to your guidance.
Thanks again for your help!
Eval run succeeded! Link to run: link
Here are the results of the submission(s):
RoBERTa_IdentifyIA_HF
Release date: 2025-02-01
I've committed detailed results of this detector's performance on the test set to this PR.
> [!WARNING]
> Failed to find threshold values that achieve False Positive Rate(s): (['1%']) on all domains. This submission will not appear in the main leaderboard for those FPR values; it will only be visible within the splits in which the target FPR was achieved.

On the RAID dataset as a whole (aggregated across all generation models, domains, decoding strategies, repetition penalties, and adversarial attacks), it achieved an AUROC of 61.27 and a TPR of 20.66% at FPR=5%. Without adversarial attacks, it achieved an AUROC of 60.74 and a TPR of 22.61% at FPR=5%.
BERT_IdentifyIA_HF
Release date: 2025-02-01
I've committed detailed results of this detector's performance on the test set to this PR.
> [!WARNING]
> Failed to find threshold values that achieve False Positive Rate(s): (['5%', '1%']) on all domains. This submission will not appear in the main leaderboard for those FPR values; it will only be visible within the splits in which the target FPR was achieved.
If all looks well, a maintainer will come by soon to merge this PR and your entry/entries will appear on the leaderboard. If you need to make any changes, feel free to push new commits to this PR. Thanks for submitting to RAID!
Hi @Gwyn9, just to add a bit more context for this evaluation result: it seems there does not exist a threshold at which your classifiers achieve 99% accuracy on human-written text (i.e., a false positive rate of at most 1%) across all domains. I suggest investigating the results.json file to see which domains your classifier has issues on.
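As a starting point, a minimal sketch of that kind of per-domain scan. Note the field names (`domain -> {"fpr": ...}`) are illustrative assumptions, not the actual RAID results.json schema, so adapt it to whatever structure you find in the file:

```python
def domains_below_target(results, target_fpr=0.01):
    """Return domains where the achieved FPR exceeds the target.

    `results` is assumed to map domain name -> metrics dict with an
    achieved false positive rate under the key "fpr" (hypothetical
    schema; check the real results.json layout before using).
    """
    return sorted(d for d, m in results.items() if m["fpr"] > target_fpr)

# Toy example with made-up numbers:
scores = {
    "news": {"fpr": 0.008},
    "reviews": {"fpr": 0.034},
    "wiki": {"fpr": 0.012},
}
print(domains_below_target(scores))  # ['reviews', 'wiki']
```

Domains that show up in that list are the ones dragging the all-domain threshold search below the 99% human-text accuracy bar.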
I'm happy to answer any more questions if you have them.
Hi @liamdugan 👋, I’m submitting two updated detector versions (V2) for evaluation on RAID. These models are fine-tuned versions of our previous submissions, now uploaded to Hugging Face and evaluated locally before submission.
Models:
- KewynG/Bert_IdentifyIA_V2
- KewynG/RoBERTa_IdentifyIA_V2
Changes:
- Retrained on a balanced and extended dataset derived from RAID subsets.
- Improved cleaning, tokenizer alignment, and balanced class sampling.
- Adjusted hyperparameters and checkpoints for better AUROC and recall stability.
Submission files:
- leaderboard/submissions/BERT_IdentifyIA_HF_V2/
- leaderboard/submissions/RoBERTa_IdentifyIA_HF_V2/
These submissions follow the same format as previous ones (metadata.json + predictions.json), now corresponding to the V2 detectors on Hugging Face.
Thanks again for maintaining the RAID benchmark; looking forward to seeing how these updated models perform compared to the previous V1 results.
It looks like this eval run failed. Please check the workflow logs to see what went wrong, then push a new commit to your PR to rerun the eval.
Hi @liamdugan 👋,
I've pushed an update to fix the UTF-8 BOM issue that caused the previous evaluation to fail during the hydrate.py decoding step.
The affected files were the metadata.json and predictions.json for both:
- BERT_IdentifyIA_HF_V2
- RoBERTa_IdentifyIA_HF_V2
They have now been re-saved using UTF-8 (no BOM) encoding and recommitted. Could you please re-run the evaluation for these updated submissions?
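For anyone who hits the same problem, this is roughly the re-save step that was applied (a sketch; the submission paths are illustrative, and it simply drops a leading UTF-8 BOM without touching the rest of the file):

```python
from pathlib import Path

# UTF-8 BOM as raw bytes.
BOM = b"\xef\xbb\xbf"

def strip_bom(path):
    """Rewrite `path` as plain UTF-8 (no BOM) if a BOM is present."""
    p = Path(path)
    data = p.read_bytes()
    if data.startswith(BOM):
        p.write_bytes(data[len(BOM):])

# Example (hypothetical paths matching this PR's layout):
# strip_bom("leaderboard/submissions/BERT_IdentifyIA_HF_V2/metadata.json")
# strip_bom("leaderboard/submissions/BERT_IdentifyIA_HF_V2/predictions.json")
```

Saving with an editor set to "UTF-8" rather than "UTF-8 with BOM" achieves the same thing.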
Thanks again for your help and for maintaining RAID! 🚀
It looks like this eval run failed. Please check the workflow logs to see what went wrong, then push a new commit to your PR to rerun the eval.
Hello @Gwyn9 it seems like both the BERT_IdentifyIA_HF_V2 and RoBERTa_IdentifyIA_HF_V2 metadata.json and predictions.json files were empty (0 lines). It looks like while editing the encoding you may have erased the contents of the files as well. Can you reupload them with the original contents in the new encoding?
Hi @liamdugan,
Restored the full predictions.json files for BERT_IdentifyIA_HF_V2 and RoBERTa_IdentifyIA_HF_V2. Both are ~58 MB, valid UTF-8 (BOM removed), and verified via json.tool. Rebased cleanly and pushed to main; the workflow is ready for automatic re-evaluation.
Thank you!!
Eval run succeeded! Link to run: link
Here are the results of the submission(s):
RoBERTa_IdentifyIA_HF_V2
Release date: 2025-02-01
I've committed detailed results of this detector's performance on the test set to this PR.
On the RAID dataset as a whole (aggregated across all generation models, domains, decoding strategies, repetition penalties, and adversarial attacks), it achieved an AUROC of 79.26, with a TPR of 40.50% at FPR=5% and 19.97% at FPR=1%. Without adversarial attacks, it achieved an AUROC of 84.04, with a TPR of 45.51% at FPR=5% and 22.91% at FPR=1%.
BERT_IdentifyIA_HF_V2
Release date: 2025-02-01
I've committed detailed results of this detector's performance on the test set to this PR.
On the RAID dataset as a whole (aggregated across all generation models, domains, decoding strategies, repetition penalties, and adversarial attacks), it achieved an AUROC of 85.65, with a TPR of 63.22% at FPR=5% and 48.17% at FPR=1%. Without adversarial attacks, it achieved an AUROC of 89.44, with a TPR of 70.74% at FPR=5% and 54.87% at FPR=1%.
If all looks well, a maintainer will come by soon to merge this PR and your entry/entries will appear on the leaderboard. If you need to make any changes, feel free to push new commits to this PR. Thanks for submitting to RAID!
Hey @Gwyn9 would you like me to merge this into RAID?