Liang
Added "Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators" in the 'Capacity Evaluation' section.
Added references to papers on the reliability of LLMs and order-invariance training.
Included a reference concerning the reliability of LLMs as generative search engines; I hope it is relevant :)
Add new paper
### Required prerequisites

- [X] I have read the documentation.
- [X] I have searched the [Issue Tracker](https://github.com/PKU-Alignment/beavertails/issues) and [Discussions](https://github.com/PKU-Alignment/beavertails/discussions) that this hasn't already been reported. (+1 or comment...