feat: Add llms.txt support for AI-friendly documentation
Fixes #8117
Summary
Adds `llms.txt` and `llms-full.txt` generation following the llms.txt specification, so AI models can find and read the current React documentation in a structured, machine-friendly form.
What's Included
Two file formats generated at build time:
- llms.txt (13KB): Simple hierarchical link index covering 167 documentation pages
- llms-full.txt (2.7MB): Complete embedded documentation with full markdown content
Both files are automatically generated from:
- Sidebar configurations (`src/sidebarLearn.json`, `src/sidebarReference.json`)
- Markdown content in `src/content/`
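For reviewers, a minimal sketch of how a sidebar tree can be flattened into the link index. This is not the code in `scripts/generate-llms-txt.js`; the `{title, path, routes}` node shape, the `BASE_URL` constant, and the helper names `collectLinks` and `buildIndex` are assumptions made for illustration.

```javascript
// Hypothetical sketch: turn a sidebar tree into an llms.txt-style link index.
// The {title, path, routes} shape and BASE_URL are assumptions for illustration.
const BASE_URL = 'https://react.dev';

// Walk the routes recursively and emit one markdown list item per page.
function collectLinks(routes, depth = 0, lines = []) {
  for (const route of routes) {
    if (route.title && route.path) {
      const indent = '  '.repeat(depth);
      lines.push(`${indent}- [${route.title}](${BASE_URL}${route.path})`);
    }
    if (Array.isArray(route.routes)) {
      // Only indent children when the parent itself produced a list item.
      collectLinks(route.routes, route.path ? depth + 1 : depth, lines);
    }
  }
  return lines;
}

// Build the whole file: an H1 title, then one H2 section per sidebar.
function buildIndex(siteTitle, sections) {
  const out = [`# ${siteTitle}`, ''];
  for (const section of sections) {
    out.push(`## ${section.title}`, '');
    out.push(...collectLinks(section.routes));
    out.push('');
  }
  return out.join('\n');
}

// Sample data standing in for a parsed sidebar JSON file.
const sampleLearn = {
  title: 'Learn React',
  routes: [
    {
      title: 'Quick Start',
      path: '/learn',
      routes: [{title: 'Thinking in React', path: '/learn/thinking-in-react'}],
    },
  ],
};

const indexText = buildIndex('React Documentation', [sampleLearn]);
console.log(indexText);
```

Nested routes become indented list items, which is what keeps the index hierarchical while staying small.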
Implementation Details
- Created `scripts/generate-llms-txt.js` following the same patterns as `scripts/generateRss.js`
- Integrated into build pipeline via `package.json` (runs automatically on `yarn build`)
- Added standalone `yarn llms` command for manual regeneration
- Uses existing `gray-matter` dependency for frontmatter extraction
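A rough sketch of how the full embedded format could be assembled once frontmatter has been split off. Each page object mirrors the `{data, content}` result that `gray-matter` returns for a markdown string; the `url` field, the `---` separator, and the `buildFullText` helper are assumptions for illustration, not necessarily what the script does.

```javascript
// Hypothetical sketch of assembling the full embedded format.
// Each page mirrors what gray-matter returns for a markdown file
// ({data, content}); the `url` field and the separator are assumptions.
function buildFullText(siteTitle, pages) {
  const parts = [`# ${siteTitle}`];
  for (const page of pages) {
    // Fall back to the URL when the frontmatter has no title.
    const title = page.data.title || page.url;
    parts.push(`## ${title}\n\nSource: ${page.url}\n\n${page.content.trim()}`);
  }
  return parts.join('\n\n---\n\n') + '\n';
}

// Sample data; in a real script these would come from parsing files
// under src/content/ with gray-matter.
const samplePages = [
  {
    url: 'https://react.dev/learn',
    data: {title: 'Quick Start'},
    content: '\nSample body text for the first page.\n',
  },
  {
    url: 'https://react.dev/reference/react/useState',
    data: {},
    content: 'Sample body text for the second page.\n',
  },
];

const fullText = buildFullText('React Documentation', samplePages);
console.log(fullText);
```

Keeping the assembly step as a pure function over already-parsed pages makes it easy to change the per-page layout later without touching file discovery.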
Testing
Verified locally that:
- Both files generate successfully with correct content structure
- Files are served correctly at `/llms.txt` and `/llms-full.txt`
- Generator script follows repository code style conventions
- Build integration works without breaking existing build process
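The structure check above was done by hand. A small smoke check along these lines could automate it; the rules encoded here are a loose reading of the llms.txt format rather than an official validator, and `checkLlmsTxt` is a hypothetical helper that is not part of this PR.

```javascript
// Hypothetical smoke check for a generated llms.txt. The rules below are a
// loose reading of the llms.txt format, not an official validator.
function checkLlmsTxt(text) {
  const problems = [];
  const lines = text.split('\n');
  // The file is expected to open with an H1 title.
  if (!/^# \S/.test(lines[0] || '')) {
    problems.push('first line should be an H1 title');
  }
  // Every list item should be a markdown link to an absolute URL.
  const linkPattern = /^\s*- \[[^\]]+\]\((https?:\/\/[^)\s]+)\)/;
  let linkCount = 0;
  lines.forEach((line, i) => {
    if (/^\s*- /.test(line)) {
      if (linkPattern.test(line)) {
        linkCount += 1;
      } else {
        problems.push(`line ${i + 1}: list item is not a markdown link`);
      }
    }
  });
  if (linkCount === 0) problems.push('no links found');
  return {linkCount, problems};
}

const good = checkLlmsTxt(
  '# React\n\n## Learn\n\n- [Quick Start](https://react.dev/learn)\n'
);
const bad = checkLlmsTxt('React docs\n- Quick Start\n');
console.log(good, bad);
```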
Files Changed
- `scripts/generate-llms-txt.js` - Generator script with Meta copyright header
- `public/llms.txt` - Simple format output
- `public/llms-full.txt` - Full embedded format output
- `package.json` - Build script integration
Size changes
📦 Next.js Bundle Analysis for react-dev
This analysis was generated by the Next.js Bundle Analysis action. 🤖
This PR introduced no changes to the JavaScript bundle! 🙌
CLA signed.
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!
Awesome, thank you!
I'm not sure about the llms-full.txt version. I feel like if we provide this, we really should optimize it for LLM consumption, as no one is going to load the entire set of docs into context? What if we revert that and just go with the llms.txt?
@rickhanlonii
> Awesome, thank you!
>
> I'm not sure about the llms-full.txt version. I feel like if we provide this, we really should optimize it for LLM consumption, as no one is going to load the entire set of docs into context? What if we revert that and just go with the llms.txt?
Having both and letting developers decide which one to feed to their AIs is probably better than not giving them the option to choose. Next.js, for instance, has a very big llms-full.txt too.
Yeah but it's formatted a bit better. I'm not saying we shouldn't ever have it, but it would be better to wait for the full version when we can do more than just dump the markdown files.