Add Support for indexing of large codebases
Pre-submit Checks
- [x] I have searched Warp feature requests and there are no duplicates
- [x] I have searched Warp docs and my feature is not there
Describe the solution you'd like?
The "Codebase Index" feature should be able to index large codebases.
Is your feature request related to a problem? Please describe.
Most of my codebases are very large and the agent can't handle them. Having a vector index would be helpful.
Additional context
Running a local encoding model for indexing is likely required.
Operating system (OS)
Windows
How important is this feature to you?
3
Thanks for this feature request!
To anyone else interested in this feature, please add a 👍 to the original post at the top to signal that you want this feature, and subscribe if you'd like to be notified.
I just created a new project for backing up my Warp dev environment, and the "Codebase Index" states that the codebase is too large.
I would be happy if it could be indexed. However, with a better understanding of the functionality, I might be better able to avoid this in future projects.
Operating system (OS) Linux; Ubuntu 24.04
How important is this feature to you? 3
@sworley @MovGP0 We would appreciate the following information about your larger codebases to help us better understand the use case:
- its depth
- number of files
- type of files it contains
- how it's structured
- etc., anything else you think is important for us to know
Also note that large files that aren't used for coding, such as virtual disk images or ISOs, may affect the ability to index the codebase as well, so we recommend keeping those separate.
I'll give you exact numbers next week. I think the most important part is that build artifacts and packages (which is most of the folder size) need to be ignored, so I'd recommend to respect the .gitignore files in a given Git repository (including ignore files in subfolders).
Further, support for MCP and/or A2A might also mitigate the problem, since users would be able to provide their own indexes.
Some statistics about the codebase I am working on currently:
Languages:
- C# (99.6 %)
- PowerShell (0.2 %)
- Visual Basic (0.2 %)
Lines of code: 1,706,418
Number of files: 23,612
Size: 1.274 GiB
Structure
Depth: up to 6 directory sublevels
Structure: one subfolder per project (i.e., per *.csproj file)
File types
Most Common File Types:
- .cs (C# source files)
- .resx (Resource files)
- .png (PNG images)
- .xml (XML files)
Other Notable File Types:
- .svg (SVG images)
- .csproj (C# project files)
- .bmp (Bitmap images)
- .ps1 (PowerShell scripts)
- .config (Configuration files)
Development & Build Files:
- .nsi/.nsh (NSIS installer files)
- .settings (Settings files)
- .licx (License files)
Documents & Media:
- .txt (Text files)
- .ico (Icon files)
- .jpg (JPEG images)
- .xlsx (Excel files)
- .docx (Word documents)
Other file types
- various CAD file formats (IGES, STP/STEP, DWG, etc.)
[!Note] The codebase is actually bigger, because it's distributed across multiple Git repositories; referencing other indices might be required for the agent to understand the full context.
[!Note] Build artifacts (the /obj and /bin directories) have been excluded from these statistics and size calculations.
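For anyone who wants to report similar numbers, here is a minimal Python sketch that gathers file count, total size, directory depth, and a per-extension breakdown. The function name `codebase_stats` and the `SKIP_DIRS` set are illustrative assumptions: `obj` and `bin` mirror the exclusions noted above, and `.git` is an extra guess. It does not read .gitignore files and is not how Warp counts files.

```python
import os
from collections import Counter

# Directory names to skip. obj/bin mirror the exclusions mentioned in the
# thread; .git is an additional assumption.
SKIP_DIRS = {"obj", "bin", ".git"}


def codebase_stats(root):
    """Return (file_count, total_bytes, max_depth, Counter of extensions)."""
    root = os.path.abspath(root)
    by_ext = Counter()
    file_count = 0
    total_bytes = 0
    max_depth = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        max_depth = max(max_depth, depth)
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # e.g. a broken symlink; leave it out of the totals
            ext = os.path.splitext(name)[1].lower() or "(none)"
            by_ext[ext] += 1
            file_count += 1
            total_bytes += size
    return file_count, total_bytes, max_depth, by_ext
```

Calling `codebase_stats(".")` from the repository root and printing `by_ext.most_common(10)` gives a list of the most common file types comparable to the one above.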
I'm in the same boat as @MovGP0 except:
- 8M lines of code split: Java (5M), JavaScript + TypeScript (3M)
- DB Migration (Flyway): 1925 migrations
- Single repository, not mono
- Atlassian toolchain
- Multiple build artifacts distributed across 20+ servers (in-house, our own DC)
My current approach is to use the Repomix CLI to create a local index and Repomix's MCP API for LLM queries. Unfortunately, that is something Warp does not support yet.
I don't even have a large codebase (I think). Cursor says it's only 475 files after applying .gitignore and .cursorignore, but Warp cannot index my codebase. Is it actually large, or does Warp not respect my .gitignore? An additional ignore file like Cursor's would also be great.
Cursor can index my Unity project, but Warp can't.
I can't use it in my Unity project at present. The total number of files in my Asset folder exceeds 20,000. Even the Turbo plan can't handle this situation. It contains a large number of meta files that should be ignored.
What's missing is the ability to ignore certain paths besides .gitignore.
Also the "Codebase index" settings page could display the total number of files it wants to index, not just "too large".
This! Game projects have a lot of asset files that can't be indexed. I'm not sure whether Warp already ignores binary files, but with Unity, at least, it's very common for these files to be all text.
Also, ignoring certain paths won't be enough: there are meta files alongside the regular files. We need a .warpignore.
There is a .warpindexingignore file that I've successfully used to index a Git repo that was reported as too large. I excluded some directories I didn't care about, which reduced the file count to less than 10k, the limit for the Pro plan.
(Sent by email in reply to Roberto Caldas's comment of Sunday, 13 July 2025, 20:30: https://github.com/warpdotdev/Warp/issues/6586#issuecomment-3067577611)
That didn't work for me. Can you please provide more information? I added the file in the repo root, pressed the sync button in Preferences, and closed and reopened Warp, but the index doesn't seem to update.
What you did should work if you've excluded directories to reduce the file count. Here's what my .warpindexingignore looks like:
ansible/
schema/
web/
infrastructure/
That excludes those directories and brings the file count under the threshold. Make sure the ignore filename and the directory names are spelled correctly.
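To check in advance whether excluding a set of directories would bring a repo under the limit, the following Python sketch estimates the remaining file count. It rests on simplifying assumptions that are not taken from Warp's documentation: entries in the ignore file are plain directory paths relative to the repo root, lines starting with `#` are comments, and `.git` is always skipped. The helper names `load_ignored_dirs` and `count_indexable_files` are hypothetical, and Warp's own matching rules may differ.

```python
import os


def load_ignored_dirs(ignore_file):
    """Read directory entries such as 'ansible/' from an ignore file."""
    ignored = set()
    with open(ignore_file, encoding="utf-8") as fh:
        for line in fh:
            entry = line.strip()
            # Assumption: blank lines and '#' comments carry no entries.
            if entry and not entry.startswith("#"):
                ignored.add(entry.strip("/"))
    return ignored


def count_indexable_files(root, ignored_dirs):
    """Count files under root, skipping ignored directories (root-relative)."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        kept = []
        for d in dirnames:
            rel = d if rel_dir == "." else os.path.join(rel_dir, d)
            # Compare with forward slashes so entries work on Windows too.
            if d != ".git" and rel.replace(os.sep, "/") not in ignored_dirs:
                kept.append(d)
        # Prune in place so os.walk does not descend into ignored directories.
        dirnames[:] = kept
        total += len(filenames)
    return total
```

Comparing the result of `count_indexable_files(".", load_ignored_dirs(".warpindexingignore"))` against the 10k figure mentioned earlier in the thread shows roughly how many more directories would need to be excluded.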
There was a typo in my ignore filename: I was using .warpindexignore instead of .warpindexingignore.
I changed the filename to .warpindexingignore and it works.
Yes, that's one of the documented ignore files.
The "Ignore file" section further down this page lists the options: https://docs.warp.dev/code/codebase-context
For game projects, I think there should be an option to specify only the file types to include. Game engines produce many other kinds of files, and only a few file types actually contain code.
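The include-only idea can be sketched as an extension allowlist. This is a hypothetical illustration in Python, not an existing Warp option: the `INCLUDE_EXTENSIONS` set and the `select_files` helper are made up for the example, and the extensions are a guess at what a Unity-style project might want indexed.

```python
import os

# Hypothetical allowlist for a Unity-style project; adjust as needed.
INCLUDE_EXTENSIONS = frozenset({".cs", ".shader", ".json"})


def select_files(root, include_extensions=INCLUDE_EXTENSIONS):
    """Return sorted root-relative paths whose extension is on the allowlist."""
    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # splitext looks at the last suffix only, so 'Player.cs.meta'
            # has extension '.meta' and is left out.
            if os.path.splitext(name)[1].lower() in include_extensions:
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                selected.append(rel.replace(os.sep, "/"))
    return sorted(selected)
```

Because the filter looks only at each file's final suffix, the meta files that sit next to regular files are excluded without listing any paths, which a directory-based ignore file cannot express.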
The codebase index file limit is too low. Cursor shows a limit of 50,000 files, but in practice it still works with more than 100,000 files.