So does this need retraining on each new dataset?
This looks interesting! I have been working with GraphRAG for some time now and feel I am getting pretty great results, but I am always looking for more!
It looks like this needs retraining each time for GRPO, which rules out a lot of the use cases we would apply it to (new documents arriving, different corpus types, etc.).
Is there any alternative to this, or some form of general-purpose GRPO that can be applied, or would it need to be trained on each specific dataset?
Thanks
Thank you for your interest and for sharing your experience! In our current setup, for the same task (e.g., same query type and reasoning requirement), updating the knowledge graph with new documents does not require retraining — the training phase mainly shapes the agent’s reasoning pattern, which already has a certain degree of generalization. We typically only retrain when adapting to different task requirements that call for substantially different reward designs.
That said, we agree that a "general-purpose" GRPO model that can flexibly handle varied corpora without task-specific retraining would be valuable, and we plan to explore whether such an all-in-one foundation model is feasible in future work.
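To make the "reward design" point above concrete: GRPO (Group Relative Policy Optimization) scores each sampled response against the other responses in its group, so changing the task mainly means changing the reward function, not the optimization machinery. Below is a minimal sketch of the group-relative advantage computation; the reward values are hypothetical placeholders, and real setups would derive them from task-specific signals (e.g., answer correctness for one query type, citation faithfulness for another).

```python
def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its group's mean and std.

    This is the group-relative baseline at the core of GRPO:
    responses are compared only to siblings sampled for the
    same query, so no separate value model is needed.
    """
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# Hypothetical example: four sampled responses to one query,
# scored by some task-specific reward function.
advs = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Because the advantages are relative within a group, swapping in a different reward function (for a different corpus or query type) changes which reasoning patterns get reinforced, which is why substantially different tasks call for retraining while same-task corpus updates do not.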
My current setup moves between GraphRAG over things like laws and legislation, research papers, general information, and finance-related items.
I think they are different enough to require GRPO training on each to increase effectiveness, but a custom foundation model would be much better for implementation, especially as the workload is very heavily inference-based, and hosting custom-trained models for each domain blows out the value proposition and expenses!
Agree.