Query regarding ms-graphrag.ipynb notebook

Open AnimeshSingh0 opened this issue 1 year ago • 9 comments

I recently came across your blogs repository and noticed that you have implemented the query-focused GraphRAG approach. However, it appears that the implementation is incomplete.

I am currently seeking a comprehensive implementation of this approach and was excited to find your work. Could you please let me know if you had the opportunity to complete it? If so, I would greatly appreciate it if you could share your implementation. If not, any insights into the reasons for leaving it unfinished would be highly valuable.

AnimeshSingh0 avatar Jun 18 '24 16:06 AnimeshSingh0

This implementation is more or less complete; I just didn't have time to write about it yet.

tomasonjo avatar Jun 18 '24 16:06 tomasonjo

Screenshot 2024-06-21 000520 (attached)

When using ThreadPoolExecutor to convert documents into graph documents, it gets stuck partway through. I tried setting a timeout value in future.result() and catching the timeout error, but the error was never raised. I'm not sure why it gets stuck; any help would be appreciated.

AnimeshSingh0 avatar Jun 21 '24 12:06 AnimeshSingh0
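One possible explanation for the "stuck without a timeout error" behavior, sketched below under the assumption that each worker makes a blocking LLM/HTTP call: future.result(timeout=...) only stops the caller from waiting; the worker thread itself keeps running, so a hung network call with no timeout of its own still blocks the executor. Also, before Python 3.11, concurrent.futures.TimeoutError was a different class from the builtin TimeoutError, so catching the builtin one silently misses it. A minimal sketch (convert_chunk is a hypothetical stand-in for the real chunk-to-graph call):

```python
import concurrent.futures
import time

def convert_chunk(chunk):
    # Hypothetical stand-in for the real LLM call that turns a text chunk
    # into a graph document. The real client call should set its own request
    # timeout, or a hung connection keeps the worker thread alive forever.
    time.sleep(0.05)
    return f"graph({chunk})"

chunks = [f"chunk-{i}" for i in range(5)]
results, failed = [], []

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    futures = {pool.submit(convert_chunk, c): c for c in chunks}
    try:
        # as_completed yields futures as they finish; its timeout bounds the
        # TOTAL wait and raises concurrent.futures.TimeoutError if any future
        # is still pending when it expires.
        for future in concurrent.futures.as_completed(futures, timeout=60):
            results.append(future.result())
    except concurrent.futures.TimeoutError:
        # Record which chunks never finished instead of hanging silently.
        failed = [c for f, c in futures.items() if not f.done()]

print(len(results), len(failed))
```

Even with this, the with-block's implicit shutdown() waits for threads that are still running, so a truly hung worker still blocks exit unless the underlying HTTP client has its own timeout configured.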

What documents are you trying to process?

tomasonjo avatar Jun 21 '24 16:06 tomasonjo

The documents are 100-200 pages, and I am trying to extract around 4 to 5 entity types with 10 possible relationships among them.

AnimeshSingh0 avatar Jun 22 '24 16:06 AnimeshSingh0

You need to split the documents into chunks of roughly 1,000-3,000 characters.

tomasonjo avatar Jun 22 '24 16:06 tomasonjo
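The chunking advice above can be sketched with a plain-Python splitter. This is a minimal stand-in for library splitters such as LangChain's RecursiveCharacterTextSplitter, which additionally prefers paragraph and sentence boundaries; the sizes here are illustrative:

```python
def chunk_text(text, chunk_size=2000, overlap=200):
    """Split text into ~chunk_size-character pieces with some overlap,
    so each extraction call to the LLM stays small."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Step forward by chunk_size minus overlap so adjacent chunks
        # share context across the boundary.
        start += chunk_size - overlap
    return chunks

doc = "x" * 5000  # dummy document
print([len(c) for c in chunk_text(doc)])  # -> [2000, 2000, 1400]
```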

I'm already doing that. I tried chunk sizes from 500 to 4,000 characters; it works fine with a smaller number of small documents but gets stuck when I submit all the chunks for processing.

AnimeshSingh0 avatar Jun 22 '24 16:06 AnimeshSingh0

How many chunks? Let me try to reproduce it.

tomasonjo avatar Jun 22 '24 16:06 tomasonjo

Sure: 250 chunks, each 2,000 characters.

AnimeshSingh0 avatar Jun 22 '24 17:06 AnimeshSingh0

Any progress, please?

Sandy4321 avatar Jan 03 '25 18:01 Sandy4321