
[Bug]: <title>❌ Errors occurred during the pipeline run, see logs for more details.

Open shreyn07 opened this issue 1 year ago • 5 comments

Describe the bug

    11:17:11,575 graphrag.index.run ERROR error running workflow create_final_community_reports
    Traceback (most recent call last):
      File "C:\Users\shrnema\AppData\Local\Programs\Python\Python311\Lib\site-packages\graphrag\index\run.py", line 323, in run_pipeline
        result = await workflow.run(context, callbacks)
      File "C:\Users\shrnema\AppData\Local\Programs\Python\Python311\Lib\site-packages\datashaper\workflow\workflow.py", line 369, in run
        timing = await self._execute_verb(node, context, callbacks)
      File "C:\Users\shrnema\AppData\Local\Programs\Python\Python311\Lib\site-packages\datashaper\workflow\workflow.py", line 410, in _execute_verb
        result = node.verb.func(**verb_args)
      File "C:\Users\shrnema\AppData\Local\Programs\Python\Python311\Lib\site-packages\datashaper\engine\verbs\window.py", line 73, in window
        window = __window_function_map[window_operation](input_table[column])
      File "C:\Users\shrnema\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\frame.py", line 4102, in __getitem__
        indexer = self.columns.get_loc(key)
      File "C:\Users\shrnema\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexes\range.py", line 417, in get_loc
        raise KeyError(key)
    KeyError: 'community'
    11:17:11,578 graphrag.index.reporting.file_workflow_callbacks INFO Error running pipeline! details=None

### Steps to reproduce

_No response_

### Expected Behavior

_No response_

### GraphRAG Config Used

_No response_

### Logs and screenshots

_No response_

### Additional Information

- GraphRAG Version:
- Operating System:
- Python Version:
- Related Issues:

shreyn07 avatar Jul 10 '24 10:07 shreyn07
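A note for readers interpreting the traceback in the issue body: the last frame is in pandas' `indexes/range.py`, which suggests the table's columns were a plain `RangeIndex`, as happens when a DataFrame has no named columns at all. The sketch below assumes the table reaching the `window` verb was empty; that is an assumption, not something the log confirms. The helper name `lookup_column` is hypothetical and only serves the illustration.

```python
import pandas as pd


def lookup_column(table: pd.DataFrame, column: str) -> str:
    """Return a short description of what happens when `column` is read."""
    try:
        table[column]
    except KeyError as exc:
        return f"KeyError: {exc}"
    return "ok"


# Hypothetical stand-in for an upstream step that produced no rows or columns.
empty_table = pd.DataFrame()
# A table that does carry the column, for contrast.
populated_table = pd.DataFrame({"community": [0, 0, 1]})

print(lookup_column(empty_table, "community"))
print(lookup_column(populated_table, "community"))
```

If this reading is right, the `KeyError` is a symptom: the step to look at is whichever earlier workflow produced an empty result.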

The same issue, and here is the logs:

    {'data': 'Error running pipeline!',
     'details': 'null',
     'source': "'community'",
     'stack': 'Traceback (most recent call last):\n'
              ' File '
              '"/root/miniconda3/envs/graphrag/lib/python3.11/site-packages/graphrag/index/run.py", '
              'line 323, in run_pipeline\n'
              ' result = await workflow.run(context, callbacks)\n'
              ' ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n'
              ' File '
              '"/root/miniconda3/envs/graphrag/lib/python3.11/site-packages/datashaper/workflow/workflow.py", '
              'line 369, in run\n'
              ' timing = await self._execute_verb(node, context, callbacks)\n'
              ' ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n'
              ' File '
              '"/root/miniconda3/envs/graphrag/lib/python3.11/site-packages/datashaper/workflow/workflow.py", '
              'line 410, in _execute_verb\n'
              ' result = node.verb.func(**verb_args)\n'
              ' ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n'
              ' File '
              '"/root/miniconda3/envs/graphrag/lib/python3.11/site-packages/datashaper/engine/verbs/window.py", '
              'line 73, in window\n'
              ' window = '
              '__window_function_map[window_operation](input_table[column])\n'
              ' '
              '~~~~~~~~~~~^^^^^^^^\n'
              ' File '
              '"/root/miniconda3/envs/graphrag/lib/python3.11/site-packages/pandas/core/frame.py", '
              'line 4102, in __getitem__\n'
              ' indexer = self.columns.get_loc(key)\n'
              ' ^^^^^^^^^^^^^^^^^^^^^^^^^\n'
              ' File '
              '"/root/miniconda3/envs/graphrag/lib/python3.11/site-packages/pandas/core/indexes/range.py", '
              'line 417, in get_loc\n'
              ' raise KeyError(key)\n'
              "KeyError: 'community'\n",
     'type': 'error'}

cdg1921 avatar Jul 12 '24 09:07 cdg1921

same issue:

    [226 rows x 5 columns]
    🚀 create_base_extracted_entities
    entity_graph
    0 <graphml xmlns="http://graphml.graphdrawing.or...
    🚀 create_summarized_entities
    entity_graph
    0 <graphml xmlns="http://graphml.graphdrawing.or...
    ❌ create_base_entity_graph
    None
    ⠧ GraphRAG Indexer
    ├── Loading Input (InputFileType.text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
    ├── create_base_text_units
    ├── create_base_extracted_entities
    ├── create_summarized_entities
    └── create_base_entity_graph
    ❌ Errors occurred during the pipeline run, see logs for more details.

SeanFeng91 avatar Jul 15 '24 08:07 SeanFeng91

same error

    {"type": "error", "data": "Error executing verb \"window\" in create_final_community_reports: 'community'", "stack": "Traceback (most recent call last):\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/datashaper/workflow/workflow.py\", line 410, in _execute_verb\n result = node.verb.func(**verb_args)\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/datashaper/engine/verbs/window.py\", line 73, in window\n window = __window_function_map[window_operation](input_table[column])\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/frame.py\", line 4102, in __getitem__\n indexer = self.columns.get_loc(key)\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/indexes/range.py\", line 417, in get_loc\n raise KeyError(key)\nKeyError: 'community'\n", "source": "'community'", "details": null}
    {"type": "error", "data": "Error running pipeline!", "stack": "Traceback (most recent call last):\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/graphrag/index/run.py\", line 323, in run_pipeline\n result = await workflow.run(context, callbacks)\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/datashaper/workflow/workflow.py\", line 369, in run\n timing = await self._execute_verb(node, context, callbacks)\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/datashaper/workflow/workflow.py\", line 410, in _execute_verb\n result = node.verb.func(**verb_args)\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/datashaper/engine/verbs/window.py\", line 73, in window\n window = __window_function_map[window_operation](input_table[column])\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/frame.py\", line 4102, in __getitem__\n indexer = self.columns.get_loc(key)\n File \"/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/indexes/range.py\", line 417, in get_loc\n raise KeyError(key)\nKeyError: 'community'\n", "source": "'community'", "details": null}

ricardowu1112 avatar Jul 15 '24 08:07 ricardowu1112
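The log entries quoted in this thread are JSON objects, one per line, with `type`, `data`, `source` and `stack` fields. Since the final "Error running pipeline!" entry only repeats the failure, the earlier entries tend to be the informative ones. A minimal sketch for listing them in order, assuming the log text has already been read into a string (the function name `error_messages` and the shortened sample are hypothetical):

```python
import json


def error_messages(log_text: str) -> list[str]:
    """Return the "data" field of every error entry, in file order."""
    messages = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not standalone JSON objects
        if isinstance(entry, dict) and entry.get("type") == "error":
            messages.append(entry.get("data", "<no data>"))
    return messages


# Shortened sample modelled on the entries quoted in this thread.
sample_log = "\n".join([
    json.dumps({"type": "error",
                "data": "Error executing verb \"window\" in create_final_community_reports: 'community'",
                "source": "'community'", "details": None}),
    json.dumps({"type": "error", "data": "Error running pipeline!",
                "source": "'community'", "details": None}),
    "this line is not JSON",
])

for position, message in enumerate(error_messages(sample_log), start=1):
    print(position, message)
```

Reading the first entry rather than the last is a heuristic, not a guarantee; a run can log unrelated warnings before the real failure.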

    File "/opt/anaconda3/envs/graphrag/lib/python3.11/site-packages/pandas/core/indexes/range.py", line 417, in get_loc
        raise KeyError(key)
    KeyError: 'community'
    00:53:41,724 graphrag.index.reporting.file_workflow_callbacks INFO Error running pipeline! details=None

Same error. Is there any support for this?

amitguptadumka avatar Jul 15 '24 19:07 amitguptadumka

Are you all using Azure or Colab?

ayushjadia avatar Jul 16 '24 07:07 ayushjadia

same error

sadimoodi avatar Jul 17 '24 02:07 sadimoodi

In settings.yml, comment out the `model_supports_json` line and it will work.

shreyn07 avatar Jul 17 '24 03:07 shreyn07

> In settings.yml, comment out the `model_supports_json` line and it will work.

Change it to what? To False? I already did that and it doesn't work.

sadimoodi avatar Jul 17 '24 03:07 sadimoodi

Just comment out that line.

shreyn07 avatar Jul 17 '24 03:07 shreyn07
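For anyone unsure what "comment out that line" means in practice, here is a minimal sketch. It assumes the setting is spelled `model_supports_json` and sits under an `llm:` block; the excerpt below is a hypothetical fragment, not a full settings file, and the helper `comment_out_key` is made up for this illustration. Note that several later replies report the change did not fix the error for them.

```python
import re


def comment_out_key(yaml_text: str, key: str) -> str:
    """Prefix every uncommented `key:` line with '# ', keeping its indentation."""
    pattern = re.compile(rf"^([ \t]*)({re.escape(key)}[ \t]*:.*)$", re.MULTILINE)
    return pattern.sub(r"\1# \2", yaml_text)


# Hypothetical excerpt; a real settings file contains many more keys.
settings = """llm:
  type: openai_chat
  model_supports_json: true
"""

updated = comment_out_key(settings, "model_supports_json")
print(updated)
```

Running the function a second time leaves the text unchanged, because a line that already starts with `#` no longer matches the pattern.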

It's not working, @shreyn07. Can you paste a code snippet?

amitguptadumka avatar Jul 19 '24 07:07 amitguptadumka

same error, please save me. TAT

ghwang1999 avatar Jul 19 '24 07:07 ghwang1999

cmd:

    return bound(*args, **kwds)
    🚀 create_base_text_units
                                       id  ...  n_tokens
    0    39ac36c36504c6e966e37cefb41d1168  ...       100
    1    0cc4a30f79ae4bdbddd35f6ca8f7fe55  ...       100
    2    6b56a4b6d0f57039349bcc686254e2d2  ...       100
    3    19d681644edc65f5961e91f3ea638f96  ...       100
    4    deaf6db8c953b126a13115f3a2c56f58  ...       100
    ..                                ...  ...       ...
    215  153069ad0d3e07a803fe9326224fd298  ...       100
    216  c7170c4985a9f8e6066b10826e0bb4cf  ...       100
    217  d8892d092a53a07d780633678e35157a  ...       100
    218  e4266030abf9bc17de8d70905e6a224c  ...        95
    219  c91fcd42640cb8c096e8d23ca110f22d  ...        25

    [660 rows x 5 columns]
    🚀 create_base_extracted_entities
    entity_graph
    0 <graphml xmlns="http://graphml.graphdrawing.or...
    🚀 create_summarized_entities
    entity_graph
    0 <graphml xmlns="http://graphml.graphdrawing.or...
    ❌ create_base_entity_graph
    None
    ⠼ GraphRAG Indexer
    ├── Loading Input (InputFileType.text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
    ├── create_base_text_units
    ├── create_base_extracted_entities
    ├── create_summarized_entities
    └── create_base_entity_graph
    ❌ Errors occurred during the pipeline run, see logs for more details.

logs:

    {"type": "error", "data": "Entity Extraction Error", "stack": "Traceback (most recent call last):\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/index/graph/extractors/graph/graph_extractor.py\", line 118, in __call__\n result = await self._process_document(text, prompt_variables)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/index/graph/extractors/graph/graph_extractor.py\", line 146, in _process_document\n response = await self._llm(\n ^^^^^^^^^^^^^^^^\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/llm/openai/json_parsing_llm.py\", line 34, in __call__\n result = await self._delegate(input, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/llm/openai/openai_token_replacing_llm.py\", line 37, in __call__\n return await self._delegate(input, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/llm/openai/openai_history_tracking_llm.py\", line 33, in __call__\n output = await self._delegate(input, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/llm/base/caching_llm.py\", line 104, in __call__\n result = await self._delegate(input, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/llm/base/rate_limiting_llm.py\", line 177, in __call__\n result, start = await execute_with_retry()\n ^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/llm/base/rate_limiting_llm.py\", line 159, in execute_with_retry\n async for attempt in retryer:\n File \"/opt/anaconda3/envs/ghw/lib/python3.12/site-packages/tenacity/asyncio/__init__.py\", line 166, in __anext__\n do = await self.iter(retry_state=self._retry_state)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/opt/anaconda3/envs/ghw/lib/python3.12/site-packages/tenacity/asyncio/__init__.py\", line 153, in iter\n result = await action(retry_state)\n ^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/opt/anaconda3/envs/ghw/lib/python3.12/site-packages/tenacity/_utils.py\", line 99, in inner\n return call(*args, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^\n File \"/opt/anaconda3/envs/ghw/lib/python3.12/site-packages/tenacity/__init__.py\", line 398, in <lambda>\n self._add_action_func(lambda rs: rs.outcome.result())\n ^^^^^^^^^^^^^^^^^^^\n File \"/opt/anaconda3/envs/ghw/lib/python3.12/concurrent/futures/_base.py\", line 449, in result\n return self.__get_result()\n ^^^^^^^^^^^^^^^^^^^\n File \"/opt/anaconda3/envs/ghw/lib/python3.12/concurrent/futures/_base.py\", line 401, in __get_result\n raise self._exception\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/llm/base/rate_limiting_llm.py\", line 162, in execute_with_retry\n await self._rate_limiter.acquire(input_tokens)\n File \"/Applications/0PFile/ProjectPython/graphrag/graphrag/llm/limiting/tpm_rpm_limiter.py\", line 32, in acquire\n await self._tpm_limiter.acquire(num_tokens)\n File \"/opt/anaconda3/envs/ghw/lib/python3.12/site-packages/aiolimiter/leakybucket.py\", line 95, in acquire\n raise ValueError(\"Can't acquire more than the maximum capacity\")\nValueError: Can't acquire more than the maximum capacity\n", "source": "Can't acquire more than the maximum capacity", "details": {"doc_index": 0, "text": "\ufeffThe Project Gutenberg eBook of A Christmas Carol\n \nThis ebook is for the use of anyone anywhere in the United States and\nmost other parts of the world at no cost and with almost no restrictions\nwhatsoever. You may copy it, give it away or re-use it under the terms\nof the Project Gutenberg License included with this ebook or online\nat www.gutenberg.org. If you are not located in the United States,\nyou will have to check the laws of the country where you are"}}

ghwang1999 avatar Jul 19 '24 07:07 ghwang1999
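The `ValueError: Can't acquire more than the maximum capacity` in the log above is raised from `aiolimiter/leakybucket.py`, reached through the tokens-per-minute limiter in `tpm_rpm_limiter.py`. That message means a single request asked for more tokens than the limiter's whole budget, so waiting could never satisfy it. The sketch below mirrors that check with a hypothetical `TokenBudget` class; it is not aiolimiter's code, and the numbers are made up.

```python
class TokenBudget:
    """Toy stand-in for a tokens-per-minute limiter (hypothetical, not aiolimiter)."""

    def __init__(self, tokens_per_minute: int) -> None:
        self.tokens_per_minute = tokens_per_minute

    def check(self, num_tokens: int) -> None:
        """Raise if one request could never fit, however long it waits."""
        if num_tokens > self.tokens_per_minute:
            raise ValueError("Can't acquire more than the maximum capacity")


def can_send(budget: TokenBudget, num_tokens: int) -> bool:
    """Report whether a single request fits inside the budget at all."""
    try:
        budget.check(num_tokens)
    except ValueError:
        return False
    return True


budget = TokenBudget(tokens_per_minute=1000)  # made-up limit
print(can_send(budget, 300))   # request smaller than the budget
print(can_send(budget, 1500))  # request larger than the whole budget
```

If that is what happened here, the entity extraction step failed before the graph was built, which would be consistent with the later `create_base_entity_graph` failure in the console output. Comparing the configured tokens-per-minute limit with the chunk size would be the first thing to check.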

    {"type": "error", "data": "Error executing verb \"cluster_graph\" in create_base_entity_graph: Columns must be same length as key", "stack": "Traceback (most recent call last):\n File \"/usr/local/lib/python3.12/site-packages/datashaper/workflow/workflow.py\", line 410, in _execute_verb\n result = node.verb.func(**verb_args)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.12/site-packages/graphrag/index/verbs/graph/clustering/cluster_graph.py\", line 102, in cluster_graph\n output_df[[level_to, to]] = pd.DataFrame(\n ~~~~~~~~~^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.12/site-packages/pandas/core/frame.py\", line 4299, in __setitem__\n self._setitem_array(key, value)\n File \"/usr/local/lib/python3.12/site-packages/pandas/core/frame.py\", line 4341, in _setitem_array\n check_key_length(self.columns, key, value)\n File \"/usr/local/lib/python3.12/site-packages/pandas/core/indexers/utils.py\", line 390, in check_key_length\n raise ValueError(\"Columns must be same length as key\")\nValueError: Columns must be same length as key\n", "source": "Columns must be same length as key", "details": null}
    {"type": "error", "data": "Error running pipeline!", "stack": "Traceback (most recent call last):\n File \"/usr/local/lib/python3.12/site-packages/graphrag/index/run.py\", line 323, in run_pipeline\n result = await workflow.run(context, callbacks)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.12/site-packages/datashaper/workflow/workflow.py\", line 369, in run\n timing = await self._execute_verb(node, context, callbacks)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.12/site-packages/datashaper/workflow/workflow.py\", line 410, in _execute_verb\n result = node.verb.func(**verb_args)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.12/site-packages/graphrag/index/verbs/graph/clustering/cluster_graph.py\", line 102, in cluster_graph\n output_df[[level_to, to]] = pd.DataFrame(\n ~~~~~~~~~^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.12/site-packages/pandas/core/frame.py\", line 4299, in __setitem__\n self._setitem_array(key, value)\n File \"/usr/local/lib/python3.12/site-packages/pandas/core/frame.py\", line 4341, in _setitem_array\n check_key_length(self.columns, key, value)\n File \"/usr/local/lib/python3.12/site-packages/pandas/core/indexers/utils.py\", line 390, in check_key_length\n raise ValueError(\"Columns must be same length as key\")\nValueError: Columns must be same length as key\n", "source": "Columns must be same length as key", "details": null}

yakeworld avatar Jul 21 '24 22:07 yakeworld
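On the `Columns must be same length as key` error in the log above: pandas raises it when a list of column names is assigned a DataFrame with a different number of columns. The traceback only shows the start of the statement (`output_df[[level_to, to]] = pd.DataFrame(`), so what follows is a sketch under the assumption that the right-hand side is built from a list of pairs and that this list was empty. The column names and the helper `assign_two_columns` are hypothetical.

```python
import pandas as pd


def assign_two_columns(frame: pd.DataFrame) -> str:
    """Try to unpack the `pairs` column into `level` and `community`."""
    out = frame.copy()
    try:
        out[["level", "community"]] = pd.DataFrame(
            out["pairs"].tolist(), index=out.index
        )
    except ValueError as exc:
        return f"ValueError: {exc}"
    return "ok"


# Hypothetical data: each row holds a (level, community) pair.
with_rows = pd.DataFrame({"pairs": [(0, "a"), (1, "b")]})
# No rows at all, e.g. if clustering produced nothing (an assumption).
without_rows = pd.DataFrame({"pairs": []})

print(assign_two_columns(with_rows))
print(assign_two_columns(without_rows))
```

With no rows, the DataFrame built from an empty list has zero columns, so it cannot be matched against the two target names. Under that assumption the error again points at an earlier step that returned nothing, rather than at the clustering code itself.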

same problem

yurochang avatar Jul 22 '24 05:07 yurochang

Same problem. I could not find the config line to comment out.

armolee avatar Jul 31 '24 14:07 armolee

Same issue:

    [226 rows x 5 columns]
    🚀 create_base_extracted_entities
    entity_graph
    0 <graphml xmlns="http://graphml.graphdrawing.or...
    🚀 create_summarized_entities
    entity_graph
    0 <graphml xmlns="http://graphml.graphdrawing.or...
    ❌ create_base_entity_graph
    None
    ⠧ GraphRAG Indexer
    ├── Loading Input (InputFileType.text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00:00
    ├── create_base_text_units
    ├── create_base_extracted_entities
    ├── create_summarized_entities
    └── create_base_entity_graph
    ❌ Errors occurred during the pipeline run, see logs for more details.

Is there a solution?

night666e avatar Aug 09 '24 03:08 night666e

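> In settings.yml, comment out the `model_supports_json` line and it will work.

https://github.com/microsoft/graphrag/issues/485#issuecomment-2232281213 It looks like this was already true by default. Does it need to be changed to false?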

night666e avatar Aug 09 '24 03:08 night666e

> Just comment out that line.

Not working. Short text files work, but big files fail.

worstkid92 avatar Sep 21 '24 13:09 worstkid92