ocel interleavings discovery throws TypeError
I followed the example in the documentation for OCEL interleavings discovery (https://pm4py.fit.fraunhofer.de/documentation#object-centric-event-logs), using:
- python 3.11 on windows 11 x64
- pm4py 2.7.11.4
- pandas 2.1.0
import pandas as pd
import pm4py
from pm4py.algo.discovery.ocel.interleavings import algorithm as interleavings_discovery
from pm4py.objects.ocel.util import log_ocel

dataframe1 = pd.read_csv("tests/input_data/interleavings/receipt_even.csv")
dataframe1 = pm4py.format_dataframe(dataframe1)
dataframe2 = pd.read_csv("tests/input_data/interleavings/receipt_odd.csv")
dataframe2 = pm4py.format_dataframe(dataframe2)
case_relations = pd.read_csv("tests/input_data/interleavings/case_relations.csv")
interleavings = interleavings_discovery.apply(dataframe1, dataframe2, case_relations)
# The call below is the one that raises, according to the traceback
ocel = log_ocel.from_interleavings(dataframe1, dataframe2, interleavings)
This gives the following error (full traceback below): TypeError: Passing a set as an indexer is not supported. Use a list instead.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[6], line 11
      8 interleavings = interleavings_discovery.apply(dataframe1, dataframe2, case_relations)
     10 from pm4py.objects.ocel.util import log_ocel
---> 11 ocel = log_ocel.from_interleavings(dataframe1, dataframe2, interleavings)
File c:\Users\malshan\AppData\Local\Programs\Python\Python311\Lib\site-packages\pm4py\objects\ocel\util\log_ocel.py:264, in from_interleavings(df1, df2, interleavings, parameters)
    261 right_index = exec_utils.get_param_value(Parameters.RIGHT_INDEX, parameters, "@@right_index")
    262 direction = exec_utils.get_param_value(Parameters.DIRECTION, parameters, "@@direction")
--> 264 events1 = __get_events_dataframe(df1, activity_key, timestamp_key, case_id_key, case_attribute_prefix,
    265 events_prefix="E1_")
    266 objects1 = __get_objects_dataframe(df1, case_id_key, case_attribute_prefix, target_object_type)
    267 relations1 = __get_relations_from_events(events1, target_object_type)
File c:\Users\malshan\AppData\Local\Programs\Python\Python311\Lib\site-packages\pm4py\objects\ocel\util\log_ocel.py:144, in __get_events_dataframe(df, activity_key, timestamp_key, case_id_key, case_attribute_prefix, events_prefix)
    140 """
    141 Internal method to get the events dataframe out of a traditional log stored as Pandas dataframe
    142 """
    143 columns = {case_id_key}.union(set(x for x in df.columns if not x.startswith(case_attribute_prefix)))
--> 144 df = df[columns]
    145 df = df.rename(columns={activity_key: ocel_constants.DEFAULT_EVENT_ACTIVITY,
    146 timestamp_key: ocel_constants.DEFAULT_EVENT_TIMESTAMP,
    147 case_id_key: ocel_constants.DEFAULT_OBJECT_ID})
    148 df[ocel_constants.DEFAULT_EVENT_ID] = events_prefix + df.index.astype(str)
...
   2696 raise TypeError(
   2697 "Passing a dict as an indexer is not supported. Use a list instead."
   2698 )
The same error is reproduced with pm4py 2.7.11.8.
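The traceback above shows the column selection being built as a set and then passed to `df[...]`, which is what the error message objects to. As a minimal sketch of a possible local workaround (not the pm4py maintainers' fix), the same selection can be built as a list. The helper name `event_columns` and the sample column names are hypothetical and only serve the illustration.

```python
from typing import Iterable, List


def event_columns(all_columns: Iterable[str], case_id_key: str,
                  case_attribute_prefix: str) -> List[str]:
    """Return the case id column plus every non-case-attribute column, as a list.

    Hypothetical helper: it mirrors the selection made on line 143 of the
    traceback, but yields a list (accepted by df[...]) instead of a set.
    """
    selected = [case_id_key]
    for name in all_columns:
        # Skip the case id (already added) and any other case-level attribute.
        if name != case_id_key and not name.startswith(case_attribute_prefix):
            selected.append(name)
    return selected


# Illustrative column names only; they are not taken from the test CSV files.
cols = event_columns(
    ["case:concept:name", "concept:name", "time:timestamp", "case:channel"],
    case_id_key="case:concept:name",
    case_attribute_prefix="case:",
)
print(cols)  # ['case:concept:name', 'concept:name', 'time:timestamp']
```

With a DataFrame, the result would be used as `df = df[event_columns(df.columns, case_id_key, case_attribute_prefix)]`. Wrapping the existing set in `list(...)` would be an even smaller change, at the cost of an arbitrary column order.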
Dear @lmpeiris
At least with the current version of Pandas, this code runs without problems.
Could you update Pandas to 2.2.2 and retry? (pip install -U pandas)
Same result with pandas 2.2.2. Is it similar to this issue (from another project)? https://github.com/aertslab/pycistarget/issues/7
I will also try with Python 3.10 and let you know.
A question: do you have NVIDIA RAPIDS / CUDF installed? CUDF is automatically privileged over Pandas if installed, and does not support sets as indexers.
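One quick way to answer this from the affected environment is to ask Python whether the module can be located, without importing it. This is a small sketch using only the standard library; the helper name `is_installed` is hypothetical, and `cudf` is assumed to be the import name of the CUDF package.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report on the packages relevant to this issue.
for name in ("cudf", "pandas", "pm4py"):
    status = "installed" if is_installed(name) else "not installed"
    print(f"{name}: {status}")
```

Running this in the same interpreter as the notebook matters, since a different Python installation may have a different set of packages.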
I checked: they are not installed. Also, the laptop does not have NVIDIA hardware. I'll be able to test this on Python 3.10 in 8 hours and will post the result.
It works fine on Python 3.10, Windows 11 x64 (on a different machine), using the same pandas and pm4py versions.
Although I was not able to reproduce your issue, I have addressed some of the possible root causes. These changes will be part of the next intermediate release of pm4py.