pm4py-core

ocel interleavings discovery throws TypeError

Open lmpeiris opened this issue 1 year ago • 6 comments

I followed the example in the documentation for OCEL interleavings discovery (https://pm4py.fit.fraunhofer.de/documentation#object-centric-event-logs), using:

  • python 3.11 on windows 11 x64
  • pm4py 2.7.11.4
  • pandas 2.1.0

import pandas as pd
import pm4py

dataframe1 = pd.read_csv("tests/input_data/interleavings/receipt_even.csv")
dataframe1 = pm4py.format_dataframe(dataframe1)
dataframe2 = pd.read_csv("tests/input_data/interleavings/receipt_odd.csv")
dataframe2 = pm4py.format_dataframe(dataframe2)
case_relations = pd.read_csv("tests/input_data/interleavings/case_relations.csv")

from pm4py.algo.discovery.ocel.interleavings import algorithm as interleavings_discovery
interleavings = interleavings_discovery.apply(dataframe1, dataframe2, case_relations)

# The call below is the one that fails, per the traceback.
from pm4py.objects.ocel.util import log_ocel
ocel = log_ocel.from_interleavings(dataframe1, dataframe2, interleavings)

This gives the following error, with the traceback below: TypeError: Passing a set as an indexer is not supported. Use a list instead.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[6], line 11
      8 interleavings = interleavings_discovery.apply(dataframe1, dataframe2, case_relations)
     10 from pm4py.objects.ocel.util import log_ocel
---> 11 ocel = log_ocel.from_interleavings(dataframe1, dataframe2, interleavings)

File c:\Users\malshan\AppData\Local\Programs\Python\Python311\Lib\site-packages\pm4py\objects\ocel\util\log_ocel.py:264, in from_interleavings(df1, df2, interleavings, parameters)
    261 right_index = exec_utils.get_param_value(Parameters.RIGHT_INDEX, parameters, "@@right_index")
    262 direction = exec_utils.get_param_value(Parameters.DIRECTION, parameters, "@@direction")
--> 264 events1 = __get_events_dataframe(df1, activity_key, timestamp_key, case_id_key, case_attribute_prefix,
    265                                  events_prefix="E1_")
    266 objects1 = __get_objects_dataframe(df1, case_id_key, case_attribute_prefix, target_object_type)
    267 relations1 = __get_relations_from_events(events1, target_object_type)

File c:\Users\malshan\AppData\Local\Programs\Python\Python311\Lib\site-packages\pm4py\objects\ocel\util\log_ocel.py:144, in __get_events_dataframe(df, activity_key, timestamp_key, case_id_key, case_attribute_prefix, events_prefix)
    140 """
    141 Internal method to get the events dataframe out of a traditional log stored as Pandas dataframe
    142 """
    143 columns = {case_id_key}.union(set(x for x in df.columns if not x.startswith(case_attribute_prefix)))
--> 144 df = df[columns]
    145 df = df.rename(columns={activity_key: ocel_constants.DEFAULT_EVENT_ACTIVITY,
    146                         timestamp_key: ocel_constants.DEFAULT_EVENT_TIMESTAMP,
    147                         case_id_key: ocel_constants.DEFAULT_OBJECT_ID})
    148 df[ocel_constants.DEFAULT_EVENT_ID] = events_prefix + df.index.astype(str)
...
   2696     raise TypeError(
   2697         "Passing a dict as an indexer is not supported. Use a list instead."
   2698     )
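
The failing line in the traceback, `df = df[columns]`, indexes a DataFrame with a Python set. As a minimal sketch of one possible workaround, assuming it is enough to pass an ordered list instead, the column selection could look like the following. `select_event_columns` is a hypothetical helper name, not part of pm4py, and the sample column names are illustrative only.

```python
def select_event_columns(all_columns, case_id_key, case_attribute_prefix):
    """Return event-level column names as a list, in their original order.

    Hypothetical helper (not part of pm4py). It builds the same set as the
    line shown in the traceback, then converts it to a list so that it can
    be used as a DataFrame indexer.
    """
    wanted = {case_id_key}.union(
        x for x in all_columns if not x.startswith(case_attribute_prefix)
    )
    # Preserve the input column order; a set has no defined order.
    # Note: a case_id_key that is absent from all_columns is silently skipped.
    return [c for c in all_columns if c in wanted]


# Illustrative column names, not taken from the test CSV files.
columns = select_event_columns(
    ["case:concept:name", "concept:name", "time:timestamp", "case:channel"],
    case_id_key="case:concept:name",
    case_attribute_prefix="case:",
)
print(columns)  # ['case:concept:name', 'concept:name', 'time:timestamp']
```

With a DataFrame `df`, the result would be used as `df[columns]`. Writing `df[list(columns_set)]` should also avoid the TypeError, but leaves the column order unspecified.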

lmpeiris, May 06 '24 13:05

The same error reproduces with pm4py 2.7.11.8.

lmpeiris, May 06 '24 13:05

Dear @lmpeiris,

At least with the current version of pandas, this code runs without problems.

Could you update pandas to 2.2.2 and retry? (pip install -U pandas)

fit-alessandro-berti, May 08 '24 04:05

I get the same error with pandas 2.2.2. Is it similar to this issue from another project? https://github.com/aertslab/pycistarget/issues/7

I will also try Python 3.10 and let you know.

lmpeiris, May 08 '24 07:05

A question: do you have NVIDIA RAPIDS / cuDF installed? If installed, cuDF is automatically preferred over pandas, and it does not support sets as indexers.
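
A quick way to check this on a given machine is to list what is importable. The following is a minimal sketch using only the standard library; it assumes that the import names `cudf`, `pandas`, and `pm4py` match the installed distribution names, and `describe_environment` is a hypothetical helper name.

```python
import sys
from importlib import metadata, util


def describe_environment(packages=("pandas", "pm4py", "cudf")):
    """Map each package name to its installed version, or None if absent."""
    report = {"python": sys.version.split()[0]}
    for name in packages:
        # find_spec returns None when the top-level module cannot be imported.
        if util.find_spec(name) is None:
            report[name] = None
            continue
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Importable, but no distribution metadata under this exact name.
            report[name] = "unknown"
    return report


for name, version in describe_environment().items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

Running this on both the failing and the working machine gives a side-by-side view of the Python, pandas, and pm4py versions, and shows whether cuDF is present.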

fit-alessandro-berti, May 08 '24 07:05

I checked: they are not installed. The laptop also does not have NVIDIA hardware. I will be able to test this with Python 3.10 in 8 hours and will post the result.

lmpeiris, May 08 '24 07:05

It works fine on Python 3.10 on Windows 11 x64 (on a different machine) using the same pandas and pm4py versions.

lmpeiris, May 08 '24 14:05

Although I was not able to reproduce your issue, I have addressed some of the possible root causes. These fixes will be part of the next intermediate release of pm4py.

fit-alessandro-berti, May 13 '24 12:05