pm4py-core

[Feature] XES-YAML - XES format extension

Open • ondrisvu opened this issue 1 year ago • 1 comment

Description: This PR introduces XES-YAML, an extension to the supported XES log formats. Backward compatibility with pm4py’s existing XES importer/exporter is preserved. The implementation was developed as part of a Bachelor's thesis titled "More Efficient eXtensible Event Stream-XES-YAML-Format" at TUM.



Key Additions:

  1. XES-YAML Import/Export
    • read_yaml() function to parse XES-YAML files into pm4py’s native XES structure
    • write_yaml() function to serialize data into standards-compliant XES-YAML
  2. Illustrative Example
    • a sample event log in the XES-YAML log format:
      • demonstrates the representation of log, trace, event, and attribute structures
      • consists of snippets from real-life .xes.yaml event logs
  3. Comprehensive Test Suite
    • conversion tests (correctness assurance), covering:
      • XES-XML to XES-YAML
      • XES-YAML to XES-XML
      • These conversion tests check data integrity and format preservation in both directions.
    • roundtrip tests
      • XES-YAML to XES-YAML
      • Roundtrip tests check that a log is the same before and after export.
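
To make the roundtrip idea concrete, here is a minimal sketch using plain pyyaml. It does not use the PR's read_yaml()/write_yaml() functions, whose signatures are not shown here, and the YAML layout below (the `log`, `traces`, and `events` keys) is a hypothetical structure invented for illustration; the actual XES-YAML schema may differ.

```python
import yaml

# Hypothetical XES-YAML layout, invented for illustration only.
# The real schema is defined by the PR and may look different.
SAMPLE = """
log:
  attributes:
    "concept:name": example-log
  traces:
    - attributes:
        "concept:name": case-1
      events:
        - "concept:name": register request
          "time:timestamp": "2024-03-09T09:03:00+00:00"
        - "concept:name": check ticket
          "time:timestamp": "2024-03-09T09:10:00+00:00"
"""


def roundtrip(text):
    """Parse YAML text, serialize it again, and parse the result.

    Returns both parsed structures so the caller can compare them.
    """
    first = yaml.safe_load(text)
    # sort_keys=False keeps the original key order in the output.
    dumped = yaml.safe_dump(first, sort_keys=False)
    second = yaml.safe_load(dumped)
    return first, second


before, after = roundtrip(SAMPLE)
# A roundtrip is consistent if nothing changed between the two parses.
print("roundtrip consistent:", before == after)
```

Timestamps are kept as quoted strings here to keep the comparison simple; a real importer would presumably convert them to datetime values.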

Backward Compatibility Focus:

  • seamless integration with current systems and data pipelines

Used libraries:

  • pyyaml (https://pyyaml.org/wiki/PyYAML)
  • (Optional) pyyaml with LibYAML C bindings (https://pyyaml.org/wiki/LibYAML) - for a faster implementation of the pyyaml parser
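
A common way to treat the LibYAML bindings as optional is to try importing the C-backed classes and fall back to the pure-Python ones. The sketch below shows that pattern; the helper names `load_fast` and `dump_fast` are hypothetical, and whether the PR selects its loader this way is an assumption.

```python
import yaml

try:
    # Available only when pyyaml was built against LibYAML.
    from yaml import CSafeLoader as SafeLoader, CSafeDumper as SafeDumper
except ImportError:
    # Pure-Python fallback: same behavior, slower parsing.
    from yaml import SafeLoader, SafeDumper


def load_fast(text):
    """Parse YAML text with the fastest safe loader available."""
    return yaml.load(text, Loader=SafeLoader)


def dump_fast(data):
    """Serialize data with the fastest safe dumper available."""
    return yaml.dump(data, Dumper=SafeDumper, sort_keys=False)


print("using loader:", SafeLoader.__name__)
```

Callers use `load_fast` and `dump_fast` without needing to know which backend was picked.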

ondrisvu, Mar 09 '24 09:03

Thanks. From what I understand, this contribution is going to be part of a larger set of PRs.

We would need to finalize a "contribution level agreement" before merging, as the contribution is non-trivial.

fit-alessandro-berti, Mar 11 '24 12:03