Automated Parser Generation and Serialization

Open dllliu opened this issue 1 year ago • 2 comments

  • Package the parser, serialization, and metrics work into a SerializedParserMetrics class
  • Preprocessing/clustering of building point labels with DBSCAN
  • Parsers are run via dynamic module loading at runtime to obtain the emitted tokens
  • Support for multiple abbreviation lists (tools to merge/sort abbreviations)
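
As a rough illustration of the dynamic-loading item above, a generated parser file can be imported at runtime with the standard-library `importlib` and then called to collect its tokens. This is only a sketch under assumptions: the file name `generated_parser.py`, the `parse` function, and the helper `load_parser_module` are hypothetical stand-ins, not the names used in this PR.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_parser_module(path, module_name="generated_parser"):
    """Import a Python source file as a module at runtime (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load a module from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module


# Stand-in for a serialized parser; the real generated code will differ.
PARSER_SOURCE = '''
def parse(label):
    """Split a point label on '-' and '_' and return the non-empty tokens."""
    return [t for t in label.replace("_", "-").split("-") if t]
'''

with tempfile.TemporaryDirectory() as tmp:
    parser_path = Path(tmp) / "generated_parser.py"
    parser_path.write_text(PARSER_SOURCE)
    parser = load_parser_module(parser_path)
    tokens = parser.parse("AHU-1_SAT")

print(tokens)
```

Loading from a file path rather than by import name avoids having to put the generated file on `sys.path`. Note that `exec_module` executes whatever the file contains, so this only makes sense for parser files the tool wrote itself.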

Checklist

  • [x] Example notebook showcasing usage
  • [x] Unit tests for SerializedParserMetrics class
  • [x] Detailed documentation for relevant functions and the SerializedParserMetrics class
  • [x] Exception handling for file I/O errors, too few points to cluster, and invalid LLM output
  • [x] Parser and Serializer work has been combined
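
The three failure cases named in the exception-handling item could be guarded roughly as follows. This is a sketch, not the code in this PR: the exception classes, the function names, and the assumption that the LLM replies with a JSON list of strings are all hypothetical.

```python
import json


class NotEnoughPointsError(ValueError):
    """Hypothetical error: too few point labels to form a cluster."""


class InvalidLLMOutputError(ValueError):
    """Hypothetical error: the LLM reply is not the expected JSON."""


def check_enough_points(labels, min_samples=2):
    """Fail early instead of clustering fewer labels than min_samples."""
    if len(labels) < min_samples:
        raise NotEnoughPointsError(
            f"need at least {min_samples} point labels, got {len(labels)}"
        )
    return labels


def parse_llm_tokens(raw):
    """Decode an LLM reply that is assumed to be a JSON list of strings."""
    try:
        tokens = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise InvalidLLMOutputError(f"LLM output is not valid JSON: {exc}") from exc
    if not isinstance(tokens, list) or not all(isinstance(t, str) for t in tokens):
        raise InvalidLLMOutputError("expected a JSON list of strings")
    return tokens


def read_labels(path):
    """Read one point label per line; re-raise I/O failures with the path."""
    try:
        with open(path, encoding="utf-8") as handle:
            return [line.strip() for line in handle if line.strip()]
    except OSError as exc:
        raise OSError(f"could not read point labels from {path}: {exc}") from exc
```

Dedicated exception types let callers tell a clustering problem apart from a malformed LLM reply without matching on message text.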

Notes

  • scikit-learn requires Python 3.9 or later
  • Other added packages are: langchain, langchain-community, pyenchant, scikit-learn
  • Running all unit tests will take longer (the SerializedParserMetrics class takes time to populate)

dllliu, Jul 09 '24 20:07

Hey @dllliu and @TShapinsky -- does this incorporate changes from any outstanding PRs? I want to make sure my review sticks to @dllliu's code

gtfierro, Jul 10 '24 21:07

There's a cherry-picked commit from the create parser UI branch that fixes the serialization. Besides that, no.

TShapinsky, Jul 11 '24 00:07