
Fix reversed branches in parsnip's `scan`

Open ArturGajowy opened this issue 4 years ago • 0 comments

Hi!

I've noticed that the conditions passed to `while` and `till` seem to need to be inverted to get the expected result, but then in some cases parsing would never stop until it hit a SEGFAULT.

This PR removes the need to invert the conditions in `while`, `till` and other dependents, and fixes the SEGFAULT. See the added tests:

    it "while"       $ parse (while ('A' ==)) "AAB"   `shouldBe` Right "AA"
    it "while edge"  $ parse (while ('B' ==)) "AAB"   `shouldBe` Right ""
    it "while NUL"   $ parse (while ('A' ==)) "A\0AB" `shouldBe` Right "A"
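
The behaviour these tests pin down can be sketched with a pure model over `String`. This is not parsnip's implementation: `whileModel` and `tillModel` are hypothetical names, and the model assumes only that the input is treated as NUL-terminated, so a '\0' ends the scan whatever the predicate says. The `till` semantics (consume until the predicate holds) are an assumption as well.

```haskell
module Main (main) where

-- Hypothetical pure model, NOT parsnip's code: consume characters while the
-- predicate holds, and always stop at '\0' (assumed NUL-terminated input).
whileModel :: (Char -> Bool) -> String -> String
whileModel p = takeWhile (\c -> c /= '\0' && p c)

-- Hypothetical counterpart for `till` (assumed semantics): consume until the
-- predicate holds, i.e. `while` with the predicate negated.
tillModel :: (Char -> Bool) -> String -> String
tillModel p = whileModel (not . p)

main :: IO ()
main = do
  print (whileModel ('A' ==) "AAB")    -- "AA"
  print (whileModel ('B' ==) "AAB")    -- ""
  print (whileModel ('A' ==) "A\0AB")  -- "A"
  print (tillModel  ('B' ==) "AAB")    -- "AA"
```

With reversed branches, the first two results would be swapped, which matches the symptom of having to invert the condition by hand.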

To further test the library, and validate my original goal, I've ported most of cassava to use parsnip instead of attoparsec, see: https://github.com/ArturGajowy/parsnip-cassava/commits/master

All tests pass, except those with '\0' in the input (I changed those) and those for the Streaming and Incremental APIs (I didn't port those, so I disabled their tests).

Somehow the benchmarks did not show an improvement, and in some cases showed a regression. Possible reasons I see:

  • I botched the porting :)
  • I need to `toStrict` the input ByteString, as cassava parses lazy ByteStrings
  • cassava's benchmark input is very small (43 rows, 3 columns)
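
To make the second point concrete, here is a minimal sketch of the conversion step, assuming the port flattens the lazy input before parsing. `prepareInput` is a hypothetical name, not something taken from the port. `Data.ByteString.Lazy.toStrict` forces the whole input into memory and, in general, copies it into one contiguous buffer, and that cost is paid before parsing even starts.

```haskell
module Main (main) where

import qualified Data.ByteString       as BS
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy  as BL

-- Hypothetical helper: flatten lazy input for a parser that only accepts
-- strict ByteStrings. This is O(n) in time and memory for chunked input.
prepareInput :: BL.ByteString -> BS.ByteString
prepareInput = BL.toStrict

main :: IO ()
main = do
  let lazyCsv   = BL.fromChunks [BC.pack "a,b,c\r\n", BC.pack "1,2,3\r\n"]
      strictCsv = prepareInput lazyCsv
  print (length (BL.toChunks lazyCsv))  -- 2 chunks before flattening
  print (BS.length strictCsv)           -- 14 bytes in one buffer
```

Timing this step separately from the parse would show whether the conversion accounts for any of the regression.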

For use as a cassava backend, one would further need:

  • a way to fail with an error message
  • maybe a wittier name than parsnip-cassava
  • preferably, an understanding of why the following fails:
    -- this works
    it "while"       $ parse (while ('A' ==)) "AAB"   `shouldBe` Right "AA"
    -- this fails (!), evaluates to `Right ""`
    it "while"       $ parse (while (== 'A')) "AAB"   `shouldBe` Right "AA"
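
As a sanity check on the last point, the two operator sections denote the same function on `Char`, so in plain Haskell they cannot be told apart. The sketch below only confirms that with the Prelude's `takeWhile`. It does not diagnose the failure, which presumably lies in how the expression is compiled or handled by the library rather than in the predicate itself.

```haskell
module Main (main) where

main :: IO ()
main = do
  let allChars = [minBound .. maxBound] :: [Char]
  -- Both sections agree on every Char value.
  print (all (\c -> ('A' ==) c == (== 'A') c) allChars)
  -- The Prelude's takeWhile gives the same result for either spelling.
  print (takeWhile ('A' ==) "AAB" == takeWhile (== 'A') "AAB")
```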

I'd love to hear your thoughts :) Hope the PR is helpful! :)

ArturGajowy, Jun 08 '21 13:06