[WIP] Unit tests
This patch constitutes a series of generic unit tests for microformat parsers, using the existing test format. While it is not possible to segregate certain parsing features entirely (notably implied property parsing), I've attempted to keep to the following design philosophy:
- each file tests a single feature or facet of parsing, as much as possible
- types of microformats and names of properties avoid known vocabularies to emphasize parsing is generic
- no two microformats in a given file produce the same output, to make output comparison easier
- `example.test` is used instead of `example.com`, as the former is guaranteed never to be a real domain
- tests try to be as thorough as practical
- standard features which have experimental alternatives are tested in isolated files so that they may be skipped easily
- under- or un-specified behaviour is tested in isolated files prefixed with `tentative-` so that they may be easily skipped; these files reflect common behaviour among established implementations where there is majority agreement
- leave existing tests alone to validate that these tests do not contradict existing tests
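For implementers wiring these files into a test runner, skipping the tentative tests could look something like the sketch below. The helper name `select_tests`, the `include_tentative` flag and the file names are made up for illustration; only the `tentative-` prefix comes from this patch.

```python
from pathlib import PurePosixPath

TENTATIVE_PREFIX = "tentative-"  # file name prefix described in the list above


def select_tests(paths, include_tentative=False):
    """Return the test files to run, dropping tentative ones by default."""
    selected = []
    for path in paths:
        name = PurePosixPath(path).name  # check the file name, not the directory
        if name.startswith(TENTATIVE_PREFIX) and not include_tentative:
            continue
        selected.append(path)
    return selected


# Hypothetical file names, for illustration only.
files = [
    "properties/basic.html",
    "properties/tentative-whitespace.html",
]
print(select_tests(files))
```

Passing `include_tentative=True` would run everything, which may suit implementations that want to match the majority behaviour.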
At present the test suite offers only partial coverage. The to-do list is thus:
- [x] Microformat type and property name splitting/matching
- [x] Basic property parsing
- [ ] ID parsing
- [ ] `textContent` parsing
- [x] Value class pattern parsing
- [ ] Value-title parsing
- [ ] Value class date parsing
- [x] Implied property parsing
- [x] Nested microformat parsing
- [ ] Link relation parsing
- [ ] Template handling
- [ ] Foreign content handling
- [ ] URL resolution
- [ ] URL normalization
- Backcompat processing
  - vocabularies
    - [ ] adr
    - [ ] vcard
    - [ ] vevent
    - [ ] hfeed
    - [ ] hentry
    - [ ] geo
    - [ ] hproduct
    - [ ] hrecipe
    - [ ] hresume
    - [ ] hreview
    - [ ] hreview-aggregate
    - [ ] hnews (?)
  - [ ] includes
  - [ ] multi-roots
  - [ ] mixed v2/backcompat
  - [ ] mixed v2 implied/backcompat
- Experimental features
  - [ ] `lang` parsing
  - [ ] alternate `textContent` parsing
  - [ ] Universal date parsing
  - [ ] Universal implied date
  - [ ] Implied time zone
  - [ ] Ignoring Tailwind types
  - [ ] `srcset` parsing
I am posting this while still far from finished to gather feedback early. I expect it will take me quite a while to do everything, but what's here can already be useful to implementers.
I have no objections to merging the work thus far as a partial test suite, though I should probably write up some draft documentation (more or less what's detailed in the cover of this request) first. Is there a preferred format? Plain text, markdown, HTML, something else?
Personally I think a markdown file containing basically what is in the PR description would be nice. Linked to from README, I would say, and maybe even at the root level of the repository. But there does not seem to be any precedent.
There are some changelog files, but honestly I have never read them, in part because they are HTML files. That makes it basically a requirement to clone the repo and open them in a browser to read comfortably. That is why I would prefer markdown: GitHub has native support for it, and it will be easier to refer to it inside the repo.
I've been using the https://keepachangelog.com/ format recently on some projects and have been liking it.