tests [WIP] Unit tests

This patch constitutes a series of generic unit tests for microformat parsers, using the existing test format. While it is not possible to segregate certain parsing features entirely (notably implied property parsing), I've attempted to keep to the following design philosophy:

- each file tests a single feature or facet of parsing, as much as possible
- types of microformats and names of properties avoid known vocabularies, to emphasize that parsing is generic
- no two microformats in a given file produce the same output, to make output comparison easier
- example.test is used instead of example.com, as the former is guaranteed never to be a real domain
- tests try to be as thorough as practical
- standard features which have experimental alternatives are tested in isolated files so that they may be skipped easily
- under- or un-specified behaviour is tested in isolated files prefixed with tentative- so that it may be skipped easily; these files reflect common behaviour among established implementations where there is majority agreement
- existing tests are left alone, which helps validate that these tests do not contradict them
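
For implementers wiring these files into a test runner, the tentative- convention can be honoured with a simple file-name filter. The following is only a minimal sketch: the function names, the `include_tentative` flag, and the example paths are hypothetical and not part of this patch, and it assumes the runner already has a list of test file paths to work from.

```python
from pathlib import PurePosixPath

# Prefix described in the design notes for under- or un-specified behaviour.
TENTATIVE_PREFIX = "tentative-"


def is_tentative(path: str) -> bool:
    """Return True if the file name itself (not a directory) has the prefix."""
    return PurePosixPath(path).name.startswith(TENTATIVE_PREFIX)


def select_tests(paths, include_tentative=False):
    """Return the paths a runner should execute, sorted for stable ordering."""
    return sorted(p for p in paths if include_tentative or not is_tentative(p))


if __name__ == "__main__":
    # Hypothetical file names, used purely for illustration.
    candidates = [
        "generic/names.html",
        "generic/tentative-names.html",
    ]
    print(select_tests(candidates))
    print(select_tests(candidates, include_tentative=True))
```

Matching on the file name rather than the whole path means a directory that happens to start with tentative- does not cause its contents to be skipped.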

At present the test suite offers only partial coverage. The to-do list is thus:

[x] Microformat type and property name splitting/matching
[x] Basic property parsing
[ ] ID parsing
[ ] textContent parsing
[x] Value class pattern parsing
[ ] Value-title parsing
[ ] Value class date parsing
[x] Implied property parsing
[x] Nested microformat parsing
[ ] Link relation parsing
[ ] Template handling
[ ] Foreign content handling
[ ] URL resolution
[ ] URL normalization
Backcompat processing
- vocabularies
  - [ ] adr
  - [ ] vcard
  - [ ] vevent
  - [ ] hfeed
  - [ ] hentry
  - [ ] geo
  - [ ] hproduct
  - [ ] hrecipe
  - [ ] hresume
  - [ ] hreview
  - [ ] hreview-aggregate
  - [ ] hnews (?)
- [ ] includes
- [ ] multi-roots
- [ ] mixed v2/backcompat
- [ ] mixed v2 implied/backcompat
Experimental features
- [ ] lang parsing
- [ ] alternate textContent parsing
- [ ] Universal date parsing
- [ ] Universal implied date
- [ ] Implied time zone
- [ ] Ignoring Tailwind types
- [ ] srcset parsing

I am posting this while still far from finished to gather feedback early. I expect it will take me quite a while to do everything, but what's here can already be useful to implementers.

Jul 09 '23 02:07 JKingweb

I have no objections to merging the work thus far as a partial test suite, though I should probably write up some draft documentation (more or less what's detailed in the description of this request) first. Is there a preferred format? Plain text, Markdown, HTML, something else?

Jul 18 '23 22:07 JKingweb

Personally I think a markdown file containing basically what is in the PR description would be nice. Linked to from README, I would say, and maybe even at the root level of the repository. But there does not seem to be any precedent.

There are some changelog files, but honestly I have never read them, in part because they are HTML files. That makes it basically a requirement to clone the repo and open them in a browser to read them comfortably. That is why I would prefer Markdown: GitHub has native support for it, and it will be easier to refer to from inside the repo.

Nov 05 '23 10:11 Zegnat

I've been using the https://keepachangelog.com/ format recently on some projects and liking it.

Nov 22 '23 21:11 gRegorLove