Return only fragment of page
Had the idea of parsing a page and pulling out only a specific comment, and wondered how that would work (assuming the comment isn't posted from somewhere else). The idea would be to give a URL that has a fragment, and the result items would contain only what is found at that id and below.
Would have to look at how exactly this would work; it would likely still need the whole page for rels, base, and such.
I chose to do this as part of my Microformats-consuming code, XRay, rather than at the parser level. XRay first parses the HTML document to extract the node matching the fragment, then passes that node's HTML to the parser.
Doesn't this break things like
Not sure what you mean by "things like tags". Here's what it does: https://github.com/aaronpk/XRay/blob/master/lib/XRay/Formats/HTML.php#L82
Basically, if a fragment is included, it runs $doc->saveHTML on the element with that ID and replaces the HTML that it fetched with the HTML of that element.
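For anyone who wants to try the same idea outside PHP, here is a rough Python sketch of that step. It is not XRay's code: `FragmentExtractor` and `extract_fragment` are made-up names, it assumes reasonably well-nested markup, and falling back to the full document when the id is missing is my assumption rather than something stated in this thread.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag

# Elements that never have a closing tag, so they must not change nesting depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class FragmentExtractor(HTMLParser):
    """Collects the serialized HTML of the first element whose id matches."""

    def __init__(self, target_id):
        # Keep character references raw so the output mirrors the input.
        super().__init__(convert_charrefs=False)
        self.target_id = target_id
        self.depth = 0      # nesting depth inside the target; 0 = not capturing
        self.done = False   # True once the target element has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0 and dict(attrs).get("id") != self.target_id:
            return
        self.parts.append(self.get_starttag_text())
        if tag not in VOID:
            self.depth += 1
        elif self.depth == 0:
            self.done = True  # the target itself was a void element

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: copy it without touching depth.
        if self.done:
            return
        if self.depth > 0:
            self.parts.append(self.get_starttag_text())
        elif dict(attrs).get("id") == self.target_id:
            self.parts.append(self.get_starttag_text())
            self.done = True

    def handle_endtag(self, tag):
        if self.depth == 0 or tag in VOID:
            return
        self.parts.append(f"</{tag}>")
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth > 0:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth > 0:
            self.parts.append(f"&#{name};")

    def handle_comment(self, data):
        if self.depth > 0:
            self.parts.append(f"<!--{data}-->")


def extract_fragment(html, url):
    """Return only the element named by the URL fragment, else the full HTML."""
    fragment = urldefrag(url).fragment
    if not fragment:
        return html
    parser = FragmentExtractor(fragment)
    parser.feed(html)
    parser.close()
    # Assumption: if no element has that id, keep the whole document.
    return "".join(parser.parts) if parser.parts else html
```

The returned string would then be handed to the Microformats parser in place of the fetched page. Note that anything outside the matched element, including the document head, is gone at that point.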
lol... well then, GitHub processes this as HTML... things like <base> tags
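One possible way around the <base> concern, sketched in Python: read the base URL from the full document before trimming it to the fragment, then hand that URL to the parser as the base for relative links. This is only an illustration; `BaseHrefFinder` and `effective_base_url` are hypothetical names, and nothing in this thread says XRay works this way.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class BaseHrefFinder(HTMLParser):
    """Remembers the href of the first <base> element that has one."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        # Also reached for <base ... /> via the default handle_startendtag.
        if tag == "base" and self.href is None:
            href = dict(attrs).get("href")
            if href:
                self.href = href


def effective_base_url(full_html, page_url):
    """Resolve the document's <base href> against the URL it was fetched from."""
    finder = BaseHrefFinder()
    finder.feed(full_html)
    finder.close()
    return urljoin(page_url, finder.href) if finder.href else page_url


# Usage: compute the base from the whole page, then use it when resolving
# relative URLs found in the trimmed fragment.
full_page = ('<html><head><base href="/blog/"></head>'
             '<body><div id="c2"><a href="post-1">reply</a></div></body></html>')
base = effective_base_url(full_page, "https://example.com/2017/notes#c2")
resolved = urljoin(base, "post-1")
```

Without this step, a relative link inside the fragment would resolve against the page URL instead of the declared base, which is the kind of breakage being asked about.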