Html Parser: `Invalid css selector syntax` exception thrown for css pseudo-classes

Open EverybodyKurts opened this issue 4 years ago • 2 comments

I'm trying to select the first paragraph in a group of sibling paragraphs. According to the MDN nth-child() entry, I should be able to do this using :nth-child().

However, this results in a syntax error when using HtmlDocument.CssSelect:

> node.CssSelect("p:nth-child(1)");;
System.Exception: Invalid css selector syntax (char ':' at offset 0)
   at FSharp.Data.HtmlCssSelectors.tokenize'@104(CssSelectorTokenizer this, FSharpList`1 acc, FSharpList`1 sourceChars)
   at FSharp.Data.HtmlCssSelectors.CssSelectorTokenizer.tokenize()
   at FSharp.Data.HtmlCssSelectors.CssSelectorTokenizer.Tokenize(String pCssSelector)
   at FSharp.Data.HtmlNodeModule.Select(IEnumerable`1 nodes, String selector)
   at <StartupCode$FSI_0029>.$FSI_0029.main@()

EverybodyKurts, May 10 '21 00:05

I see that http://fsprojects.github.io/FSharp.Data/library/HtmlCssSelectors.html#Implemented-and-missing-features lists this as a todo. Perhaps a NotImplementedException could be thrown instead of a general System.Exception?
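
Until the pseudo-class is supported, the same selection can be approximated by walking the tree by hand. The following is only a sketch of a possible workaround, not something FSharp.Data provides: `firstChildParagraphs` is a hypothetical helper name, and it assumes that non-element nodes (text, comments) report an empty name.

```fsharp
// In a script, reference the package first: #r "nuget: FSharp.Data"
open FSharp.Data

// Hypothetical helper (not part of FSharp.Data): emulates "p:nth-child(1)"
// by returning every <p> that is the first element child of its parent.
let firstChildParagraphs (root: HtmlNode) : HtmlNode list =
    // Candidate parents: the root itself plus all of its descendants.
    Seq.append [ root ] (root.Descendants())
    |> Seq.choose (fun parent ->
        parent.Elements()
        // Assumption: text and comment nodes have an empty name, so this
        // keeps element children only, as :nth-child counts elements.
        |> Seq.filter (fun child -> child.Name() <> "")
        |> Seq.tryHead)
    // Keep the first element child only when it is a paragraph.
    |> Seq.filter (fun child -> child.HasName "p")
    |> Seq.toList
```

Given a parsed document, it could be called as `firstChildParagraphs (doc.Body())`. The helper only covers the `nth-child(1)` case; other indices would need `Seq.tryItem` in place of `Seq.tryHead`.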

EverybodyKurts, May 10 '21 11:05

For future reference (mainly my own), the tokenizing happens in this function:

https://github.com/fsprojects/FSharp.Data/blob/main/src/Html/HtmlCssSelectors.fs#L103

EverybodyKurts, May 31 '21 15:05