Blacklisting titles
This might be a controversial issue, but there is clearly a problem with some titles in the Wikimedia stats. These cases can usually be detected by looking at the mobile/desktop page view proportion: one of the two heavily dominates.
Currently, there is an article [1] that gets a fairly constant 800 desktop hits daily [2], which is enough to put it in the lvwiki top 3 on most days. In the past, a pr0n site article did the same thing on enwiki.
[1] https://lv.wikipedia.org/wiki/Karless_Pud%C5%BEdemons
[2] https://pageviews.toolforge.org/?project=lv.wikipedia.org&platform=all-access&agent=user&redirects=0&range=latest-20&pages=Karless_Pud%C5%BEdemons
Should there be a user-maintained blacklist for hatnote top?
We excluded one page from enwiki based on signs of inauthentic traffic—I think that's the other site you mentioned. See get_data.py#L77. We could add more pages to the list where the traffic looks artificial or the page doesn't belong.
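For illustration only, a per-project exclusion list could look roughly like the sketch below. This is not the code in get_data.py; the names `EXCLUDED_TITLES`, `normalize_title`, and `filter_excluded`, the `"article"` key, and the placeholder title are all assumptions made up for the example.

```python
# Hypothetical exclusion list keyed by language code.
# The title below is a placeholder, not a real exclusion.
EXCLUDED_TITLES = {
    "lv": {"Example_page"},
}


def normalize_title(title):
    """Normalize a title to the underscore form used in page view dumps."""
    return title.strip().replace(" ", "_")


def filter_excluded(articles, lang, excluded=EXCLUDED_TITLES):
    """Return the articles whose titles are not on the list for this language.

    `articles` is assumed to be a list of dicts with an "article" key.
    """
    blocked = excluded.get(lang, set())
    return [a for a in articles if normalize_title(a["article"]) not in blocked]
```

Keeping the list as plain data, separate from the fetching logic, would make it easier for people to review and maintain.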
If we factor it out as a user-maintained blacklist, I think it would be good to explain our exclusion methodology for the sake of transparency. I'd also love it if there were a way to automatically flag (and possibly exclude?) pages based on signs of bot traffic, like an unusual pattern of mobile/desktop traffic.
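A rough sketch of what such a flag might look like, assuming per-platform daily counts are already available for each page (for example, from the desktop and mobile access types in the page view data). The function names and the thresholds (`min_views=500`, `max_share=0.95`) are made up for the example and would need tuning against real traffic.

```python
def desktop_share(desktop_views, mobile_views):
    """Fraction of views that came from desktop, or None if there were no views."""
    total = desktop_views + mobile_views
    if total == 0:
        return None
    return desktop_views / total


def is_suspicious(desktop_views, mobile_views, min_views=500, max_share=0.95):
    """Flag a page when one platform accounts for nearly all of its traffic.

    Pages below `min_views` are ignored, since small counts are often
    lopsided by chance. Both thresholds are illustrative assumptions.
    """
    share = desktop_share(desktop_views, mobile_views)
    if share is None or desktop_views + mobile_views < min_views:
        return False
    return share >= max_share or share <= 1 - max_share
```

A flag like this would probably work best as a prompt for human review rather than an automatic exclusion, since some legitimate pages may also be heavily skewed toward one platform.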