bbot
404 pages are considered "up" in some cases
If we get a url_unverified event for a page and it turns out to be a 404, httpx normally won't report it as a URL event.
However, if we instead get a 301, httpx will report it as live.
Usually this is desirable. However, there are cases where we hit the http version of a site and are redirected to https regardless of which page is requested. From the perspective of the httpx module the URL returned a 301, but the request ultimately ended on a 404.
One possible solution is to base that decision on the status code of the last response in the redirect chain, rather than the first.
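A minimal sketch of that idea, assuming the redirect chain is available as an ordered list of status codes. The function name `is_live` and the list-of-integers input are hypothetical and only illustrate the decision; they are not the httpx module's actual interface.

```python
from typing import Sequence


def is_live(status_chain: Sequence[int]) -> bool:
    """Decide liveness from the final status code in a redirect chain.

    `status_chain` is a hypothetical input: the status codes of each
    response in the order they were received, e.g. [301, 404].
    """
    if not status_chain:
        # No response at all, so nothing to report as live.
        return False
    # Only the last hop matters: a 301 that ends on a 404 is still a 404.
    final_status = status_chain[-1]
    return final_status != 404


# http -> https redirect that ultimately lands on a 404
print(is_live([301, 404]))
# redirect that lands on a real page
print(is_live([301, 200]))
# plain 404 with no redirect
print(is_live([404]))
```

With this check, a chain that ends on a 301 (for example, when redirects are not followed) is still treated as live, which matches the current behavior described above.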