HTTP HEAD method errors
Thank you for such a great project!
Describe the bug
Using the HTTP HEAD method with an expected status code always returns a down result, and no status code is reported in the issue. This appears to relate to how curl is used, since HEAD differs from most other methods in that its response has no body.
To Reproduce
Steps to reproduce the behavior:
- Create a new site using the HEAD method with a defined expected status code. (I do not believe the latter is required.)
Expected behavior
The site should be classified as up if it returns the expected status code.
Additional context
Example config: https://github.com/SamPetherbridge/status.httpstatus.xyz/blob/master/.upptimerc.yml#L37
Example issue: https://github.com/SamPetherbridge/status.httpstatus.xyz/issues/2
Yep, it seems from your links that all HEAD requests are marked as down for some reason. I'll have a closer look soon to see why this is happening.
Aah, this is interesting. It turns out that a HEAD request only returns the response headers, not a body. This is most likely why we get an error: we expect a body in the response. I'll fix this!
Seems like the error is Error: Transferred a partial file. Now we need to see whether this is normal for all HEAD requests and catch it, or if it's a URL-specific issue.
Using HEAD on https://www.google.com is not throwing an error. Maybe this is instead a problem with httpstatus.xyz? It seems like it's passing a content size header different from the actual response size.
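To make the suspected mismatch concrete, here is a toy model in Python (not Upptime's own language, and not curl's actual code). The function name and its parameters are hypothetical. It only illustrates the idea that a client which still expects a body will treat "Content-Length announced, zero bytes received" as a truncated transfer, while a response without Content-Length gives it nothing to compare against.

```python
def looks_like_partial_transfer(declared_length, received_bytes, expects_body):
    """Toy check: would a client consider this response truncated?

    declared_length: value of the Content-Length header, or None if absent.
    received_bytes:  number of body bytes that actually arrived.
    expects_body:    True if the client treats the request like a GET
                     (an assumption about the client, not a curl option).
    """
    if not expects_body:
        # A client that knows it sent a real HEAD never waits for a body.
        return False
    if declared_length is None:
        # Nothing announced, so there is nothing to fall short of.
        return False
    return received_bytes < declared_length


# Response without Content-Length: no mismatch is detectable.
print(looks_like_partial_transfer(None, 0, expects_body=True))
# Content-Length announced, no body sent, client still expects a body.
print(looks_like_partial_transfer(1024, 0, expects_body=True))
```

The 1024 above is an arbitrary illustrative value, not a size observed in this issue.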
Odd, it appears the issue was indeed the Content-Length header. Cloudflare Workers sets the content length even when no content is returned, as is the case with a HEAD request, which is not the behaviour I expected. This is resolved from my perspective. Thank you for identifying this, I will raise a ticket with Cloudflare.
@AnandChowdhary Just circling back to this. As I went to submit a ticket to CF, I found that sending content-related headers in a HEAD response is allowed by the RFC:
The server SHOULD send the same header fields in response to a HEAD request as it would have sent if the request had been a GET, except that the payload header fields (Section 3.3) MAY be omitted.
While I have fixed my use case, there may be potential future edge-cases where this could be an issue.
I'd say it's best to stick to the expected curl behavior on this. By default, curl would throw an error if the content length is provided in HEAD and doesn't match the body length, so we shouldn't catch it.
If you feel strongly about this, perhaps we can open an issue or discuss this with the folks over at https://github.com/curl/curl and see if they'd want to catch this by default.
https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.4
The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response
Note that a HEAD request should return exactly the same headers as a GET request. That is the real purpose of the HEAD method: to learn the headers without receiving the body.
I want to use upptime to check whether large files are online without actually downloading them. To do this I have to use the HEAD HTTP method. However, this is not possible due to this bug, as I get Error: Transferred a partial file for every tested link/file (except google.com, which does not set Content-Length).
However, a HEAD response is intended to set Content-Length, for example according to MDN: https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/HEAD
For example, if a URL might produce a large download, a HEAD request could read its Content-Length header to check the filesize without actually downloading the file.
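As a rough illustration of that use case, the sketch below uses only the Python standard library (Python is chosen for brevity, it is not what Upptime runs). It starts a local stand-in server and sends it a HEAD request. Everything named here (LargeFileHandler, FAKE_SIZE, head_status_and_size, demo) is hypothetical and exists only for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

FAKE_SIZE = 5_000_000_000  # hypothetical size of a large file, in bytes


class LargeFileHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a server hosting one large file."""

    def do_HEAD(self):
        # Announce the size a GET would return, but send no body.
        self.send_response(200)
        self.send_header("Content-Length", str(FAKE_SIZE))
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example quiet


def head_status_and_size(host, port, path="/"):
    """Send a HEAD request and return (status, Content-Length or None)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        resp.read()  # returns b"" at once: http.client knows HEAD has no body
        length = resp.getheader("Content-Length")
        return resp.status, (int(length) if length is not None else None)
    finally:
        conn.close()


def demo():
    """Run the stand-in server on a free local port and query it once."""
    server = HTTPServer(("127.0.0.1", 0), LargeFileHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return head_status_and_size("127.0.0.1", server.server_port)
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() returns the status code and the announced size without any body bytes being transferred, which is the behaviour a monitor would want when checking large downloads.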
OK, so apparently this has to do with how the HTTP method is set. Currently it is set via the CURLOPT_CUSTOMREQUEST option, which should not be used for standard methods like GET, POST or HEAD, but only for genuinely custom requests (https://curl.se/libcurl/c/CURLOPT_CUSTOMREQUEST.html).
Unfortunately this is not clearly worded in the libcurl docs, but the curl man page (https://curl.se/docs/manpage.html#-X) says:
Normally you don't need this option. All sorts of GET, HEAD, POST and PUT requests are rather invoked by using dedicated command line options. This option only changes the actual word used in the HTTP request, it does not alter the way curl behaves. So for example if you want to make a proper HEAD request, using -X HEAD will not suffice. You need to use the -I, --head option.
From libcurl doc:
When you change the request method by setting CURLOPT_CUSTOMREQUEST to something, you don't actually change how libcurl behaves or acts in regards to the particular request method, it will only change the actual string sent in the request.
Thus, for HEAD, the CURLOPT_NOBODY option (https://curl.se/libcurl/c/CURLOPT_NOBODY.html) should be set to true so that curl performs a proper HEAD request. (For GET: https://curl.se/libcurl/c/CURLOPT_HTTPGET.html, for POST: https://curl.se/libcurl/c/CURLOPT_POST.html)
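A minimal sketch of that selection logic, written in Python for brevity and deliberately not calling any curl binding: it only returns the names of the libcurl options mentioned above as plain strings. The function name libcurl_options_for is hypothetical, and how Upptime's code would actually apply these options is an assumption left out here. Methods without a dedicated option named in this thread fall back to CURLOPT_CUSTOMREQUEST in this sketch.

```python
def libcurl_options_for(method):
    """Map an HTTP method to the libcurl option(s) that select it.

    Returns a dict of option name -> value. The names are the real
    libcurl option names; the dict itself is only an illustration.
    """
    method = method.upper()
    dedicated = {
        "HEAD": {"CURLOPT_NOBODY": 1},   # proper HEAD: do not expect a body
        "GET": {"CURLOPT_HTTPGET": 1},
        "POST": {"CURLOPT_POST": 1},
    }
    if method in dedicated:
        return dict(dedicated[method])
    # Anything else: only the request word changes, as the docs describe.
    return {"CURLOPT_CUSTOMREQUEST": method}


for m in ("head", "GET", "post", "PURGE"):
    print(m, libcurl_options_for(m))
```

The point of the design is that HEAD never goes through CURLOPT_CUSTOMREQUEST, so the client is told not to wait for a body that the server will not send.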