HTTP (not HTTPS) response has no Content-Length
Steps to reproduce (tested on both mac and ubuntu):
-
go get github.com/google/martian/ && go install github.com/google/martian/cmd/proxy -
~/go/bin/proxy -
curl -v "http://www.example.com/" --proxy localhost:8080
Result: curl hangs forever.
lab@ubuntu:~$ curl -v "http://www.example.com/" --proxy localhost:8080 > /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 8080 (#0)
> GET http://www.example.com/ HTTP/1.1
> Host: www.example.com
> User-Agent: curl/7.58.0
> Accept: */*
> Proxy-Connection: Keep-Alive
>
< HTTP/1.1 200 OK
< Cache-Control: max-age=604800
< Content-Type: text/html; charset=UTF-8
< Date: Sun, 05 May 2019 15:22:16 GMT
< Etag: "1541025663+ident+gzip"
< Expires: Sun, 12 May 2019 15:22:16 GMT
< Last-Modified: Fri, 09 Aug 2013 23:54:35 GMT
< Server: ECS (dcb/7EA2)
< Vary: Accept-Encoding
< X-Cache: HIT
* no chunk, no close, no size. Assume close to signal end
<
{ [1270 bytes data]
100 1270 0 1270 0 0 172 0 --:--:-- 0:00:07 --:--:-- 0
Thanks for the repro! This looks bad, we'll have someone take a look soon.
I was unable to reproduce the issue, but I had to take the following steps because Martian wasn't set up.
- go get github.com/google/martian/
- go install github.com/google/martian/cmd/proxy
- go get golang.org/x/net/websocket
- go install github.com/google/martian/cmd/proxy
- ~/go/bin/proxy
- curl -v "http://www.example.com/" --proxy localhost:8080
And curl returns the expected response.
It does not terminate though. curl does not know when the response body is complete. Try this:
curl "http://www.example.com/" --proxy localhost:8080 > /dev/null 2> /dev/null && echo SUCCESS
It won't display SUCCESS. This one however, will display SUCCESS:
curl "http://www.example.com/" > /dev/null 2> /dev/null && echo SUCCESS
@itaisod What's the version of Go you're using?
1.12.5
After investigating, seems that example.com just doesn't respond with a Content-Length header, so Martian just faithfully proxies whatever it got. Try the same thing with another website without https, such as http://www.baidu.com/ and they return a Content-Length. Thanks for reporting this, and let us know if you find any more bugs in the future.
I dunno, to me it clearly looks like it's returning a Content-Length. But whatever.
You're right. During testing I somehow mistook the proxied output for the no proxy output and that was missing the Content-Length header. I'm starting to suspect there's some order dependency in the header processing, and perhaps anything after the X-Cache header is getting dropped, which in the case of example.com was the Content-Length header. Will take a closer look soon.
This is the behavior of Go's Transport, according to this issue.
If we want to retain the Content-Length header, we must set DisableCompression of http.Transport to true.
I want to create a PR, but should I add the Accept-Encoding: gzip header for every request like Go do, or I just let users decide if they want gzip or not?
I don't think that setting DisableCompression to true should be the
default behavior. If you need this behavior, I would suggest writing your
own main.go implementation and swapping in your own http.Transport with
SetRoundTripper like here:
https://github.com/google/martian/blob/master/cmd/proxy/main.go#L270
On Tue, Jun 16, 2020 at 11:49 PM Thanh Nguyen [email protected] wrote:
This is the behavior of Go's Transport, according to this issue https://github.com/golang/go/issues/18132.
If we want to retain the Content-Length header, we must set DisableCompression of http.Transport to true.
I want to create a PR, but should I add the Accept-Encoding: gzip header for every request like Go do, or I just let users decide if they want gzip or not?
— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/google/martian/issues/288#issuecomment-645130469, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAMR6NAUWS3LXPIBB3NAGDRXA4NZANCNFSM4HK3P3XA .
So we just leave this bug there?
If martian can't provide Content-Length header in the response, at the very least it should close the connection when it's done and not just leave it hanging forever... Otherwise, it can either pipe the original data unmodified along with the original Content-Length and Content-Encoding, or read the whole data into a buffer first, and then set Content-Length in the response to whatever size it wants, compressed/uncompressed.
I'm 100% sure it's fixable. Not that I mind though, I'm not using martian.
Yeah I think @itaisod is correct. The fix is to "pipe the original data unmodified along with the original Content-Length and Content-Encoding". My concern with setting DisableCompression to true is that it would be decompressed for all requests - regardless of whether they want it or not.
My concern with setting
DisableCompressiontotrueis that it would be decompressed for all requests - regardless of whether they want it or not.
Setting DisableCompression to true will not disable compression for all request. Instead, it will respect the client's request: if the request contains the Accept-Encoding: gzip header, the response would be compressed, and then it's the client's responsibility to decompress the gzip response. The DisableCompression actually means DisableForcingCompression.
The fix is to "pipe the original data unmodified along with the original Content-Length and Content-Encoding".
With setting DisableCompression to false, all requests would be compressed by internal go package. go will do the decompression for us internally if the original request doesn't contain the Accept-Encoding: gzip header, so the original Content-Length and Content-Encoding would be incorrect, that's why they remove them.
I thinks that martian, as a proxy, should respect the client's request and let them decide if they want the Accept-Encoding: gzip header or not.