
Support streaming bodies in the client

Open evert opened this issue 11 years ago • 24 comments

Moved from here:

https://github.com/fruux/sabre-dav/issues/321

evert avatar Sep 15 '14 21:09 evert

Hello :-),

The Sabre\HTTP\Client::parseCurlResult method builds an array containing a response index. That index holds a Sabre\HTTP\Response object, which extends Sabre\HTTP\Message. On this object we have the getBody, getBodyAsStream and getBodyAsString methods.

So, because the response from Sabre\HTTP\Client::parseCurlResult is what the doRequest method returns, this feature is basically already supported.

QED ■.

Hywan avatar Jan 12 '15 13:01 Hywan

The point of this ticket is that someone may be using the library to download very large files.

In those cases, we want to ensure that the entire file is accessed as a stream, and never placed into memory.

So while it's possible right now to convert the string into a stream, the goal is to change the client a bit so it uses a stream under the hood as well.
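To make the distinction concrete, here is a minimal sketch (using nothing beyond PHP's stream functions) of what "convert the string into a stream" means today: the full body is buffered as a string first, then wrapped in a php://temp stream, so memory usage is not actually reduced.

```php
<?php
// Sketch of today's behavior: the whole response body already lives in
// memory as a string; wrapping it in php://temp merely gives it a stream
// interface after the fact.
$bodyString = str_repeat('x', 1024 * 1024); // stands in for a full response body

$stream = fopen('php://temp', 'r+');
fwrite($stream, $bodyString);
rewind($stream);

// The caller can now read a stream, but the data already passed through memory.
echo strlen(stream_get_contents($stream)), "\n";
```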

evert avatar Jan 12 '15 15:01 evert

@evert But the result is already in memory because cURL gives you all the “stuff”: https://github.com/fruux/sabre-http/blob/c55cbc1daa91293cda92ea4b90de79c743c4a149/lib/Client.php#L483. I will check whether cURL can give us only a few pieces of information in order to create a stream.

Hywan avatar Jan 12 '15 15:01 Hywan

An interesting link from a friend of mine, @pmartin: http://stackoverflow.com/questions/1342583/manipulate-a-string-that-is-30-million-characters-long/1342760#1342760. However, I don't know how that would work if we don't want to load the response when we receive it, but only later, when reading the stream.

Hywan avatar Jan 12 '15 15:01 Hywan

In the case where the user has a stream ready with write permission, she can give this stream to the HTTP client and the response will be copied into it. This answers one use-case.
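A sketch of that use-case, assuming a hypothetical caller-supplied sink (sabre/http has no such option today; `$sink` is an invented name): the caller hands the client a writable stream and the client writes the body into it as chunks arrive.

```php
<?php
// Hypothetical shape of the API described above: the caller supplies a
// writable stream ($sink is an invented name) and the client writes each
// received chunk into it instead of accumulating a string.
$sink = fopen('php://temp', 'r+');

// Stand-in for the client receiving the body in chunks from cURL:
foreach (['first chunk ', 'second chunk'] as $chunk) {
    fwrite($sink, $chunk);
}

rewind($sink);
echo stream_get_contents($sink), "\n";
```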

Hywan avatar Jan 12 '15 15:01 Hywan

Streams in request bodies already 100% work; this is about turning an HTTP response into a stream.

evert avatar Jan 12 '15 15:01 evert

@evert How does it work? Did I miss it in the source code?

Hywan avatar Jan 12 '15 15:01 Hywan

https://github.com/fruux/sabre-http/blob/master/lib/Client.php#L405

evert avatar Jan 12 '15 15:01 evert

@evert Yup, that's for sending a request. What you'd like is the same thing for receiving a response, right?

Hywan avatar Jan 12 '15 15:01 Hywan

Indeed, yes!

evert avatar Jan 12 '15 15:01 evert

Any news on this? I am very interested in using streams, as I'm going to download/upload large files (>2GB) with the WebDAV client.

h44z avatar Apr 12 '15 00:04 h44z

@h44z Not from me yet.

Hywan avatar Apr 12 '15 08:04 Hywan

This is still something that's interesting to us, but we haven't had time to implement it yet.

evert avatar Apr 13 '15 15:04 evert

So the problem here is that curl actually doesn't have an easy way for us to just access the stream resource, as far as I can tell.

The only way we can progressively get access to the stream, is by using the CURLOPT_WRITEFUNCTION option, but that only gives us 'bits of the string' as opposed to a full-on stream.

With that function, we could send everything to a temporary stream (php://temp), which would buffer the result in memory (spilling to disk past a threshold), but that only solves part of the problem.

Ideally we'd want the response to return as soon as it starts coming in and not after all the bytes have arrived, and ideally we would want to not have to cache/buffer it anywhere.

I don't see an easy way to do that.
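A sketch of the CURLOPT_WRITEFUNCTION approach under discussion, using a local file:// URL so it is self-contained: each chunk cURL hands us is appended to a php://temp stream. Note the caller still can't read anything until curl_exec() returns, which is exactly the limitation described above.

```php
<?php
// CURLOPT_WRITEFUNCTION sketch: cURL calls our callback with each received
// chunk; we append it to a temp stream. The whole body is still buffered
// before anyone can read it.
$file = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($file, 'hello from curl');

$body = fopen('php://temp', 'r+');

$ch = curl_init('file://' . $file);
curl_setopt($ch, CURLOPT_WRITEFUNCTION, function ($ch, $chunk) use ($body) {
    // Must return the number of bytes handled, or cURL aborts the transfer.
    return fwrite($body, $chunk);
});
curl_exec($ch);
curl_close($ch);
unlink($file);

rewind($body); // only now can the caller read $body as a stream
echo stream_get_contents($body), "\n";
```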

evert avatar May 19 '15 09:05 evert

CURLOPT_WRITEFUNCTION seems to be invoked whenever a certain-sized chunk has been downloaded by cURL. So it looks more or less like a "real" stream.

Which parts of the problem doesn't this solve for you? Could you elaborate?

staabm avatar May 19 '15 09:05 staabm

Well, it would be really nice if we could do something like:

$request = "...";
$response = $client->send($request);
stream_copy_to_stream('php://output', $response->getBodyAsStream());

This should:

  1. Use real stream resources internally
  2. Never buffer anything, anywhere.

evert avatar May 19 '15 10:05 evert

Absolutely. So we need to change from using a string to e.g. a PHP temp stream in https://github.com/fruux/sabre-http/blob/master/lib/Client.php#L496

This will not be "real" streaming, but it will be way better than what we have ATM. What's the actual show stopper? Did I miss something?

staabm avatar May 19 '15 10:05 staabm

The problem with using the temporary stream is that it only partially solves the goals:

  1. It does not avoid a cache/buffer. The entire response will still be stored in memory or on disk.
  2. We can't start reading until the entire response is in, because we can't write and read the string at the same time.

evert avatar May 19 '15 11:05 evert

It's better than nothing though.

evert avatar May 19 '15 11:05 evert

> Avoid a cache/buffer. The entire response will be stored in memory or on disk.

Right, but that's how streaming works nevertheless... php://memory seems like a good fit for that.

> We can't start reading until the entire response is in, because we can't write and read the string at the same time.

Hm, IIRC we could do this using a non-blocking read/write stream, couldn't we?

staabm avatar May 19 '15 11:05 staabm

Maybe it would also be a good occasion to use a different HTTP client, e.g. https://github.com/amphp/artax

(so we don't need to work around curl limitations)

staabm avatar May 19 '15 12:05 staabm

> Right, but that's how streaming works nevertheless... php://memory seems like a good fit for that.

That's not really true... If I didn't use curl and used PHP's built-in HTTP stream wrappers, there would be no buffer.

Here's an example (and note that I did stream_copy_to_stream() wrong in my previous code snippet, sorry about that):

stream_copy_to_stream(
   fopen('http://example.org/','r'),
   fopen('php://output','w')
);

If I do the same with a temporary stream, it would look more like this:

$tmp = fopen('php://temp', 'r+');

stream_copy_to_stream(
   fopen('http://example.org/', 'r'),
   $tmp
);
rewind($tmp);

stream_copy_to_stream(
   $tmp,
   fopen('php://output', 'w')
);

This last example has two passes and requires a buffer (disk or memory, depending on the size), and this is exactly how the curl example with WRITEFUNCTION would work as well. This is far from ideal. The use-case I want to solve is indeed the '2GB download' use-case, and if I force people to store the entire thing on disk first, that would be sub-optimal.

> We can't start reading until the entire response is in, because we can't write and read the string at the same time.
>
> Hm, IIRC we could do this using a non-blocking read/write stream, couldn't we?

Not at the same time, and not without a buffer. Perhaps with stream_select() and mkfifo() :)

> Maybe it would also be a good occasion to use a different HTTP client, e.g. https://github.com/amphp/artax

Not a bad idea =)

evert avatar May 19 '15 12:05 evert

I think the only really good solution will be a different HTTP client. With artax you can even use single-threaded concurrency in case some requests can be made in parallel, which could be another perf win.

staabm avatar May 19 '15 14:05 staabm

I'll definitely look into it. My preference would go to something lightweight, so maybe artax is that =)

In the future I want to kick off sabre/davclient again, so that will be good timing to dig into that.

evert avatar May 19 '15 14:05 evert