codeigniter-redis icon indicating copy to clipboard operation
codeigniter-redis copied to clipboard

`_single_line_reply()` is truncating large strings

Open ghost opened this issue 13 years ago • 30 comments

While I was retrieving some JSON (rather huge sets) I noticed that _single_line_reply() was returning only a (large) random chunk of what it should. The chunk was usually 8192 bytes, but sometimes 16k. The same thing happened with base64 encoded data larger than ~32k .

Identical behavior happens when data is just serialized prior to setting. Yet, a GET from redis console produces the entire thing perfectly.

My understanding of fgets() is the same as in the documentation:

Reading ends when length - 1 bytes have been read, on a newline (which is included in the return value), or on EOF (whichever comes first). If no length is specified, it will keep reading from the stream until it reaches the end of the line.

Try it with any large serialized array ( > 8k) e.g.

$redis->set('foo', serialize($big_array));

Then observe what you see via $redis->get('foo') vs what you see via GET foo in the console. I checked the points at which it chomped the data and there was no newline (escaped or otherwise) at that point. Always at 8k or 16k.

Can anyone reproduce this?

ghost avatar Sep 25 '12 07:09 ghost

To narrow it down a bit more, any string over 20k seems to be doing this.

tinkertim avatar Sep 25 '12 11:09 tinkertim

Here is a gist with a test case. Once I hit past ~16k, _single_line_reply() (or rather, fgets()) starts truncating the result, and I can't figure out why. Note in the gist I have it named zredis, It's the library I use while I phase out use of the old version of the library (lots of places to not deal with spaces anymore).

tinkertim avatar Sep 25 '12 14:09 tinkertim

Weird stuff, got the same thing here:

Read payload of 15872 bytes
Writing payload of 16384 bytes
Read payload of 16376 bytes
Did not read the expected amount of bytes

joelcox avatar Sep 25 '12 14:09 joelcox

fsockopen('unix:///tmp/redis.sock') 

doesn't do the trick either.

joelcox avatar Sep 25 '12 14:09 joelcox

Found some info:

I'm extremely happy to say I've now got it working, and it's everything I dreamed it could be. The major bug stopping me was truncation of values more than 16k in size, and that turns out to be a bug in the Net_TokyoTyrant PHP wrapper (and arguably a bug in PHP's libraries). The wrapper was using a single fread() call to get values, but this has size limits, and so needs to be called repeatedly in a loop to get the full result. Jeremy Hinegardner got me attacking this again after he confirmed he was using Tokyo successfully through Ruby, and after some debugging made me suspicious of fread's reliability Blagovest Buyukliev's post confirmed it was the cause, and gave me a drop-in fix.

http://petewarden.typepad.com/searchbrowser/2009/06/how-to-get-tokyo-tyrant-working-in-php.html

joelcox avatar Sep 25 '12 14:09 joelcox

What version of PHP did you run that under? I'm using 5.4.5, about to try it on 5.3

tinkertim avatar Sep 25 '12 14:09 tinkertim

PHP 5.3.15 with Suhosin-Patch (cli) (built: Aug 24 2012 17:45:44) 

joelcox avatar Sep 25 '12 14:09 joelcox

Well, this is definitely not a bug in the library. I'm going to (tomorrow) rule out using the loopback device (127.0.0.1) and bind a few new servers to a public IP. I am completely baffled, so are a lot of other people (the PHP folks in SO chat who helped trying to narrow this down). It's just so odd that 16k is a 'breaking point' in a few places.

tinkertim avatar Sep 25 '12 15:09 tinkertim

Great. Keep us posted.

joelcox avatar Sep 25 '12 16:09 joelcox

No, it's the same over all interfaces. I'm out of ideas here. I have tried every possible way of reading from a descriptor opened with fsockopen(). fgets() truncates, fread() blocks indefinitely (strace shows poll() not firing while there is still ready data on the wire) and the stream family exhibits the same behavior as fgets(). I even tried setting the socket to nonblock and reading in very small chunks, a do { } that breaks if the read length is zero, and even some magic pixie dust. I even tried fscanf(), dies right around 16k.

I'm going to see if I can reproduce this with Predis and if not, examine how they get around it. So odd that I can send, but not receive 16k over the wire. So very very odd.

tinkertim avatar Sep 26 '12 03:09 tinkertim

I played around with stream_set_chunk_size() and stream_set_read_buffer(). Tweaking these, I was able to get (reliably) up to 40kb read before that quirk kicked in. My _single_line_reply() function now looks like this:

  private function _single_line_reply()
  {
      stream_set_read_buffer($this->_connection, 0);
      stream_set_chunk_size($this->_connection, 131072);
      return trim(fgets($this->_connection));
  }

Note, that is an ugly hack, I'm setting the chunk size to 128kb and turning off buffered reads.

Predis uses sockets, not streams and doesn't suffer from this problem (at least for a single connection, I don't know what it does for sharding). It seems like we have two options here:

  1. Change the code to use sockets
  2. Deal with a 16k barrier

I thought there was a way to tell a stream to give up the underlying socket, so that the socket family of functions could be used just in _single_line_reply() (the only place where we DON'T know how much we have to read prior to reading), however there seems to be no such way. We have to use either sockets or stream resources.

The tweaks I made to the chunk and buffer size eliminate my immediate problem. Do we want to just use sockets and not deal with the quirks associated with streams? That's a change I'm willing to write over a weekend. I don't see myself dealing with data over 40k serialized, so the immediate pressure is off.

Also, worth writing a (proper) test for this (up to 512k) and just let it fail for now?

Let me know one way or another.

tinkertim avatar Sep 26 '12 07:09 tinkertim

Well, that kludge did not work out so well. I'm going to add a sockets version of the library to my fork and get it working, then send a pull request for testing (Sockredis). I had to wrap Predis to actually get this done, and that's just so much code that I don't need. In fact, I'm not even committing it. Predis is great, but I don't need that much code.

This falls right on PHP's doorstep. I'm going to be raising a bit of a stink about it soon. They need to either fix or document this quirk. In the mean time, if you do socket programming, use sockets, not streams if you like your ass un-bitten :)

Edit

There is a way to get to the underlying socket for that single method. Trying it tomorrow.

tinkertim avatar Sep 26 '12 14:09 tinkertim

Ha! Thanks for your thorough investigation Tim. Looking forward to your fork and pull.

joelcox avatar Sep 26 '12 14:09 joelcox

O.k., socket_import_stream() is probably not going to work for us, because we have to read from the socket prior to importing it, which causes it to error due to losing bytes during conversion. E.g:

socket_import_stream(): 6 bytes of buffered data lost during stream conversion!

I tried setting the read buffer to 0 (none) with stream_set_buffer() prior to initially reading, which causes the conversion to work, but also socket_read() to block waiting for data despite setting the read buffer back to what it was immediately after conversion (and tried directly on the socket too via socket_set_option). Sockets in PHP are a mess.

I'm going to try a few other things. Here's also a proper test that can be used to reproduce the problem conventionally.

What I have so far is this (assume $this->sock is now a class variable)

First, we have to turn off buffering while we read the first bytes to determine the reply type:

    private function _read_request()
    {
            if ($this->sock === NULL)
            {
                    stream_set_read_buffer($this->_connection, 0);
            }
            $type = fgetc($this->_connection);
            if ($this->debug === TRUE)
            {
                    log_message('debug', 'Redis response type: ' . $type);
            }

            switch ($type)
            {
                    case '+':
                            return $this->_single_line_reply();
                            break;
                    case '-':
                            return $this->_error_reply();
                            break;
                    case ':':
                            return $this->_integer_reply();
                            break;
                    case '$':
                            return $this->_bulk_reply();
                            break;
                    case '*':
                            return $this->_multi_bulk_reply();
                            break;
                    default:
                            return FALSE;
            }
    }

Then, we actually get a reference to the socket in _single_line_reply(), and set the buffer back:

    private function _single_line_reply()
    {
            if ($this->sock === NULL)
            {
                    $this->sock = socket_import_stream($this->_connection);
                    stream_set_read_buffer($this->_connection, 4096);
            }
            if (! $this->sock) {
                    log_message('error', 'Unable to obtain socket from connection stream in single line reply');
                    return NULL;
            }
            $value = '';
            while (($buff = socket_read($this->sock, 1024)) != "")
            {
                    $value .= $buff;
            }
            return trim($value);
    }

I have tried that with and without sock_set_option() to do the same .. and it just blocks forever. I have also tried various settings for the read buffer and the chunk size we read, even reading one byte at a time blocks.

Hopefully I'm able to get it working. Setting the socket to nonblock just results in a ton of -EAGAIN from recvfrom (as shown via strace), and it stops reading prematurely.

Note, I began with just getting the socket in the constructor, and tried the current (mess) after experiencing the same issue.

tinkertim avatar Sep 30 '12 12:09 tinkertim

I solved the blocking issue. Actually, I have it just about working. A normal socket read in PHP uses the flag PHP_BINARY_READ, which takes its cues on where to stop from the system's recv() function. There is also PHP_NORMAL_READ, which is defined as:

PHP_NORMAL_READ - reading stops at \n or \r.

But it doesn't, and there are open bugs going back to 2009 about that. Using strpos() to find if \r is in the chunk we just read works, and we only examine 1k at a time.

However, I did get one random failure. I'm going to clean it up a bit and try running 100 tests to see what happens. I moved getting the reference to the socket and associated sanity checking back into the constructor, so the only places with modifications are the constructor and _single_line_reply() functions. I'll have a pull request out in a while.

tinkertim avatar Sep 30 '12 12:09 tinkertim

It's .. still breaking at around 16k randomly. I was close to happy, I ran tests (repeating 100x each) half a dozen times, and then I get:

Failed asserting that 16384 matches expected 25600.

... just before I was about to commit.

Something about a socket created as a stream does something odd, I even tried tweaking the receive buffer higher. I'll work on an alternate universe version of the library just using sockets in the development branch of my fork. Do you want a pull request for the test? It's just going to fail repeatedly until this is fixed.

Edit

I just don't get it, I ran tests 30 more times, all pass with flying colors. Ten minutes later I ran them again, failed on the third attempt. So I went sadistic on it:

phpunit --repeat 1000 --stop-on-failure
PHPUnit 3.6.10 by Sebastian Bergmann.

Configuration read from /home/.../codeigniter-redis/phpunit.xml

................................................................... 67 / 4 (1675%)
................................................................... 134 / 4 (3350%)
................................................................... 201 / 4 (5025%)
................................................................... 268 / 4 (6700%)
................................................................... 335 / 4 (8375%)
................................................................... 402 / 4 (10050%)
................................................................... 469 / 4 (11725%)
................................................................... 536 / 4 (13400%)
................................................................... 603 / 4 (15075%)
................................................................... 670 / 4 (16750%)
................................................................... 737 / 4 (18425%)
................................................................... 804 / 4 (20100%)
................................................................... 871 / 4 (21775%)
................................................................... 938 / 4 (23450%)
................................................................... 1005 / 4 (25125%)
................................................................... 1072 / 4 (26800%)
................................................................... 1139 / 4 (28475%)
................................................................... 1206 / 4 (30150%)
................................................................... 1273 / 4 (31825%)
................................................................... 1340 / 4 (33500%)
................................................................... 1407 / 4 (35175%)
....F

Time: 02:24, Memory: 3.50Mb

There was 1 failure:

1) RedisTest::test_chunk_reads
Failed asserting that 16000 matches expected 30208.

1, 10, 100 or 1000 times, it just randomly fails. It's not redis, the data is there. I'll just work on converting to sockets.

Why does PHP hate 16k so much?

tinkertim avatar Sep 30 '12 13:09 tinkertim

If it's not too much work that would be great. Hang in there :)!

joelcox avatar Sep 30 '12 14:09 joelcox

It should not be to difficult to just use sockets. I've been meaning to send a change to (optionally) just use a domain socket anyway.

This is just beyond nuts. I'll try to get it done over the coming weekend.

tinkertim avatar Sep 30 '12 14:09 tinkertim

Less-hacky fix until a "real" fix gets put in place: (this is a git diff for anyone who wants to implement)

[code] @@ -242,13 +242,25 @@ class Redis { $value_length = (int) fgets($this->_connection); if ($value_length <= 0) return NULL;

  •           $response = rtrim(fread($this->_connection, $value_length + 1));
    
  •           $response = $this->_read_line( $value_length );
            fgets($this->_connection);                      // Get rid of the \n\r
    
            return isset($response) ? $response : FALSE;
    
    }
    
  •   private function _read_line( $value_length )
    
  •   {
    
  •           $response = '';
    
  •           $remaining_length = $value_length;
    
  •           while(strlen($response) < $value_length) {
    
  •                   $line = fread($this->_connection, $remaining_length);
    
  •                   $remaining_length -= strlen($line);
    
  •                   $response .= $line;
    
  •           }
    
  •           return $response;
    
  •   }
    
  • /**
     \* Multi bulk reply
     *
    
    [/code]

(github is editing my response and turning the + into a dot)

davemoki avatar Jan 09 '13 04:01 davemoki

Thanks Dave!

joelcox avatar Jan 09 '13 07:01 joelcox

slightly more elegant version...

private function _read_line( $length )
{
    $data = '';
    while($length > 0) {
        $buffer = fread($this->_connection, $length);
        $length -= strlen($buffer);
        $data .= $buffer;
    }
    return $data;
}

davemoki avatar Jan 09 '13 19:01 davemoki

Does this need closed? I think we figured this problem out with the multi_reply section. It would probably be easy to move that fix over to here if this is still an actual issue.

integrii avatar Aug 22 '13 18:08 integrii

Eric, my gut says yes, but this issue has been so random that I'd prefer some confirmation from others first.

On Aug 22, 2013, at 8:04 PM, Eric Greer [email protected] wrote:

Does this need closed? I think we figured this problem out with the multi_reply section. It would probably be easy to move that fix over to here if this is still an actual issue.

— Reply to this email directly or view it on GitHub.

joelcox avatar Aug 22 '13 20:08 joelcox

Is there anyone still running into this problem using the latest HEAD?

joelcox avatar Sep 19 '13 08:09 joelcox

This thing should be dead. I have tested the hell out of it on my instance with the code i committed.

integrii avatar Nov 04 '13 05:11 integrii

Great, so you're talking about code that is already in the develop branch or referring to #48?

joelcox avatar Nov 04 '13 08:11 joelcox

The stuff in 48. :-( On Nov 4, 2013 12:36 AM, "Joël Cox" [email protected] wrote:

Great, so you're talking about code that is already in the develop branch or referring to #48 https://github.com/joelcox/codeigniter-redis/pull/48 ?

— Reply to this email directly or view it on GitHubhttps://github.com/joelcox/codeigniter-redis/issues/17#issuecomment-27669669 .

integrii avatar Nov 04 '13 09:11 integrii

Thanks, I might have a go myself and fix the style issues in #48.

joelcox avatar Nov 04 '13 10:11 joelcox

Really will eventually fix this. I just got done with about 1000 lines this weekend, have my day job in the morning and Google glass arriving. Did I mention I have a wife and kids?

I need to learn the style guide. I will eventually get there if you don't want to waste your time. On Nov 4, 2013 2:02 AM, "Joël Cox" [email protected] wrote:

Thanks, I might have a go myself and fix the style issues in #48https://github.com/joelcox/codeigniter-redis/pull/48 .

— Reply to this email directly or view it on GitHubhttps://github.com/joelcox/codeigniter-redis/issues/17#issuecomment-27674408 .

integrii avatar Nov 04 '13 10:11 integrii

I didn't mean it like that Eric, sorry if I came across that way.

This library is provided free of charge and I appreciate every single effort in order to make it better. Just wanted to say that I might have a stab at fixing the code style myself, because you already took care of the hard part – actually fixing the bug.

Thanks again for your input.

joelcox avatar Nov 04 '13 10:11 joelcox