dmlc-core icon indicating copy to clipboard operation
dmlc-core copied to clipboard

Bug in InputSplitBase::Chunk::Append() ?

Open mazhongtuan opened this issue 4 years ago • 0 comments

When split->ReadChunk(...) returns a true value and size is set to zero, the code double the buffer size, but in the next loop, variable 'size' should also be adjust accordingly before call split->ReadChunk again.

bool InputSplitBase::Chunk::Append(InputSplitBase *split, size_t buffer_size) {
  size_t previous_size = end - begin;
  data.resize(data.size() + buffer_size);
  while (true) {
    // leave one tail chunk
    size_t size = buffer_size * sizeof(uint32_t);
    // set back to 0 for string safety
    data.back() = 0;
    if (!split->ReadChunk(reinterpret_cast<char *>(BeginPtr(data)) + previous_size, &size))
      return false;
    if (size == 0) {
      data.resize(data.size() * 2);
    } else {
      begin = reinterpret_cast<char *>(BeginPtr(data));
      end = begin + previous_size + size;
      break;
    }
  }
  return true;
}

mazhongtuan avatar Apr 29 '21 09:04 mazhongtuan