Title
Bloom save/load routines fail for large filters due to missing write/read loops
Description
When saving or loading large Bloom filters, the library's routines assume that a single write or read call will process the entire buffer. For larger filters this assumption fails:
- On save: write returns fewer bytes than requested. The routine checks the return value, sees a mismatch, and reports a generic error (code 1). The file is created and appears full-size, but is actually incomplete.
- On load: read also returns fewer bytes than expected. The routine detects the mismatch and reports error 11 (“bytes read mismatch”).
This sequence makes it look like the file was saved correctly, but reload fails because the data is truncated.
Steps to Reproduce
- Create a Bloom filter with a large number of entries (e.g. millions).
- Save the filter using the current library routines.
- Observe error code 1 on save.
- Reload the filter.
- Observe error code 11 due to incomplete read.
Diagnosis
The save/load routines call write(fd, buf, len) and read(fd, buf, len) exactly once and treat any short count as a failure. POSIX guarantees only that each call transfers up to len bytes, so partial I/O is normal behavior and not an error in itself. The return type ssize_t is correct, but the code does not loop until all bytes are handled.
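The "up to len bytes" contract can be observed without the library. The sketch below is illustrative only: `single_read_from_pipe` is a hypothetical name, and a pipe stands in for the filter file because it produces a short read deterministically. It places a few bytes in the pipe, asks read() for more in a single call, and returns whatever read() reports.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical demo helper, not part of the library.
 * Puts `available` bytes into a pipe, then asks read() for `requested`
 * bytes in ONE call and returns the count that read() reported. */
ssize_t single_read_from_pipe(size_t available, size_t requested)
{
    /* 512 bytes is the minimum PIPE_BUF that POSIX allows, so a write of
     * this size to an empty pipe completes without blocking. */
    static char src[512], dst[512];
    int fds[2];
    ssize_t got;

    if (available > sizeof src || requested > sizeof dst) {
        errno = EINVAL;
        return -1;
    }
    if (pipe(fds) != 0)
        return -1;

    if (write(fds[1], src, available) != (ssize_t)available) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);                       /* no more data will arrive */

    got = read(fds[0], dst, requested);  /* single call, no retry loop */
    close(fds[0]);
    return got;
}
```

Calling `single_read_from_pipe(10, 100)` reports 10 bytes, not 100, and no error is involved. A routine that compares that count against the requested length and fails on a mismatch behaves the way the save/load routines do here.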
Suggested Fix
Wrap the I/O in loops that retry until all bytes are written/read. For example:
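    #include <errno.h>
    #include <unistd.h>

    /* Write exactly count bytes unless an error occurs.
     * Returns the number of bytes written, or -1 on error (errno is set). */
    ssize_t full_write(int fd, const void *buf, size_t count) {
        size_t total = 0;
        const char *p = buf;
        while (total < count) {
            ssize_t rc = write(fd, p + total, count - total);
            if (rc < 0) {
                if (errno == EINTR) continue;  /* interrupted by a signal: retry */
                return -1;
            }
            if (rc == 0) break;                /* no progress: stop instead of spinning */
            total += (size_t)rc;
        }
        return (ssize_t)total;
    }

    /* Read exactly count bytes unless EOF or an error occurs.
     * Returns the number of bytes read (short only at EOF), or -1 on error. */
    ssize_t full_read(int fd, void *buf, size_t count) {
        size_t total = 0;
        char *p = buf;
        while (total < count) {
            ssize_t rc = read(fd, p + total, count - total);
            if (rc < 0) {
                if (errno == EINTR) continue;  /* interrupted by a signal: retry */
                return -1;
            }
            if (rc == 0) break;                /* end of file */
            total += (size_t)rc;
        }
        return (ssize_t)total;
    }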
Then update the Bloom save/load routines to use these helpers and verify that the total equals the intended length.
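As a rough sketch of that update, the routines might look like the following. The names `save_bytes` and `load_bytes`, the 0/-1 return convention, and the flat byte-buffer layout are assumptions made for illustration; the library's actual save/load functions, header format, and error codes are not reproduced here. The retry helpers are repeated so the block compiles on its own.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Retry loops as in the suggested fix, repeated so this sketch is self-contained. */
static ssize_t full_write(int fd, const void *buf, size_t count) {
    size_t total = 0;
    const char *p = buf;
    while (total < count) {
        ssize_t rc = write(fd, p + total, count - total);
        if (rc < 0) { if (errno == EINTR) continue; return -1; }
        if (rc == 0) break;
        total += (size_t)rc;
    }
    return (ssize_t)total;
}

static ssize_t full_read(int fd, void *buf, size_t count) {
    size_t total = 0;
    char *p = buf;
    while (total < count) {
        ssize_t rc = read(fd, p + total, count - total);
        if (rc < 0) { if (errno == EINTR) continue; return -1; }
        if (rc == 0) break;
        total += (size_t)rc;
    }
    return (ssize_t)total;
}

/* Hypothetical save routine: 0 on success, -1 if the file could not be
 * opened, written in full, or closed cleanly. */
int save_bytes(const char *path, const void *buf, size_t len) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    ssize_t n = full_write(fd, buf, len);
    int close_rc = close(fd);
    /* Succeed only when every byte was written. */
    if (n < 0 || (size_t)n != len || close_rc != 0)
        return -1;
    return 0;
}

/* Hypothetical load routine: 0 on success, -1 if the file is missing,
 * unreadable, or shorter than len bytes. */
int load_bytes(const char *path, void *buf, size_t len) {
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = full_read(fd, buf, len);
    close(fd);
    return (n >= 0 && (size_t)n == len) ? 0 : -1;
}
```

With this shape, a length mismatch after the loop means a real failure (an I/O error on save, or a truncated file on load) and no longer just an ordinary short transfer, so the existing error codes can keep their meaning.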
Impact
With the retry loops in place, Bloom filters of any size should save and load reliably, which removes both the initial save error (code 1) and the subsequent load error (code 11) described above.