v6.4.1 seems to introduce a bug with gzipped files
Environment details
- OS: Cloud Run
- Node.js version: 18.3.0 (ALPINE: 18.3-alpine3.15 Docker image)
- npm version: 8.12.1
- `@google-cloud/storage` version: 6.4.1
Steps to reproduce
I have a log file processing pipeline on Cloud Run/Node which is pretty well monitored and managed via CI/CD (GitHub -> Cloud Build -> Cloud Run). Dependabot raised a PR for 6.4.1 of this lib, which passed all my tests and ran fine overnight on pre-live, so I merged it (which deploys to live). Immediately, the number of log files being rejected by the pipeline jumped and I saw that the Node app was logging errors:
`gzip error: unexpected end of file (code: Z_BUF_ERROR)`
The processing time per file also jumped massively (by an order of magnitude or 2).
Once I reverted this version of the lib, the errors disappeared and processing time returned to normal.
The Node app does this:
- Receives a "new GCS file" push from Eventarc (HTTP lib is Fastify)
- Streams the new file through the processing pipeline, using gunzip-maybe
- Streams resultant data into BigQuery
So I think the usage of this lib is relatively simple - I am only using a few methods (via async/await; there's a rough sketch of how they're called after this list):
- setMetadata()
- exists()
- bucket()
- file()
- delete()
- createReadStream() - probably the most likely culprit here, I guess
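To make that concrete, here's a rough, standalone sketch of how those methods get called (the bucket/object names and the flow here are made up for illustration, not my actual code):

```js
import {Storage} from "@google-cloud/storage";

const storage = new Storage();
// Hypothetical bucket/object, just to show the calls in context
const file = storage.bucket("example-log-bucket").file("access.log.gz");

// exists() resolves to a single-element array
const [fileExists] = await file.exists();

if (fileExists) {
  // Tag the object so we know it's been picked up
  await file.setMetadata({metadata: {processingState: "in-progress"}});

  // Stream the (possibly gzipped) contents into the processing pipeline
  const readStream = file.createReadStream();
  // ... pipe through gunzip-maybe, split into lines, insert into BigQuery ...

  // Remove the source object once it's been processed
  await file.delete();
}
```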
Please let me know if you have more Qs. Sorry this isn't a simple set of steps, I thought ^ was more practical in this case.
Thanks!
@neilstuartcraig what version of the library did you upgrade from when you saw this issue?
Hey @neilstuartcraig - could you provide a sample that replicates the issue and any additional associated logs?
> @neilstuartcraig what version of the library did you upgrade from when you saw this issue?
Oh sorry, I forgot that. I upgraded from 6.4.0. Hopefully that helps narrow down what to look at as the changes were minimal between the two (as shown in https://github.com/googleapis/nodejs-storage/releases/tag/v6.4.1)
> Hey @neilstuartcraig - could you provide a sample that replicates the issue and any additional associated logs?
I'm really sorry, I can't provide the log files as they contain personal information so I'd be breaking GDPR - and if I amend the files, they obviously won't be the same. What I can say though is that they're gzipped text (nginx access logs in default "combined" format).
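For reference, each line is in nginx's default "combined" format, i.e. something like this (values invented):

```
203.0.113.7 - - [12/Aug/2022:10:15:32 +0000] "GET /news/article HTTP/1.1" 200 5316 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"
```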
The code which showed the errors was:

```js
import {default as gunzip} from "gunzip-maybe";
import {Storage} from "@google-cloud/storage";
// Assumption: `es` below is the event-stream package (provides split() and map());
// the import is added here for completeness as it wasn't in the original snippet
import es from "event-stream";

const googleCloudStorage = new Storage();
// ...
const sourceBucket = googleCloudStorage.bucket(executionDetails.log_bucket);
const sourceFile = sourceBucket.file(executionDetails.log_filename);
// ...
sourceFile.createReadStream()
  .on("error", async (e) => {
    await handleStreamError(e, "GCS file readstream creation", executionDetails, log, startTSwithMS, auditLogBigQueryTable);
    return reject(new Error(executionDetails.messages));
  })
  .pipe(gunzip()) // Note that this is gunzip-maybe, which is a no-op for non-gzipped content (which allows us to use uncompressed files for e.g. e2e tests)
  .on("error", async (e) => {
    // The below is the error I was seeing: "@handleStreamError/readstream gzip error: unexpected end of file (code: Z_BUF_ERROR)."
    // Note that gunzip is actually gunzip-maybe to allow non-gzipped files to be processed as well as gzipped ones
    await handleStreamError(e, "readstream gzip", executionDetails, log, startTSwithMS, auditLogBigQueryTable);
    return reject(new Error(executionDetails.messages));
  })
  .pipe(es.split())
  .on("error", async (e) => {
    await handleStreamError(e, "readstream split", executionDetails, log, startTSwithMS, auditLogBigQueryTable);
    return reject(new Error(executionDetails.messages));
  })
  .pipe(es.map(async (rawLine, mapCallback) => {
    // ...
```
I can't easily give you all of the code as it's across a few files and won't work in isolation, but I am hoping that the above will be at least a pointer. If not, please let me know and I'll see what I can do to get you something better.
Cheers
Hey @neilstuartcraig, after extensive debugging I'm unable to reproduce this error. Here's a working sample with a variety of test scenarios:
```ts
import gunzipMaybe = require('gunzip-maybe');
import {Storage} from '@google-cloud/storage';
import {randomBytes} from 'crypto';
import {gzipSync} from 'zlib';

const storage = new Storage();
const bucket = storage.bucket(/* insert bucket name for testing */);
const testFile = bucket.file('test-gzip-object');

async function setupCase1() {
  // random data
  await testFile.save(randomBytes(1024));
  console.log('uploaded - setupCase1');
}

async function setupCase2() {
  // random data, upload pre-gzipped
  await testFile.save(gzipSync(randomBytes(1024)));
  console.log('uploaded - setupCase2');
}

async function setupCase3() {
  // random data, gzipped on upload
  await testFile.save(randomBytes(1024), {gzip: true});
  console.log('uploaded - setupCase3');
}

async function returnDecompressed() {
  console.log('reading from stream...');
  await new Promise((resolve, reject) => {
    testFile
      .createReadStream()
      .on('error', reject)
      .pipe(gunzipMaybe())
      .on('data', d => console.log('got data', d.byteLength))
      .on('end', resolve)
      .on('error', reject);
  });
  console.log('finished reading from stream.');
}

async function main() {
  await setupCase1();
  await returnDecompressed();
  await setupCase2();
  await returnDecompressed();
  await setupCase3();
  await returnDecompressed();
  console.log('done');
}

main();
```
Do you have any other information that would help us debug this issue for you?
Hey @neilstuartcraig, do you have any other information that would help us debug this issue for you?
Since my last message, I expanded the test cases and verified a number of additional scenarios:
```ts
import gunzipMaybe = require('gunzip-maybe');
import {Storage} from '@google-cloud/storage';
import {randomBytes} from 'crypto';
import {gzipSync} from 'zlib';

const storage = new Storage();
const bucket = storage.bucket(/* insert bucket name for testing */);
const testFile = bucket.file('test-gzip-object');

async function setupCase1() {
  // random data
  await testFile.save(randomBytes(1024));
  console.log('uploaded - setupCase1');
}

async function setupCase2() {
  // random data, upload pre-gzipped
  await testFile.save(gzipSync(randomBytes(1024)));
  console.log('uploaded - setupCase2');
}

async function setupCase3() {
  // random data, gzipped on upload
  await testFile.save(randomBytes(1024), {gzip: true});
  console.log('uploaded - setupCase3');
}

async function setupCase4() {
  // random data, upload pre-gzipped w/ Content-Encoding
  await testFile.save(gzipSync(randomBytes(1024)), {
    metadata: {
      contentEncoding: 'gzip',
    },
  });
  console.log('uploaded - setupCase4');
}

async function setupCase5() {
  // random data, upload pre-gzipped w/ Content-Encoding + Content-Type
  await testFile.save(gzipSync(randomBytes(1024)), {
    metadata: {
      contentEncoding: 'gzip',
    },
    contentType: 'application/gzip',
  });
  console.log('uploaded - setupCase5');
}

async function setupCase6() {
  // multiple buffers of random data, concat & uploaded pre-gzipped
  await testFile.save(
    Buffer.concat([gzipSync(randomBytes(1024)), gzipSync(randomBytes(1024))]),
    {
      metadata: {
        contentEncoding: 'gzip',
      },
    }
  );
  console.log('uploaded - setupCase6');
}

async function setupCase7(list = ['']) {
  // list of strings to concat, concat & uploaded pre-gzipped
  const data = Buffer.concat(list.map(str => gzipSync(str)));
  await testFile.save(data, {
    metadata: {
      contentEncoding: 'gzip',
    },
  });
  console.log('uploaded - setupCase7');
}

async function returnDecompressed() {
  console.log('reading from stream...');
  let buffer = Buffer.alloc(0);
  await new Promise((resolve, reject) => {
    testFile
      .createReadStream()
      .on('error', reject)
      .pipe(gunzipMaybe())
      .on('data', d => {
        console.log('got data', d.byteLength);
        buffer = Buffer.concat([buffer, d]);
      })
      .on('end', resolve)
      .on('error', reject);
  });
  console.log('finished reading from stream.');
  return buffer;
}

async function main() {
  await setupCase1();
  await returnDecompressed();
  await setupCase2();
  await returnDecompressed();
  await setupCase3();
  await returnDecompressed();
  await setupCase4();
  await returnDecompressed();
  await setupCase5();
  await returnDecompressed();
  await setupCase6();
  await returnDecompressed();
  const list = ['a', 'b'];
  await setupCase7(list);
  const results = await returnDecompressed();
  console.log(list, results.toString());
  console.log('done');
}

main();
```
A diff link for reference: https://github.com/googleapis/nodejs-storage/compare/v6.4.0...v6.4.1
Thanks so much for digging in to this @danielbankhead - I really appreciate it.
I am not really sure what I can provide which might help. I could give you source code, but it won't work in isolation - you'd have to set up the whole environment in Terraform, which is no small undertaking!
Would it help if I provided the source code and some data format examples? For the data, I am mainly thinking the format that comes in from the Eventarc/PubSub HTTP push as a result of object.finalize in GCS - I guess you're probably pretty familiar with that.
I could try deploying 6.4.1 again I guess, I'm just hesitant as this is a production system which processes our www and media CDN access logs so it's our eyes on the world to a fair extent.
For ref, here's what happened last time:
Overall log processing status - the purple dashed line indicates the time we upgraded to 6.4.1, the right-hand dashed yellow line indicates the rollback. The upgrade to 6.4.1 was the only change (see GitHub screengrab below it):

[Image: log processing status graph showing the upgrade and subsequent rollback]

[Image: GitHub screengrab showing the 6.4.1 upgrade as the only change]
Let me know if you think the source code and/or re-trying the upgrade would help. Cheers
Thanks! And no problem - no need to share source code or sensitive information 🙂
Other places to check:

- Did any other dependencies in your dependency tree change between the two versions for you? This can be assessed via `npm ls --all` or by comparing `package-lock.json` before and after. I wonder if removing `pumpify` between releases allowed another package to pick a different version of `pumpify`.
- Regarding:

  > The processing time per file also jumped massively (by an order of magnitude or 2).

  Are there any logs or performance hooks you could use to determine where in the pipeline things are getting slower? `console.time`, `process.hrtime.bigint`, or something from the Performance measurement APIs (`perf_hooks`) could help - there's a rough sketch after this list.
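For example, a minimal sketch of wrapping a stage with `process.hrtime.bigint()` (the helper and label here are hypothetical, just to show the pattern):

```ts
// Hypothetical helper: time any async stage and log how long it took
async function timedStage<T>(label: string, stage: () => Promise<T>): Promise<T> {
  const start = process.hrtime.bigint();
  try {
    return await stage();
  } finally {
    const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
    console.log(`${label} took ${elapsedMs.toFixed(1)} ms`);
  }
}

// Usage, e.g. around the read + gunzip step from the sample above:
// await timedStage('read+gunzip', () => returnDecompressed());
```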
I'm on leave at the moment but just had a look at the Dependabot PR and the `package-lock.json` diff looks like this:

[Image: `package-lock.json` diff from the Dependabot PR]
So yeah, from a quick glance it looks like `pumpify` was affected. I am not familiar enough with the details to know for sure but maybe you are? I can get you the before/after raw files if that helps.
I don't have any custom timings for the app because they're expensive to store - the app processes several billion log lines per day on a modest budget, so I have to limit what I log, unfortunately.
Cheers!
A few other places to check:
- How are the objects uploaded? Is it possible somehow some objects were legitimately gzipped incorrectly (and we're looking at a symptom of another bug)? There's a quick way to sanity-check a sample object sketched after this list.
- Are there any other variables different between your pre-live and live environments (that you can share)?
- Is it possible any other software updates were included in the deploy? Say, any subtle dependency changes that were included in the deployed container?
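On the first point, here's a minimal sketch of sanity-checking whether a given object is valid gzip end-to-end (the function and bucket/object names are hypothetical):

```ts
import {Storage} from '@google-cloud/storage';
import {gunzipSync} from 'zlib';

// Hypothetical helper: download an object and attempt to gunzip it in memory
async function checkGzipIntegrity(bucketName: string, objectName: string) {
  const storage = new Storage();
  // download() resolves to [Buffer]; note the client transparently decompresses
  // objects stored with Content-Encoding: gzip, so this targets objects stored
  // as raw .gz payloads (as in the log pipeline described above)
  const [contents] = await storage.bucket(bucketName).file(objectName).download();
  try {
    const decompressed = gunzipSync(contents);
    console.log(`${objectName}: OK, decompressed to ${decompressed.byteLength} bytes`);
  } catch (e) {
    console.error(`${objectName}: not valid gzip (or truncated)`, e);
  }
}

// e.g. checkGzipIntegrity('example-log-bucket', 'access.log.gz');
```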
Additionally, does your company have a support contract with Google? If so, we have an opportunity to dive further with additional tools & resources.
> How are the objects uploaded? Is it possible somehow some objects were legitimately gzipped incorrectly (and we're looking at a symptom of another bug)?
Deploys to Cloud Run are done via GitHub -> Cloud Build (Terraform) -> Cloud Run. Hopefully this means things should be working as intended by Google Cloud.
> Are there any other variables different between your pre-live and live environments (that you can share)?
Only some peripheral elements such as log verbosity and number of logs. The app code itself is identical (assuming we're not mid-deploy with dev being ahead of prod as we roll out).
> Is it possible any other software updates were included in the deploy? Say, any subtle dependency changes that were included in the deployed container?
I don't think so, the Dockerfile is identical and I lock npm dependency versions to exact (rather than semver range) to make deploys reproducible.
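For context, the relevant part of `package.json` uses exact versions rather than semver ranges, something like this (versions illustrative):

```json
{
  "dependencies": {
    "@google-cloud/storage": "6.4.0",
    "gunzip-maybe": "1.4.2"
  }
}
```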
Feels like it might be what you suggested previously, a quirk of dependency chains. I am wondering if I ought to try the upgrade again and keep an eye on it so I can roll back quickly if needed. Not sure there's much of a way to debug without a super deep dive.
> Additionally, does your company have a support contract with Google? If so, we have an opportunity to dive further with additional tools & resources.
Yes, we do. I think I'm right in saying we have enterprise support. Happy to engage via that route if it's preferable.
Thanks again for all your time and help with this.
Thanks for the additional information @neilstuartcraig, I believe https://github.com/googleapis/nodejs-storage/issues/2055 is related to the issue you are facing.
I will do some more digging and resolve this for you.
Hey @neilstuartcraig, thanks for your patience - we were finally able to reproduce the issue and a fix has been merged. The next release will be published shortly:

- [chore(main): release 6.4.2 #2050](https://github.com/googleapis/nodejs-storage/pull/2050)

To better mitigate and catch issues like this in testing going forward, we're planning a revamp of our stream implementation in the next major release:

- [refactor(deps)!: Remove `duplexify` dependency + Revamp Stream Implementation #2041](https://github.com/googleapis/nodejs-storage/issues/2041)
Sorry for the delay in replying - busy times! I rolled this out to the dev and prod environments a few days back and it looks good. Thanks so much for your work and time on this, I really appreciate it.