v8.7.1 transaction freeze
Prerequisites
- [x] I have written a descriptive issue title
- [x] I have searched existing issues to ensure the bug has not already been reported
Mongoose version
v8.7.1
Node.js version
20.16.0
MongoDB server version
Atlas Latest
Typescript version (if applicable)
5.6.3
Description
const session = await this.jobModel.db.startSession(); this code freezes on v8.7.1, but works fine on v8.5.1
Steps to Reproduce
try to start session
Expected Behavior
no freezes
Can you please provide a more complete repro script, like with how you're instantiating your model and where you're calling startSession()? We have transaction tests that passed on 8.7.1 release so the issue isn't as simple as "startSession() is broken in all cases on 8.7.1"
This issue is stale because it has been open 14 days with no activity. Remove stale label or comment or this will be closed in 5 days
This issue was closed because it has been inactive for 19 days and has been marked as stale.
I'm experiencing the same issue. We have a test suite in which we experience this issue but each test passes in isolation so the code is difficult to reproduce. Downgrading to 8.5.5 did solve the problem for me.
I can share some debug logs. We do drop all our collections between each unit test. We use the mongodb driver to drop the collection (in the after each hook) and then use mongoose to start a session at which point the process seems to freeze. I'm trying to create a reproducible example.
MongoMS:MongoInstance Mongo[40299]: stdoutHandler: ""{"t":{"$date":"2024-12-31T11:01:54.979+02:00"},"s":"I", "c":"COMMAND", "id":518070, "ctx":"conn32","msg":"CMD: drop","attr":{"namespace":"test.orders"}}"" +17ms
MongoMS:MongoInstance Mongo[40299]: stdoutHandler: ""{"t":{"$date":"2024-12-31T11:01:54.979+02:00"},"s":"I", "c":"COMMAND", "id":518070, "ctx":"conn25","msg":"CMD: drop","attr":{"namespace":"test.users"}}"" +0ms
MongoMS:MongoInstance Mongo[40299]: stdoutHandler: ""{"t":{"$date":"2024-12-31T11:01:54.979+02:00"},"s":"I", "c":"STORAGE", "id":20314, "ctx":"conn32","msg":"dropCollection: storage engine will take ownership of drop-pending collection","attr":{"namespace":"test.orders","uuid":{"uuid":{"$uuid":"424adbb1-674c-4723-9793-31d7afb59a28"}},"dropOpTime":{"ts":{"$timestamp":{"t":0,"i":0}},"t":-1},"commitTimestamp":{"$timestamp":{"t":0,"i":0}}}}"" +0ms
MongoMS:MongoInstance Mongo[40299]: stdoutHandler: ""{"t":{"$date":"2024-12-31T11:01:54.979+02:00"},"s":"I", "c":"STORAGE", "id":20314, "ctx":"conn25","msg":"dropCollection: storage engine will take ownership of drop-pending collection","attr":{"namespace":"test.users","uuid":{"uuid":{"$uuid":"5903f952-7054-47a9-8606-04534e698af1"}},"dropOpTime":{"ts":{"$timestamp":{"t":0,"i":0}},"t":-1},"commitTimestamp":{"$timestamp":{"t":0,"i":0}}}}
MongoMS:MongoInstance {"t":{"$date":"2024-12-31T11:01:54.979+02:00"},"s":"I", "c":"STORAGE", "id":20318, "ctx":"conn32","msg":"Finishing collection drop","attr":{"namespace":"test.orders","uuid":{"uuid":{"$uuid":"424adbb1-674c-4723-9793-31d7afb59a28"}}}}
MongoMS:MongoInstance {"t":{"$date":"2024-12-31T11:01:54.979+02:00"},"s":"I", "c":"STORAGE", "id":20318, "ctx":"conn25","msg":"Finishing collection drop","attr":{"namespace":"test.users","uuid":{"uuid":{"$uuid":"5903f952-7054-47a9-8606-04534e698af1"}}}}"" +0ms
MongoMS:MongoInstance Mongo[40299]: stdoutHandler: ""{"t":{"$date":"2024-12-31T11:01:54.979+02:00"},"s":"I", "c":"STORAGE", "id":22206, "ctx":"conn32","msg":"Deferring table drop for index","attr":{"index":"_id_","namespace":"test.orders","uuid":{"uuid":{"$uuid":"424adbb1-674c-4723-9793-31d7afb59a28"}},"ident":"index-548--1581127540233199531","commitTimestamp":{"$timestamp":{"t":1735635714,"i":66}}}}
MongoMS:MongoInstance {"t":{"$date":"2024-12-31T11:01:54.979+02:00"},"s":"I", "c":"STORAGE", "id":22214, "ctx":"conn32","msg":"Deferring table drop for collection","attr":{"namespace":"test.orders","ident":"collection-547--1581127540233199531","commitTimestamp":{"$timestamp":{"t":1735635714,"i":66}}}}"" +0ms
MongoMS:MongoInstance Mongo[40299]: stdoutHandler: ""{"t":{"$date":"2024-12-31T11:01:54.979+02:00"},"s":"I", "c":"STORAGE", "id":22206, "ctx":"conn25","msg":"Deferring table drop for index","attr":{"index":"_id_","namespace":"test.users","uuid":{"uuid":{"$uuid":"5903f952-7054-47a9-8606-04534e698af1"}},"ident":"index-546--1581127540233199531","commitTimestamp":{"$timestamp":{"t":1735635714,"i":67}}}}
MongoMS:MongoInstance {"t":{"$date":"2024-12-31T11:01:54.979+02:00"},"s":"I", "c":"STORAGE", "id":22214, "ctx":"conn25","msg":"Deferring table drop for collection","attr":{"namespace":"test.users","ident":"collection-545--1581127540233199531","commitTimestamp":{"$timestamp":{"t":1735635714,"i":67}}}}"" +0ms
...
MongoMS:MongoInstance Mongo[40299]: stdoutHandler: ""{"t":{"$date":"2024-12-31T11:01:55.365+02:00"},"s":"I", "c":"TXN", "id":51802, "ctx":"conn32","msg":"transaction","attr":{"parameters":{"lsid":{"id":{"$uuid":"d655c7e5-2603-4582-9eb2-e8763763c49c"},"uid":{"$binary":{"base64":"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=","subType":"0"}}},"txnNumber":66,"txnRetryCounter":0,"autocommit":false,"readConcern":{"level":"local","provenance":"implicitDefault"}},"readTimestamp":"Timestamp(0, 0)","keysExamined":8,"docsExamined":26,"ninserted":5,"keysInserted":5,"terminationCause":"committed","timeActiveMicros":2308,"timeInactiveMicros":103305,"numYields":0,"locks":{"FeatureCompatibilityVersion":{"acquireCount":{"w":1}},"ReplicationStateTransition":{"acquireCount":{"w":3}},"Global":{"acquireCount":{"w":1}},"Database":{"acquireCount":{"w":2}},"Collection":{"acquireCount":{"w":6}},"Mutex":{"acquireCount":{"r":54}}},"storage":{},"wasPrepared":false,"durationMillis":105}}"" +120ms
MongoMS:MongoInstance Mongo[40299]: stdoutHandler: ""{"t":{"$date":"2024-12-31T11:02:02.735+02:00"},"s":"I", "c":"EXECUTOR", "id":6983000, "ctx":"conn3","msg":"Slow SessionWorkflow loop","attr":{"elapsed":{"totalMillis":10005,"activeMillis":10005,"receiveWorkMillis":0,"processWorkMillis":10005,"sendResponseMillis":0,"finalizeMillis":0}}}"" +60d
@t1bb4r can you please provide some code samples that show exactly what you mean by "We use the mongodb driver to drop the collection (in the after each hook) and then use mongoose to start a session"?
@vkarpov15 I was finally able to reproduce this issue after trying out mongoose 8.10. Here is a link to the code that I've been using to reproduce the issue, https://github.com/t1bb4r/mongoose-issue-14949/blob/main/test.js.
In mongoose 8.10, instead of my tests freezing I got a new error "Connection operation buffering timed out after 10000ms". With this new error I noticed that manipulating the clock with sinon was the root cause of the problem. Adding sinon useFakeTimer to my test was the missing piece all along.
The shared test file results with different mongoose versions: 8.6.0 works 8.7.0 - 8.9 tests simply timeout after freezing on startSession 8.10 "Connection operation buffering timed out after 10000ms" - but only after 5 or 6 tests, not immediately. How many tests pass seems to different on different machines.
After some investigation I suspect that these heartbeat changes do not play well with sinon fake timers, maybe this PR https://github.com/Automattic/mongoose/pull/14812? Now that I've finally started smelling the smoke I can investigate a bit further and will let you know if I find something.
I think that I understand the issue a lot better now after some reading. I can explain the issue that I'm having by focusing on this piece of code:
if (
this._readyState === STATES.connected &&
this._lastHeartbeatAt != null &&
typeof this.client?.topology?.s?.description?.heartbeatFrequencyMS === 'number' &&
Date.now() - this._lastHeartbeatAt >= this.client.topology.s.description.heartbeatFrequencyMS * 2) {
return STATES.disconnected;
}
In my case the test suites will run, and most tests will finish in under 30 seconds so heartbeat will be null for the full duration of the test and the issue will not appear. The issue only appears after the first heart beat sets the date, and then date is then modified by sinon causing it to be larger than heartbeatFrequencyMS * 2 resulting in a disconnected state which then causes the buffer timeout.
At the end of the day it is not a mongoose issue, but it's difficult to manipulate the tests and timers to get this working. Could we opt out of this check in our testing environment since the connection is still alive?
It is worth mentioning that we recommend not using useFakeTimers(). Stubbing out time can mix up both MongoDB and Mongoose's connection monitoring, which both rely on heartbeat intervals and checking how much time has elapsed.
That being said, the easiest way to work around this issue is to disable bufferCommands: await mongoose.connect(mongoUri, { bufferCommands: false });. Mongoose will still report readyState = 'disconnected', but it won't buffer any commands.