Question about snapshot support
Problem
I need snapshot support because some domain models have a large number of events and a long lifecycle. Initially, I tried splitting the models into 'daily models'. This solved the long-lifecycle problem but added complex and fragile day-closing/opening logic, which is not business logic but a technical workaround. Also, some instances of these models can still be heavily loaded during the day, so the problem is only partially solved.
I would like to add support for 'Rolling snapshots', but before that, I want to discuss the implementation to make sure it aligns with your vision and that the PR will be accepted. I have seen several issues related to snapshots, and in some cases, changes were even suggested, but they were rejected. Also, these issues are quite old, and the perspective may have changed by now.
Here's how I see the implementation
It is assumed that rolling snapshots are stored as events in the stream along with the rest of the domain model events.
The first thing we need to do is somehow distinguish snapshot events from other events. Here, I suggest adding an attribute HasRollingSnapshots where you can pass a list of event types that are snapshots. This attribute should be applied to the State type.
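To make the idea concrete, here is a minimal sketch of what such an attribute and its lookup might look like. The exact shape of HasRollingSnapshots is an assumption on my part, and the BookingState, RoomBooked, and BookingSnapshot types, as well as the SnapshotTypes helper, are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Assumed shape of the proposed attribute: it lists the event types that act as snapshots.
[AttributeUsage(AttributeTargets.Class)]
public sealed class HasRollingSnapshotsAttribute : Attribute {
    public HasRollingSnapshotsAttribute(params Type[] snapshotTypes) => SnapshotTypes = snapshotTypes;

    public IReadOnlyList<Type> SnapshotTypes { get; }
}

// Hypothetical event and state types, not taken from the library.
public record RoomBooked(string RoomId);
public record BookingSnapshot(string RoomId, int NightsBooked);

[HasRollingSnapshots(typeof(BookingSnapshot))]
public record BookingState { }

public static class SnapshotTypes {
    // Returns the snapshot event types declared on a state type,
    // or an empty list when the attribute is absent.
    public static IReadOnlyList<Type> For(Type stateType) =>
        stateType.GetCustomAttribute<HasRollingSnapshotsAttribute>()?.SnapshotTypes
        ?? Array.Empty<Type>();
}
```

A state type without the attribute yields an empty list, so loading code could fall back to a normal full read.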
Next, in the LoadState and LoadAggregate methods, we try to get the HasRollingSnapshots attribute and pass a list of types to the ReadStream method. If the list contains at least one type, ReadStream starts reading the event stream from the end until it finds an event whose type matches one of the types in the list. Also, ReadStream does not return events that come before the snapshot event.
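The backward read can be sketched over an in-memory list standing in for the stream. This is only an illustration of the cut-off logic, not the real ReadStream: RollingSnapshotReader, ReadFromLastSnapshot, Deposited, and BalanceSnapshot are hypothetical names, and a real implementation would read from the store in reverse pages instead of holding the whole stream in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical event types, not taken from the library.
public record Deposited(int Amount);
public record BalanceSnapshot(int Balance);

public static class RollingSnapshotReader {
    // Walks the events from the end of the stream and stops at the most recent snapshot.
    // Returns that snapshot followed by every later event, in stream order.
    // Returns the whole stream when no snapshot types are given or no snapshot is found.
    public static IReadOnlyList<object> ReadFromLastSnapshot(
        IReadOnlyList<object> stream,
        IReadOnlyCollection<Type> snapshotTypes
    ) {
        if (snapshotTypes.Count == 0) return stream;

        var result = new List<object>();
        for (var i = stream.Count - 1; i >= 0; i--) {
            var evt = stream[i];
            result.Add(evt);
            // The snapshot itself is kept; everything before it is skipped.
            if (snapshotTypes.Contains(evt.GetType())) break;
        }

        result.Reverse();
        return result;
    }
}
```

The state can then be folded from the returned events as usual, with the snapshot event applied first.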
The question remains of how the model decides when to take a snapshot. It seems to have everything it needs for this, though. For aggregates, there is the Original field, which lets you check the number of events or the time elapsed since the last snapshot. For a model without an aggregate, the list of events is also passed to the decider function (possibly this is precisely why they were added).
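As a rough sketch of that check, the model could count the events loaded after the last snapshot and compare the total against a threshold. The list below stands in for the aggregate's Original events; SnapshotDecision, ItemAdded, CartSnapshot, and the threshold parameter are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event types, not taken from the library.
public record ItemAdded(string Sku);
public record CartSnapshot(int ItemCount);

public static class SnapshotDecision {
    // Counts the loaded events that come after the most recent snapshot.
    // When there is no snapshot, every loaded event is counted.
    public static int EventsSinceLastSnapshot(IReadOnlyList<object> original, Func<object, bool> isSnapshot) {
        var count = 0;
        for (var i = original.Count - 1; i >= 0 && !isSnapshot(original[i]); i--) count++;
        return count;
    }

    // True when the already stored events plus the pending ones reach the threshold.
    public static bool ShouldSnapshot(
        IReadOnlyList<object> original,
        Func<object, bool> isSnapshot,
        int pendingChanges,
        int threshold
    ) => EventsSinceLastSnapshot(original, isSnapshot) + pendingChanges >= threshold;
}
```

A time-based check would follow the same pattern, provided the snapshot event carries a timestamp.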
Hey, I am working on a different version of snapshotting, still thanks for the PR.
A question I have after a brief look: do we need a separate type map? The existing map is quite flexible and doesn't care what it maps, so I think it should be enough. There's a source generator for the type map, and it would be hard to maintain two maps.
@alexeyzimarev, hi, to me the existing TypeMap looks like it is designed specifically to store the mapping 'CLR event type -> logical event type', while snapshots required a mapping 'CLR state type -> CLR event types'. At the time, it seemed better not to mix the two in one map. But overall, yes, it seems there's nothing stopping us from turning TypeMap into a metadata store for events and adding 'indexes' to search for metadata by logical event name / CLR event type / CLR state type.
I made a separate source generator because the mapping from snapshot types to state types was moved to the Snapshots attribute, which is applied to the state. Yes, it does seem that it would have been better to add a field in the EventType attribute containing the state types and slightly modify the existing generator.
Is it possible to see another version of snapshotting that you are talking about somewhere, and somehow help make it available for public use faster? I assume it’s snapshotting using external storage (including a separate stream in KurrentDB). In that case, is it possible to keep both strategies?
Ok, got it. Makes sense. About "my version" - I don't have code yet, just design thoughts. The Snapshots feature is basically the only feature that I want to add before releasing 1.0, and the demand is high.
The thoughts are:
- Snapshotting belongs to persistence (same as you did)
- Also thought about an attribute
- Allow using different strategies. What comes to mind is count (like every 100) and event type (like DayClosed)
- Separate contract as state record can have properties that aren't deserializable (complex value objects) (same as you did)
- State.When would apply the snapshot pretty much as any other event
- Allow different persistence patterns
  - Same stream
  - Different stream
  - Different store (can be mutable)
- LoadState would transparently use snapshots, no code changes required in services, only the wiring
- Same for persisting snapshots
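A minimal sketch of how the strategy and persistence options from this list might be expressed, assuming nothing about the final API. Every name here (SnapshotStorage, ISnapshotStrategy, EveryNEvents, OnEventType, PaymentRecorded) is hypothetical, and DayClosed is modelled as a simple record only for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical: where snapshots are persisted.
public enum SnapshotStorage { SameStream, SeparateStream, SeparateStore }

// Hypothetical: decides whether a snapshot should be taken for the current append.
public interface ISnapshotStrategy {
    bool ShouldSnapshot(int eventsSinceLastSnapshot, IReadOnlyList<object> newEvents);
}

// Count-based strategy, for example every 100 events.
public sealed class EveryNEvents : ISnapshotStrategy {
    readonly int _threshold;

    public EveryNEvents(int threshold) => _threshold = threshold;

    public bool ShouldSnapshot(int eventsSinceLastSnapshot, IReadOnlyList<object> newEvents)
        => eventsSinceLastSnapshot + newEvents.Count >= _threshold;
}

// Event-type strategy: snapshot whenever an event of type T is appended.
public sealed class OnEventType<T> : ISnapshotStrategy {
    public bool ShouldSnapshot(int eventsSinceLastSnapshot, IReadOnlyList<object> newEvents) {
        foreach (var evt in newEvents) {
            if (evt is T) return true;
        }
        return false;
    }
}

// Hypothetical events used to exercise the strategies.
public record PaymentRecorded(decimal Amount);
public record DayClosed(DateTime Day);
```

Keeping the strategy separate from the storage choice would let the same count or event-type rule work with any of the three persistence patterns.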
I think that next week I will be able to make the necessary changes to the PR so that it takes into account all the specified points.
From a usage perspective, it won't be much different from what already exists. A few things will change:
- A field for specifying the storage strategy will be added to the Snapshots attribute
- The Command Service will gain the ability to specify an "interceptor/strategy" that can decide whether to emit a snapshot event (a ready-made strategy based on the number of events can be provided here; the rest seem to require explicit user definition, although the list of ready-made strategies can be expanded in the future)
- It will be necessary to register the ISnapshotStore implementation in DI when using external storage
Hi @alexeyzimarev, I’ve made changes taking the points above into account. The Snapshots attribute now accepts a snapshot storage strategy (all three options are supported). In the Command Service constructor you can call the UseSnapshotStrategy method, and, as before, it’s still possible to return a snapshot event from aggregate/model methods. Implementations for PostgreSQL, SQL Server, MongoDB, and Redis storage have also been added. The sample application has been updated to work with any of these storage options (just uncomment the one you need in Program.cs).
Cool, I will start looking at it over the weekend.