
Add ability to use watch without an at option to fire events

Open sirwolfgang opened this issue 8 years ago • 7 comments

Currently, the update flow checks whether the watch or at fields were changed, but then only enqueues a worker if the at field is in the future. This makes it basically impossible, as far as I can tell, to trigger an immediate event based on a change.

This change makes it possible to fire an event based on any watch that does not include an at. That in turn makes it possible to build an on-update hook with publishes :updated, watch: :updated_at.

This should have no effect on existing usage while allowing a new range of behavior. Minor change.
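The intended decision can be sketched in plain Ruby (a hypothetical `enqueue_action` helper standing in for the gem's internals — illustrative names only, not the actual implementation):

```ruby
# Hypothetical sketch of the proposed enqueue decision:
# - an :at in the future schedules a worker for later (existing behavior)
# - a watched change with no :at fires immediately (the new behavior)
# - otherwise nothing is enqueued
def enqueue_action(watched_changed:, at: nil, now: Time.now)
  if at
    at > now ? :schedule_for_later : :skip
  elsif watched_changed
    :fire_now
  else
    :skip
  end
end

enqueue_action(watched_changed: true)                      # => :fire_now
enqueue_action(watched_changed: true, at: Time.now + 3600) # => :schedule_for_later
```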

sirwolfgang avatar Mar 29 '18 21:03 sirwolfgang

@sirwolfgang this is awesome!

First, I do regret that you've gotten no response until now, but I've had some other things to attend to recently. 👶 😃 I'm back and ready to rock though.

Next, I'm really digging the rubocop cleanup. Thank you for doing this. 👍 👍 Can you split that out into a second PR?

Last, the primary topic sounds great too. I really appreciate you catching this and fixing it.

How much are you using the Reactor::Publishable api in your app with publishes :...? If it's still in early usage for you, would you be willing to consider an alternate paradigm? (can talk it out here/somewhere/anywhere)

winfred avatar May 17 '18 16:05 winfred

something like hired/chron that polls the DB instead of relying on Sidekiq's ScheduledSet

class Course < ActiveRecord::Base
  include Chron::Observable

  at_time :close_at do
    do_anything_for_this_resource
    publish :some_event
    # self == self.class.find(id)
    # self is an instance whose :close_at is now
    # this block gets fired at the :close_at time
  end
end

winfred avatar May 17 '18 16:05 winfred

@wnadeau I'm not sure I understand the problem you're trying to avoid or fix by suggesting this alternate usage pattern. I see a lot of value in having my async, callback-like system be homogeneous. Our usage of this system has been evolving on our master branch, though I am currently working on some bugs I seem to have introduced, with plans to extend it with some additional flexible priority-based scheduling.

sirwolfgang avatar May 17 '18 17:05 sirwolfgang

Aha, I'm really glad we're talking. Your point re: the value of homogeneity makes a lot of sense. Sometimes I have a hard time telling the difference between 'good separation of concerns' and 'hey this belongs in here', though I had come to the (possibly erroneous) conclusion that this was a good separation of concerns after using this in production for a few years. Would love to hear more about your rationale.

And yes you're right, I should better describe the underlying problem. (am going to use bold for optional quick-reading)

The Problem

The biggest pain point with Sidekiq::ScheduledSet and Reactor::Publishable shows up when refactoring code. If one renames things like the event, the class that contains the publishes, or the column being watched, then the Sidekiq::ScheduledSet jobs need to be modified in redis as well. (The current implementation stores hard references in redis to meta-programmed classes.) Note the brute-force implementation of #reschedule, with its own performance implications borne out in #82.

It also feels strange having to replicate the state of "when" in both the database (the watch/at columns) and in redis. Especially as the number of publishes usages goes up in a codebase, the amount of state being synced between database and redis grows with it, and so do the ways that state could get out of whack.

With even a medium-sized team, it's been particularly hard to remember to change that redis state when refactoring code. (It's also hard to automate.)

A Potential Solution

By inverting the paradigm (using a poller job instead of ScheduledSet), we separate the Reactor concerns from the "temporal" concerns and seemingly get a less error-prone system in production. But this is all unpracticed theory that we at hired/hired are also wondering about (perhaps even 'unpaid tech debt'), because in the end we have ~5 pretty basic auxiliary usages, so it hasn't been a priority (compared to your seemingly more business-critical usage).

Open Questions

  • What is the right underlying implementation? (redis vs. db-poller)
  • What is the right interface/API? (publishes :whatever, at: :whenever vs. at_time(:whenever) { publish :whatever })
  • Where should we put this? (improve Reactor::Publishable vs. remove-and-use-another-gem)
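For the db-poller side of the first question, a minimal plain-Ruby sketch (an in-memory collection standing in for a query like `Model.where("close_at <= ?", Time.now)` — all names here are illustrative, not a proposed API):

```ruby
Record = Struct.new(:id, :close_at, :fired)

# Fire the block once for each record whose time has arrived; a real
# poller would run this as a recurring job against the database.
def poll_due(records, now: Time.now)
  records.reject(&:fired)
         .select { |r| r.close_at <= now }
         .each do |r|
           r.fired = true   # mark so the next poll skips it
           yield r
         end
end

records = [Record.new(1, Time.now - 60, false),
           Record.new(2, Time.now + 60, false)]
poll_due(records) { |r| puts "firing at_time block for record #{r.id}" }
# only record 1 fires; record 2 isn't due yet
```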

winfred avatar May 17 '18 18:05 winfred

Oh right, we had a conversation about this (maybe not a great conversation) in #21

Between #82 #67 #53 as well, I feel like something must be here, but it'll take time/research to sort out the details.

winfred avatar May 17 '18 18:05 winfred

I tend to draw the separation of concerns along the interface, or the API that I want to have. In this case, I want any timed event (which is effectively a callback) to trigger async calls. Whether the time is in the future or right now, as long as I am tracking a change artifact I want the same interface, which is why I made the original change. It seems incongruent to me that watching 1.hour.from_now behaves differently than watching Time.current in terms of an interface.

I have spent a lot of time working on async multi-step and batch-based worker processes, and my philosophy around managing them comes down to doing work when you can. My ideal approach: the moment you know you want a job at a time, you enqueue/schedule that job. When that time (or priority) changes, you enqueue a new job with the new settings. Then I run the winner, whichever job happens first, and early-exit the losers.
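The winner/loser idea can be sketched in plain Ruby (hypothetical `run_at`/`scheduled_at` names; a real job would reload the record from the database at perform time):

```ruby
Reminder = Struct.new(:run_at)   # stands in for a persisted record

Job = Struct.new(:record, :scheduled_at) do
  # At perform time, check whether this job is still the "winner":
  # if the record's run_at changed since we were enqueued, a newer
  # job owns the work, so exit early without doing anything.
  def perform
    return :early_exit unless record.run_at == scheduled_at
    :performed
  end
end

reminder = Reminder.new(Time.now + 60)
stale  = Job.new(reminder, reminder.run_at)  # enqueued first
reminder.run_at = Time.now + 120             # time changes: enqueue again
winner = Job.new(reminder, reminder.run_at)

stale.perform   # => :early_exit
winner.perform  # => :performed
```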

The downside of this is that if you're trying to push a job back to run later, you need to find and kill the older members in the queue (or, if you know enough about when you want to run, early-exit those as well). But I find this case to be on the rarer side; normally I need to get work done faster, not slower.

With this, transitional states such as refactoring can be an issue because redis is out of date. I also have zero expectation that I will remember to update redis entries to migrate over. In that case, I tend to act the same way I would with zero-downtime database deployments, which means operating partly in an intermediate state. Here, that would mean keeping the old event receiver and having it pass through to the new event until all old enqueues are exhausted, basically acting as a connector/translator between the states.
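That intermediate state can be sketched with a toy event bus (entirely hypothetical names and dispatcher — just to show the old event name forwarding to the new handler while stale enqueues drain):

```ruby
HANDLERS = {}

def subscribe(event, &block)
  HANDLERS[event] = block
end

def fire(event, *args)
  HANDLERS.fetch(event).call(*args)
end

# New handler under the new event name.
subscribe(:course_closed) { |id| "closed course #{id}" }

# Old name kept as a pass-through connector/translator; once all old
# enqueues referencing :course_ended are exhausted, delete this line.
subscribe(:course_ended) { |id| fire(:course_closed, id) }

fire(:course_ended, 7)   # => "closed course 7"
```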

This strategy has a few things to note, one of them being that I basically try to avoid polling (other than what sidekiq's scheduler already handles). Part of this has to do with work/load balance. If you're not processing jobs fast enough, for whatever reason, the time between a job being pushed onto the active queue and being processed can become greater than the time between polls, which resurfaces the out-of-sync problem if that desync spiral continues to grow. (And that's before considering the polling look-ahead strategy/trade-offs.) By enqueuing upfront and then just exiting unnecessary work, I find you get to spread the workload fairly evenly instead of having periodic spikes and potential desyncing events.

sirwolfgang avatar May 17 '18 19:05 sirwolfgang

Just a note that I ran into this just now. My expectation would be that I can fire off an event from updating a model the same way I can from creating one - I see reactor as a pub/sub system first and a scheduling agent second, so this behavior was surprising to me.

I'm planning on pulling in the relevant parts of this PR as a monkeypatch for now.

davekincade avatar Feb 18 '20 21:02 davekincade