It's not safe to edit planning solution in best solution consumer
Describe the bug Best solution consumer of solver manager is fired before deep clone of the planning solution has been completed. This lead to problems if you need to post-process the data returned from timefold solver. I recognized it because timefold throws in some cases IllegalStateExecption with "Verify the consistency of your input problem for that bi-directional relationship." message if I manually change content of bi-directional data.
Expected behavior
- Planning Solution gets first deep cloned
- Best Solution consumer is triggered after deep clone is completed. So it is safe to manipulate the Solution passed to best solution consumer.
Actual behavior Planning Solution consumer and deeclone running in different thread and are not synchronized. If you edit a planning solution in the solution consumer it can influence the running planning if the edit took place before deepclone happend.
To Reproduce
- Create a Planning with a PlanningEntity having a bi-directional relationship(invers Shadow Variable)
- Start solving with solver manger and add a best solution consumer
- In best solution consumer change one side of the bi directional relation ship
- Wait for the IllegalStateException. It happens only if bestSolutionConsumer thread is faster than the thread executing the planning
Environment
Timefold Solver Version or Git ref: 1.21.0
Output of java -version: Eclipse Temurin JDK 21.0.6+7-LTS
Output of uname -a or ver: 24.3.0 Darwin Kernel Version 24.3.0: Thu Jan 2 20:24:23 PST 2025; root:xnu-11215.81.4~3/RELEASE_ARM64_T6020 arm64
Additional information
Provide any and all other information which might be relevant to the issue.
Hello @jason076, and thanks for reporting!
What you're saying makes sense - the solution needs to be cloned before the event is fired. In fact, it has been designed like this, and it's worked like this for years. Which makes me question if there could be something else going on as well.
Any chance you could provide a reproducer that we could run on our end?
I will try to find some time to do it. What I already checked is that my deep cloning is working correct. So thats not the issue.
I am still trying to replicate it. What I found until now is that is only happens on phase change between construction heuristic and local search, hence on the first found solution. If I start manipulating the solution after local search started everything is working fine.
I finally could reproduce it. Just run gradle build and gradle run and it should throw the exception.
reproducer.zip
Thank you, @jason076!
I saw that you have an open PR for the issue. Is there any estimation when the fix will be released? Will it take more than two weeks to be available in a new version? It's important for me to know, because if it will be released soon, I will create no workaround for it.
The next release is scheduled for June 10th.
FYI @jason076 We will be reverting the commit we introduced to "fix" this issue.
The fix is not a fix. It prevents users from affecting the solution by changing entites. But it doesn't prevent the user from affecting the solution by changing problem facts, or anything else in the solution - those aren't being cloned at all.
In fact, the "fix" makes the situation worse - it introduces cloning performance overhead, while still requiring users to remember to never touch an intermediate best solution while the solver is running.
We will be updating the docs and Javadocs to say just that - intermediate best solutions can not be modified safely and without side effects. Only the final best solution can be safely modified.