tracks icon indicating copy to clipboard operation
tracks copied to clipboard

Optimise updating a large dependency tree

Open dnrce opened this issue 11 years ago • 17 comments

Migrated from the original issue at https://www.assembla.com/spaces/tracks-tickets/tickets/1453

I have a single todo item that many other todos depend on. Let's call it X. In numbers, there are 42 tasks depending on X. Now, while this is slightly unusual, it should not cause problems.

However, I often experience long delays when I update a todo that somehow is connected to X. Changing the project of a single todo which depends on X from project A to project B just took 67 seconds according to the log file: "Completed 200 OK in 67441ms (Views: 66598.7ms | ActiveRecord: 526.7ms)". My server is not very fast, but changing the project from some todo not related to X just takes 0.3 seconds. In the past I also observed single updates that took about half an hour.

In the log file for the fast change I see this:

Processing by RecurringTodosController#update as JS Parameters: (something) Rendered recurring_todos/_recurring_todo.html.erb (25.1ms) Rendered recurring_todos/update.js.erb (42.3ms) Completed 200 OK in 297ms (Views: 54.8ms | ActiveRecord: 131.8ms)

For the slow change I see:

Processing by TodosController#update as JS Parameters: (something) Read fragment views/todos/3148-20140302144709/20140302/project/true/f6b1f1676905500ed5685760722d24b6 (19.3ms) Write fragment views/todos/3148-20140302144709/20140302/project/true/f6b1f1676905500ed5685760722d24b6 (6.1ms) Rendered todos/_todo.html.erb (177.0ms) Read fragment views/todos/2502-20140302104308/20140302/project/true/f6b1f1676905500ed5685760722d24b6 (23.2ms) Rendered todos/_successor.html.erb (24.6ms) Rendered todos/_successor.html.erb (17.4ms) Rendered todos/_successor.html.erb (17.2ms) Rendered todos/_successor.html.erb (17.6ms) Rendered todos/_successor.html.erb (18.1ms) Rendered todos/_successor.html.erb (47.0ms) Rendered todos/_successor.html.erb (139.2ms) Rendered todos/_successor.html.erb (17.3ms) Rendered todos Rendered todos/update.js.erb (67003.5ms) [...] Completed 200 OK in 67441ms (Views: 66598.7ms | ActiveRecord: 526.7ms)

In the part marked with [...] I see 2673 lines showing "Read fragment views/todos/somethingA/somethingB/project/true/somethingC" where somethingB and somethingC are fixed values. I also see 2674 lines of "Rendered todos/_todo.html.erb". In between I have 33 lines of "Write fragment views/todos/somethingA/somethingB/project/true/somethingC", again with somethingB and somethingC being fixed. Finally, there also are another six lines "Rendered todos/_successor.html.erb".

The problem can easily be reproduced, so feel free to ask me for further information if necessary.

Originally reported by carsten.otto on March 3, 2014 at 11:25:39 (+0530) against version 2.2.2

dnrce avatar Jun 25 '14 16:06 dnrce

On March 3, 2014 at 11:32:20 (+0530), carsten.otto commented:

Sorry, the log file copy-paste was slightly wrong. The line "Rendered todos/update.js.erb (67003.5ms)" appears directly before "Completed 200 OK". The fragment "Rendered todos" (without anything behind that) does not appear in the log file at all, this was a copy-paste-mistake.

dnrce avatar Jun 25 '14 16:06 dnrce

On March 3, 2014 at 19:53:54 (+0530), mattr- commented:

Just to make sure I reproduce it successfully, was 42 just a random number that you pulled out of a hat or is it the exact number of dependent tasks?

dnrce avatar Jun 25 '14 16:06 dnrce

On March 3, 2014 at 20:00:51 (+0530), lrbalt commented:

A todo with multiple dependencies has always been troublesome :-)

If you “touch” a todo or one of its dependencies, all todos are marked touched. This causes the cached rendered partials to be invalidated. You can see this happening in the log Carsten attached. Since all todos are touched, also the todos in de “deferred/pending” container are rerendered. So one todo with 42 dependend todos wil regenerate 42+1=43 times a _todo.html.erb. The _successor.html.erb is the partial that is shown when you open the triangle. So if all 42 are directly dependent on the one root-todo (aka procastination), it will render 42 _successor.html.erb. Things will get more complex when the 42 dependend todos form a deep tree. That would mean various _successors being rendered on _todo partials in the deferred/pending container

What I don’t understand is why a todo with 42 dependend todos causes 2673 renderings. Do you have that many todos on your page? Reading a fragment is retrieving the cached partial without rendering it, so for 2673 todos that is expected behavior

The 33 writes are newly / updated rendered partials that are written to disk.

dnrce avatar Jun 25 '14 16:06 dnrce

On March 3, 2014 at 03:42:15 (+0530), carsten.otto commented:

Matt, it was exactly 42. Right now it is 40, but it still is slow. Reiner, it's not procrastination, but writing a PhD thesis. Sadly, this task is not as easy to accomplish :) I'm getting there, though. As for your understanding issue: I don't really know. The page shows at most 20 items, I'd say. But if it is important, some of the todos leading to "X" (which is completing the whole thesis) represent chapters, most of which also have a large number of incoming/outgoing dependencies. However, in the updated mentioned in the bug report, those todos were not touched directly.

dnrce avatar Jun 25 '14 16:06 dnrce

On March 3, 2014 at 20:18:29 (+0530), lrbalt commented:

I'm curious how you use dependencies. Is the root-todo "write PhD thesis" with "sub-todos" for "write chapter 1" and "write chapter 2" with sub-todos for "read paper A for chapter 1"? In GTD terms, you should consider defining fine-grained projects for large todos :-)

The way you define your dependecy tree could be helpfull to find out where the big numbers come from. Since you only have 20 items, why are there 2673 renderings?

dnrce avatar Jun 25 '14 16:06 dnrce

On March 3, 2014 at 00:35:01 (+0530), carsten.otto commented:

[[file:dJeh0eO88r44oFacwqjQYw]] dump of dependencies table

dnrce avatar Jun 25 '14 16:06 dnrce

On March 3, 2014 at 00:39:36 (+0530), carsten.otto commented:

I added a dump of the dependencies table, I hope that helps. The task "X" has ID 2502.

On the organization of my todos: Organization of my thesis actually is rather nice (I think), with several projects for chapters (or parts of chapters!). However, there is a lot of stuff that I only want to start after I hand in my thesis (for example, reorganizing Track's state model of todos, as I mentioned in some mail or bug report). So while there are some todos I still need to do in order to hand in my thesis, there also is a pile of todos that I'd like to do only afterwards. I think this is a rather natural use of dependencies (where I'd love to model dependencies also based on projects, not only todos, as mentioned earlier).

dnrce avatar Jun 25 '14 16:06 dnrce

On March 3, 2014 at 00:40:35 (+0530), carsten.otto commented:

Another thing, with 20 items I meant what I see on the screen in an individual project. However, I did not feel any change in speed depending on how many todos are shown on my screen. From my experience, the connected (invisible) todos may be more relevant.

dnrce avatar Jun 25 '14 16:06 dnrce

The dependencies file from a previous comment (which is only available in Assembla otherwise):

http://server.c-otto.de/~cotto/tracks-dependencies.txt

C-Otto avatar Jul 03 '14 19:07 C-Otto

depgraph

I rendered the dependencies using graphviz. If I understand your comments correctly, when changing node 2502 a change of 2503 is triggered. This triggers 2504. This triggers 2423. This triggers 2425 and 2429 (and others). From 2429 we trigger 2388 and then 2391. From 2425 we directly trigger 2391. So for this small part of the graph, we trigger 2391 twice? If so, does that mean we run through the successors of 2391 at least twice?

If this is correct, the complexity of the algorithm is incredibly bad (for my use case). I'd like to see a variation that only works on each node once (starting at the leaves, then going up, ...?).

C-Otto avatar Jul 03 '14 22:07 C-Otto

The arrows in that graph are wrong in the sense that the target depends on the source. Anyway, I created a test database which only contains the todos with their dependency relation, all in one project and context. Using this database I played around a bit by editing the description of todo 2502. When not showing the successors (removed _todo.html.erb line 43), this takes 48 seconds. When limiting the recursion depth (_successor.html.erb) to 0, this takes 72 seconds. The unmodified code takes 270 seconds.

So far, I have no clear understanding why updating all this information is even necessary. Can't this be done on demand, instead of upfront? Is it possible to share computed elements, such as a subtree of successors?

C-Otto avatar Jul 05 '14 14:07 C-Otto

@lrbalt: Could you explain this sentence? From "On March 3, 2014 at 20:00:51 (+0530), lrbalt commented"

"Reading a fragment is retrieving the cached partial without rendering it, so for 2673 todos that is expected behavior"

C-Otto avatar Jul 05 '14 14:07 C-Otto

Sorry for the late reply

I'm impressed by the dependency tree. OMG :-) I can understand why it renders so slowly. Please keep in mind that Tracks is not a project planning tool, but a next-action manager. If was not designed with trees like yours :-)

Problem is that, like you said, an edit on 2502 will touch the whole tree. And for all todos that are present in the current view (i.e. in a context/project container or in the deferred/blocked container) it will cause a rendering of the partial. This cannot really be avoided in the current way Tracks is showing todos, because all todos with successors has a dependency tree that may need updating and in most views, all successors are in the deferred container and may have a dependency tree (subtree) itself. So they need to be rendered too. Furthermore, there is a tooltip on the blocked-tag which shows the description of the todo it is blocked on. It may need updating too.

Current implementation is all or nothing (aka naive), so if you update the description of 2502, the complete dependency tree doesn't actually need rerendering, only 2502 itself and all todos showing this description in a tooltip. There I see room for optimization in your use case where you selectively touch only those todos in the dependency tree that actually need rerendering.

In light of Tracks 3.0 we may need to rething how we show dependencies for large dependency trees like yours

wrt my confusing sentence: in the log you see renderings of partials which are saved in the cache and you see reading of cached renderings. if the view contains 2673 todos, the log will always show 2673 entries, either reading from the cache or rerendering the partial.

I've added caching of containers too, so a context in the view of which no todos is touches will be read as a whole from the cache. But in your case, one simple edit will touch the whole tree. When a todo to modified or touched, the corresponding context is touched too, causing the context-container in the view to be rerendered, etc.

Hope this clarifies things a bit

lrbalt avatar Jul 10 '14 08:07 lrbalt

You can experiment with improving the naive all-or-nothing approach by modifying the update method in todos_controller to only touch the successors is the change needs rerendering.

Another optimization could be that editing a todo only touches the first level of successors without recursively touching their children.

Be carefull with special cases, like putting 2502 in the tickler...

lrbalt avatar Jul 10 '14 08:07 lrbalt

@lrbalt Great explanation!

dnrce avatar Jul 10 '14 13:07 dnrce

After a few years of using tracks (and upgrading my server, and finishing the aforementioned PhD thesis), I'd say fixing this is not really that important. Feel free to close this issue!

C-Otto avatar Jun 10 '19 13:06 C-Otto

There seems to be a lot of information here, so I changed the title to be more descriptive and changed it into an optimization task – it's a valid request after all.

ZeiP avatar Jun 10 '19 14:06 ZeiP