cmd-call-graph Follow calls to external batch scripts

We could extend the tool to follow calls to external batch scripts, and produce a larger call graph including a set of scripts.

There is already some primitive logic to identify external calls in CallGraph._AnnotateNode, as it generates Command instances of type external_call, but those commands are not processed in the later loop that goes through all Command instances. Therefore the logic to process external calls can be added to that loop.

What we can do there is simply call CallGraph.Build again to recursively generate another instance of CallGraph, that we would need to add to a container in the original CallGraph instance itself, keeping track of this new type of connection.

There are a few open questions that come to mind:

we need to handle recursion and not fall into the trap of infinite loops (for example, specifying a depth parameter to limit how deep we go in the chain of calls.
we need to de-duplicate call graphs for external batch files, so that if we call a given file multiple times we point to the same sub-graph.
we need to handle gracefully calls to non-batch files or files we don't want to expand (for example if we reached maximum depth). For example, by adding a single node for the file if we don't want or can't expand the given file
we need to decide how to visually represent the different files. For example, we could have rectangular enclosures around each batch file involved in the graph.
we may want to limit the set of files to expand (for example, only the ones belonging to a specific codebase). This might be achieved by specifying a set of files to expand as a command-line parameter.

Sep 19 '20 06:09 lupino3

Some additional thoughts..

regarding points above 1 - We could add an optional parameter to follow the external calls along with the depth argument (default behavior would be as it is today and only parse the starting file). 2 - We could look to check if the external call script exists in the sub-graph first, so that we don't try to parse through it again unnecessarily 3 - For calls to other programs ('non batch' scripts), currently it's represented as 'x # of external calls.. we could represent a node for each with a reference to the name of the item being called 4 - Visually representing as a rectangle would be great 5 - Limiting the set of files to expand is a nice optional argument also

also attached a potential mockup mockup - external call example

Sep 19 '20 19:09 cmg-src

Thanks @cmg-src for volunteering to take this on!

I like the mock-up! I had another idea as well: to have a rectangle enclose all the nodes of the external script, but I'm not sure how easy it is to do in graphviz.

Also, since it's just aesthetics at that point, we can definitely go with whatever works for you first and then maybe iterate.

Thanks again!

Sep 21 '20 07:09 lupino3

Did some research on the idea of to have a rectangle enclose all the nodes of the external script, but I'm not sure how easy it is to do in graphviz.

It looks like this could be handled with subgraph 'clusters' .. [https://graphviz.org/Gallery/directed/cluster.html] I did some prototyping by manually modifying an existing .dot I had from my testing and while it's do-able, graphically it gets busy, and also in terms of rendering logic, the output appears to need to be in a specific order to group the items of the subgraph within that element.

Sep 29 '20 21:09 cmg-src

The cluster is exactly what I had in mind, but if you think it becomes too busy let's stick to the design you have with the folder shape. Thanks!

Sep 30 '20 06:09 lupino3