analyzer icon indicating copy to clipboard operation
analyzer copied to clipboard

Dubious code duplication in `destabilize_vs`

Open Red-Panda64 opened this issue 1 year ago • 9 comments

The destabilize_vs function is an almost verbatim copy of destabilize_normal, except that it is supposed to determine whether any called or start variables were destabilized. This is needed for the side_widen option "cycle".

(* Same as destabilize, but returns true if it destabilized a called var, or a var in vs which was stable. *)
let rec destabilize_vs x = (* TODO remove? Only used for side_widen cycle. *)
  if tracing then trace "sol2" "destabilize_vs %a" S.Var.pretty_trace x;
  let w = HM.find_default infl x VS.empty in
  HM.replace infl x VS.empty;
  VS.fold (fun y b ->
      let was_stable = HM.mem stable y in
      HM.remove stable y;
      HM.remove superstable y;
      HM.mem called y || destabilize_vs y || b || was_stable && List.mem_cmp S.Var.compare y vs
    ) w false

For comparison, here is the normally used destabilize_normal function:

let rec destabilize_normal x =
  if tracing then trace "sol2" "destabilize %a" S.Var.pretty_trace x;
  let w = HM.find_default infl x VS.empty in
  HM.replace infl x VS.empty;
  VS.iter (fun y ->
      if tracing then trace "sol2" "stable remove %a" S.Var.pretty_trace y;
      HM.remove stable y;
      HM.remove superstable y;
      Hooks.stable_remove y;
      if not (HM.mem called y) then destabilize_normal y
    ) w

It seems that destabilize_vs has a few peculiarities:

  • destabilize_vs does not call the Hook (which is probably unintentional)
  • destabilize_vs uses the short-circuiting of the || operator to influence control flow and control recursion
  • depending on config options, destabilize might refer either to destabilize_normal or destabilize_with_side. However destabilize_vs, used by side-effects when the cycle strategy is active, will never behave like destabilize_with_side.

In a discussion with @michael-schwarz, we came to the conclusion that it is a bit dangerous to have this kind of code duplication, as it is likely that future updates in the destabilize variants will not be reflected in destabilize_vs. It might be best to drop destabilize_vs in favor of a more general destabilize function, which takes an optional callback to notify the caller about any destabilized variables. The caller could then perform the check done in destabilize_vs where it is needed.

Red-Panda64 avatar Apr 25 '24 09:04 Red-Panda64

I think the reason we kept it so far is because there's a distinct check there

HM.mem called y || destabilize_vs y || b || was_stable && List.mem_cmp S.Var.compare y vs

The key question would be why that check is different and why that is.

sim642 avatar Apr 25 '24 09:04 sim642

As I understand, the check is different because the side_widen: "cycle" strategy wants to know, whether the function destabilized any stable start variables or called variables.

HM.mem called y || destabilize_vs y

is effectively equivalent to

if not (HM.mem called y) then destabilize_vs y else true

So it behaves the same, except that it also returns this boolean. As mentioned in #1434, this boolean whether a start var etc. was destabilized could be computed in an optional callback action (including the remainder of the check || b || was_stable && List.mem_cmp S.Var.compare y vs)

Red-Panda64 avatar Apr 25 '24 10:04 Red-Panda64

I had another look, and not calling the Hook seems like it was clearly an oversight.

Not using destab_with_sides on the other hand seems odd at a first glance. However, it seems like destabilize_ref is only setting to destab_with_sides during restarting and destablize_vs is only used from side which only happens during the actual solve

destabilize_ref := destabilize_normal; (* always use normal destabilize during actual solve *)

Maybe it suffices to add the call to the Hook, add these observations as a remark and remove the usage of || got control flow as a more conservative solution than #1434.

What do you think @Red-Panda64 @sim642?

michael-schwarz avatar Apr 26 '24 15:04 michael-schwarz

Adding the hook call isn't controversial I think. I don't understand what's the issue with || though, they both recurse if not in called.

sim642 avatar Apr 26 '24 15:04 sim642

It's a clean code thing, I find the more explicit

if not (HM.mem called y) then destabilize_vs y else true

to be a lot more readable than the formulation with ||. I'd be surprised if there's any performance differences between both styles.

michael-schwarz avatar Apr 26 '24 15:04 michael-schwarz

But it's also not a hill I'm willing to die on 😉

michael-schwarz avatar Apr 26 '24 15:04 michael-schwarz

I stand by the opinion that the duplicate code itself is problematic. Namely, that adjustments of either destabilize or destabilize_vs could accidentally not be applied to the other. This does happen, the missing Hook call proves this.

Of course, if you think think the cost of refactoring this outweighs the benefits, we can also just add the Hook and call it a day. As for the short-circuiting ||, I don't see a reason not to make this more explicit, if we are already making small adjustments to destabilize_vs. I think that short-circuiting with side-effects leads to a bit more confusing code whereas removing it should be uncontroversial.

Red-Panda64 avatar Apr 29 '24 07:04 Red-Panda64

It's a clean code thing, I find the more explicit

if not (HM.mem called y) then destabilize_vs y else true

to be a lot more readable than the formulation with ||. I'd be surprised if there's any performance differences between both styles.

Ah, that seems reasonable.

I thought there might've been some subtle issue with the recursion not happening in some cases when it should've. For example, reordering the disjuncts could easily cause this (e.g. putting b before the recursive call). They are in the correct order right now, but an explicit if (with a comment indicating the intent) would avoid such issue being introduced in the future.

sim642 avatar Apr 29 '24 07:04 sim642

As to the code duplication, destabilize_with_side is an even worse offender with very similar loops (only over different sets): https://github.com/goblint/analyzer/blob/06bc1e1e47f760df217ed45387743e82b9488d69/src/solver/td3.ml#L568-L605 Getting rid of the duplication is a bit of a separate issue and should probably cover all the instances then.

sim642 avatar Apr 29 '24 07:04 sim642