patchwork icon indicating copy to clipboard operation
patchwork copied to clipboard

Re-asking a locked-pointer question causes endless loop

Open rmoehn opened this issue 7 years ago • 13 comments

Don't try to understand the title. Look at this scenario instead:

What is your root question?
> Who wrote Gödel, Escher, Bach?
Question: [$1: Who wrote Gödel, Escher, Bach?]
Scratchpad: [$2: ]
Subquestions:

> ask $1
Question: [$1: Who wrote Gödel, Escher, Bach?]
Scratchpad: [$2: ]
Subquestions:
1.
  [$q1: [$1: Who wrote Gödel, Escher, Bach?]]
  $a1
  $w1

> unlock $a1
Question: [$1: $3]
Scratchpad: [$2: ]
Subquestions:

> ask $1
[Goes into endless loop.]

The same happens with sub-questions of sub-questions.

The endless loop occurs in Context.is_own_ancestor:

    def is_own_ancestor(self, db: Datastore) -> bool:
        initial_workspace = db.canonicalize(self.workspace_link)
        context: Optional[Context] = self.parent
        while context is not None:
            if context == self and db.canonicalize(context.workspace_link) == initial_workspace:
                return True
            context = context.parent
        return False

Apparently the context has become its own ancestor, so context.parent never is None.

Of course it doesn't make sense to re-ask a context's question, so patchwork should detect this kind of scenario and raise an error. More concretely: Raise an error if the argument to AskSubquestion is a pointer to the current context's question.

This is related to issue #12.

rmoehn avatar Jul 04 '18 00:07 rmoehn

A variation:

What is your root question?
> [0]
Question: [$1: $3]
Scratchpad: [$2: ]
Subquestions:

> ask [What about $1?]
[Goes into endless loop.]

So we have to forbid sub-questions that consist of a pointer that points to hypertext containing a pointer to the context's question? There might be scenarios when this is needed, but I don't have time to think up an example now.

rmoehn avatar Jul 05 '18 00:07 rmoehn

Note that if $3 in the example above is unlocked, it won't go into an endless loop.

What is your root question?
> [0]
Question: [$1: $3]
Scratchpad: [$2: ]
Subquestions:

> unlock $3
Question: [$1: [$3: 0]]
Scratchpad: [$2: ]
Subquestions:

> ask [What about $1?]
Question: [$1: [$3: 0]]
Scratchpad: [$2: ]
Subquestions:
1.
  [$q1: $4]
  $a1
  $w1

> 

In general, it doesn't make sense to ask a question like [What is the secret of the pyramids?], ie. a question with brackets around it. It will appear locked to the first H working on the question's context, so they have to unlock it before they can do anything. Asking What is the secret of the pyramids? without the brackets has the same effect and doesn't trigger failures.

Therefore, we should forbid sub-questions of the form […].

rmoehn avatar Jul 05 '18 23:07 rmoehn

It also leads to failure in this case:

What is your root question?
> [0]
Question: [$1: $3]
Scratchpad: [$2: ]
Subquestions:

> ask [1]
Encountered an error with your command: 
Question: [$1: $3]
Scratchpad: [$2: ]
Subquestions:

> Traceback (most recent call last):
  File "/home/erle/repos/patchwork/patchwork/interface.py", line 40, in _do
    result = self.session.act(action)
  File "/home/erle/repos/patchwork/patchwork/scheduling.py", line 255, in act
    resulting_context = self.sched.resolve_action(self.current_context, action)
  File "/home/erle/repos/patchwork/patchwork/scheduling.py", line 166, in resolve_action
    raise ValueError("Action resulted in an infinite loop")
ValueError: Action resulted in an infinite loop

I think this arises because the locked root question and the locked sub-question look the same to the memoizer.

rmoehn avatar Jul 05 '18 23:07 rmoehn

The infinite loop in the initial issue is a bug, but the later ValueError: Action resulted in an infinite loop is expected behavior that will be dealt with by decreasing budgets in the future. (If automation looks at budgets, the contexts won't be identical and so it won't be a cache hit. If automation doesn't look at budgets, it will apply the same action until it runs out of budget, which isn't the right thing to do, but not something the system should disallow.)

stuhlmueller avatar Jul 13 '18 00:07 stuhlmueller

So you're against forbidding things like sub-questions of the form […]?

In a case that detectably doesn't make sense, I would not let the system run until it's out of budget. It would introduce mysterious load spikes, make generative tests slow or less useful, and generally screw up client code that probes the boundaries of what Patchwork accepts.

rmoehn avatar Jul 13 '18 02:07 rmoehn

It's not so much that I'm against it, but that I want general solutions instead of patching specific cases. Both the current cycle detection scheme and budgets are general, and you're right that in the long run, we want something even better, so that we don't have to rely on letting the system run until it's out of budget. If you have ideas for other general solutions, I'm definitely interested.

stuhlmueller avatar Jul 13 '18 02:07 stuhlmueller

I see. And I just realized that patching specific cases won't cut it anyway, because they keep turning up, seemingly every time I think that I can finish the PR for the Hypothesis tests:

What is your root question?
> 0[1]
Question: [$1: 0$3]
Scratchpad: [$2: ]
Subquestions:


> ask 0$1
[Goes into endless loop.]

I'm not sure if this counts as a general solution, but for debugging I modified Context.is_own_ancestor thus:

    def is_own_ancestor(self, db: Datastore) -> bool:
        initial_workspace = db.canonicalize(self.workspace_link)
        context: Optional[Context] = self.parent
        seen = set()
        while context is not None:
            if context in seen:
                raise RuntimeError("Endless loop!")
            seen.add(context)
            if context == self and db.canonicalize(context.workspace_link) == initial_workspace:
                return True
            context = context.parent
        return False

So far all endless loops have occurred in that place, so this modification catches all of them. However, I feel that this is not as close to the root of the problem as I can get. I have to clarify my understanding of the memoizer and the datastore.

rmoehn avatar Jul 13 '18 05:07 rmoehn

In case you're curious about the Hypothesis test: https://github.com/rmoehn/patchwork/blob/test-hypothesis-2/tests/test_randomly.py

rmoehn avatar Jul 13 '18 05:07 rmoehn

Context.is_own_ancestor doesn't prevent some endless loops from happening, because there is a discrepancy between it and the memoizer. The memoizer determines only from the string representation of a context c whether or not it has seen c before. If it has seen c before, it returns the action that the user originally took in c. So it's a mapping str(context) ↦ action. is_own_ancestor in contrast, considers a context equal to own ancestor only when their string representations and their canonicalized workspace links are equal.

Take the last failure case. We have this context, call it c1:

Question: [$1: 0$3]
Scratchpad: [$2: ]
Subquestions:

And the user takes action ask 0$1. So the memoizer stores a mapping str(c1) ↦ ask 0$1. The result of the action is this context:

Question: [$1: 0$3]
Scratchpad: [$2: ]
Subquestions:

Looks the same as the previous one, right? That's what the memoizer thinks, too, so it returns ask 0$1 as the next action to take. Thus we enter the infinite loop.

Why doesn't is_own_ancestor detect this? Because even though the string representations are the same, the workspaces are not the same. I don't know yet how best to resolve this. But probably is_own_ancestor and the memoizer should at least have the same criteria for comparison.

rmoehn avatar Jul 17 '18 00:07 rmoehn

We assume that H takes the same action B whenever she sees a context A. The purpose of the memoizer is to save H's work by learning H's context-to-action mapping. Since the only information H gets about a context is its string representation, the memoizer must work with only the string representation as well.

What I don't understand is why is_own_ancestor takes into a account the workspace link in addition to the string representation. I don't think this is correct, but there must be a reason why @benwr implemented it like that.

rmoehn avatar Jul 18 '18 00:07 rmoehn

There are valid cases where a context looks the same (when stringified) as one of its ancestors, but differs based on the values behind pointers (which depend on the workspace link). For example, consider the following context:

Q: map function #1 over list #2

In the process of applying the function to one of the values in #2, we might encounter another situation that looks exactly the same, but in fact #1 and #2 now refer to different things, so it's fine, and we would like to apply the automated response we learned the first time around.

stuhlmueller avatar Jul 18 '18 00:07 stuhlmueller

Ah, makes sense. Sometimes I need to be shown the forest again. Here is Ben's answer, which is similar:

The main reason is that sometimes subproblems that are actually "smaller" can nonetheless look the same to H. Take the gif in the readme for example: to determine if a list is sorted, you can ask if a sublist is sorted as a subquestion. This new context will naively look like it's an ancestor of itself, though it shouldn't result in an infinite loop.

rmoehn avatar Jul 18 '18 23:07 rmoehn

I guess there is no general way of detecting cycles, then. (On Patchwork With an Application to the Entscheidungsproblem?) For one, the user might ask for infinite data:

R: Give me an infinite list of ones.
S: Prepend a 1 to list [[1] []] and ask the same sub-question as this with the new list.

Of course, this is not a sensible way to use the system, but there might be less obvious infinite-data-generating loops than this one.

As one can see from Andreas' and Ben's answers, the system has to permit a cycle of stringified context → action → … → same stringified context → same action (scasca), because this cycle might be using up data that the contexts are pointing to. When all data is used up, the cycle terminates.

From this we can derive a necessary condition for a finite scasca cycle: It contains an Unlock. If it doesn't contain an unlock, it doesn't use data to decide whether or not to terminate and therefore never decides to terminate.

Are there other necessary conditions? We could use them to improve the cycle detector.

I'm not sure if these statements are true. I'm tempted to prove them instead of waving my hands, but probably there are more important things to do.

rmoehn avatar Jul 21 '18 04:07 rmoehn