Empty questions cause misbehaviour/Pointer glitches
Exhibit 1:
What is your root question?
> <Enter>
Question: [$1: ]
Scratchpad: [$1: ]
Subquestions:
>
Exhibit 2:
What is your root question?
> What is the capital of Assyria?
Question: [$1: What is the capital of Assyria?]
Scratchpad: [$2: ]
Subquestions:
> ask<Enter>
Question: [$1: What is the capital of Assyria?]
Scratchpad: [$q1: ]
Subquestions:
1.
[$q1: ]
$a1
$w1
> unlock $a1
Question: [$1: ]
Scratchpad: [$1: ]
Subquestions:
>
- Question and scratchpad shouldn't have the same pointer.
- Empty questions don't make sense, so I would just forbid them.
I could probably fix this easily, but I don't want to switch context now. Besides, this is a good bug to fix for someone who wants to get to know the codebase.
Another glitch with the scratchpad:
What is your root question?
> Is [] the empty list?
Question: [$1: Is [$2: ] the empty list?]
Scratchpad: [$2: ]
Subquestions:
>
I guess some implementation detail causes null pointers ([]) to always appear unlocked.
Here is yet another glitch:
What is your root question?
> What is the air-speed velocity of an unladen swallow?
Question: [$1: What is the air-speed velocity of an unladen swallow?]
Scratchpad: [$2: ]
Subquestions:
> ask What is the air-speed velocity of an unladen African swallow?
Question: [$1: What is the air-speed velocity of an unladen swallow?]
Scratchpad: [$2: ]
Subquestions:
1.
[$q1: What is the air-speed velocity of an unladen African swallow?]
$a1
$w1
> unlock $a1
Question: [$1: What is the air-speed velocity of an unladen African swallow?]
Scratchpad: [$2: ]
Subquestions:
> reply []
Question: [$1: What is the air-speed velocity of an unladen swallow?]
Scratchpad: [$2: ]
Subquestions:
1.
[$q1: What is the air-speed velocity of an unladen African swallow?]
[$a1: [$2: ]]
$w1
> scratch []
Question: [$1: What is the air-speed velocity of an unladen swallow?]
Scratchpad: [$a1: [$2: ]] <===
Subquestions:
1.
[$q1: What is the air-speed velocity of an unladen African swallow?]
[$a1: [$2: ]]
$w1
>
This is caused by the data deduplication and might confuse a user.
But since Patchwork is not intended for end users, this issue is probably unimportant.
I don't think it's obviously unimportant. Identifying pointers (for the purpose of unlocking within a workspace) that have the same contents, but were created independently, feels like a symptom of a design choice that might be incorrect.
Okay, here is my analysis, then.
Example:
What is your root question?
> Bat?
Question: [$1: Bat?]
Scratchpad: [$2: ]
Subquestions:
> scratch Bat?
Question: [$1: Bat?]
Scratchpad: [$1: Bat?]
Subquestions:
>
The steps are roughly:
- Insert the root question into the database. It gets address A1.
- Scratch.act inserts ‘Bat?’ into the database (TransactionAccumulator.insert). The content is already in the database, so new_scratchpad_link will be A1 as well.
- Scratch.act constructs a successor workspace with A1 as both the question link and the scratchpad link.
- Scratch.act constructs a successor context pointing to that workspace.
- The context has to contain a dictionary that maps pointers to addresses and a dictionary that maps addresses to pointers. These mappings are bijective, i.e. every pointer is mapped to exactly one address and vice versa. Context.__init__ calls Context._name_pointers to construct these mappings.
In the example above, there is only one address A1, which _name_pointers maps to $1. This is how we end up with the same pointer in the scratchpad and the question, even though they were ‘created independently’.
_name_pointers creates mappings for the sub-question pointers (question, answer, workspace) first, which is how we end up with Scratchpad: [$a1: [$2: ]] in the last example above.
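To make the mechanism concrete, here is a minimal sketch of the effect. It is a toy stand-in and not Patchwork's actual code: the Datastore class, the hash-based addresses, and the name_pointers function below are simplified assumptions that only mimic content-addressed insertion and one-name-per-address pointer naming.

```python
import hashlib


class Datastore:
    """Toy content-addressed store (assumption): equal content, equal address."""

    def __init__(self):
        self.records = {}

    def insert(self, content):
        # The address is derived from the content, so inserting the same
        # content twice returns the same address.
        address = hashlib.sha256(content.encode("utf-8")).hexdigest()[:8]
        self.records.setdefault(address, content)
        return address


def name_pointers(question_link, scratchpad_link):
    """Hypothetical stand-in for the naming step: one pointer per distinct address."""
    pointer_of = {}
    for address in (question_link, scratchpad_link):
        if address not in pointer_of:
            pointer_of[address] = f"${len(pointer_of) + 1}"
    return pointer_of


store = Datastore()
question_link = store.insert("Bat?")
scratchpad_link = store.insert("Bat?")  # same content, so same address
names = name_pointers(question_link, scratchpad_link)
print(names[question_link], names[scratchpad_link])  # both print as $1
```

Because the mapping is keyed by address, two links that were created independently but hold the same content cannot receive different pointer names.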
How could we fix this? Options:
- Map multiple pointers to one address. In the reverse we would then have to map one address to multiple pointers.
- In Datastore.insert: Never return an address twice. Instead, when the content is already present, return an alias.
- Give up on the deduplication and content-addressability, and just create a new record and a new address for every insertion into the datastore.
I haven't thought through the consequences of these options yet.
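As a rough illustration of the second option (aliases), here is a sketch under stated assumptions. AliasingDatastore, its method names, and the R/A identifier scheme are all hypothetical and do not reflect Patchwork's real Datastore API; the sketch only shows that content can stay deduplicated while every insertion yields a distinct handle.

```python
import itertools


class AliasingDatastore:
    """Hypothetical store: content is kept once, each insert gets a fresh alias."""

    def __init__(self):
        self._record_of_content = {}  # content -> record id (deduplication)
        self._content_of_record = {}  # record id -> content
        self._record_of_alias = {}    # alias -> record id
        self._counter = itertools.count(1)

    def insert(self, content):
        record = self._record_of_content.get(content)
        if record is None:
            # First time we see this content: create the single stored record.
            record = f"R{next(self._counter)}"
            self._record_of_content[content] = record
            self._content_of_record[record] = content
        # Never hand out the same address twice: always mint a new alias.
        alias = f"A{next(self._counter)}"
        self._record_of_alias[alias] = record
        return alias

    def dereference(self, alias):
        return self._content_of_record[self._record_of_alias[alias]]


store = AliasingDatastore()
question_alias = store.insert("Bat?")
scratchpad_alias = store.insert("Bat?")
```

With this scheme the question and the scratchpad hold different aliases, so a bijective pointer naming would give them different pointers, while the content itself is still stored only once. Whether equality checks elsewhere should compare aliases or underlying records is left open here.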