Transient prompts for passive plugins
I've been thinking about ways to improve on the basic ChatGPT plugin model. It seems to me that it has two major limitations which could be overcome.
- Full documentation for the plugin must be specified in the prompt. This limits how much documentation can be passed in, costs compute, etc.
- The plugin can only supply information sparsely/infrequently and upon request, meaning the model must know exactly when the plugin can help and how to use it.
To address these limitations, I propose that the model be supplemented with a "transient prompt". During output generation, plugins may watch the user prompt and the generated text, and choose to temporarily insert text into the transient prompt at any point in time.
Although I've had this idea independently, I expect that I'm not the first.
A few example uses:
- If the model has just generated the text `The prime minister of the United Kingdom`, a knowledge graph plugin may detect the topic and insert the transient prompt `The current prime minister of the UK is Rishi Sunak, the leader of the Conservative Party.` This prompt would be available as the model continues generating text, until the plugin no longer considers it relevant.
- If the user requests that the model generate Mathematica code, the Wolfram plugin may look up functions which may be relevant to the topic, and transiently insert their documentation. More generally, this allows plugins to insert context-specific documentation on how to use them.
- A Lean Prover plugin may provide an up-to-date Lean Context as each line of the proof is written, helping with proof generation.
- During code generation, IDE-style autocompletion information and function documentation may be provided. For example, if the code is using some type `Foo`, the documentation for `Foo` may transiently appear in the context.
- Simple mathematical expressions such as `128 * 9 + 3` may be automatically solved and their solutions inserted into the transient prompt whenever they're detected in user or output text. The model doesn't need to call WolframAlpha if it gets automatically told the answer.
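The last example is simple enough to sketch end-to-end. Here is a toy arithmetic watcher: the regex and the restriction to integer `+`, `-`, `*` expressions are assumptions for illustration, not a proposed design.

```python
import re

# Detect simple integer expressions and compute their values, producing
# lines a plugin could place into the transient prompt. The regex only
# admits digits, whitespace, and the operators + - *, which is what makes
# the eval below safe for this toy sketch.
_EXPR = re.compile(r"\b\d+(?:\s*[-+*]\s*\d+)+\b")

def arithmetic_transients(text: str) -> list[str]:
    notes = []
    for match in _EXPR.finditer(text):
        expr = match.group(0)
        notes.append(f"{expr} = {eval(expr)}")
    return notes
```

Run on the example above, `arithmetic_transients("what is 128 * 9 + 3?")` yields `["128 * 9 + 3 = 1155"]`, which the plugin would hold in the transient prompt while the expression remains in view.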
I believe that this could be a very powerful architecture for augmenting the models, in ways which go beyond occasional and intentional tool use. Of course, this would be a somewhat compute intensive augmentation, so would have to be used sparingly.
I expect that best results would be achieved by fine tuning the models on how to use this additional context, rather than prompt engineering it into the existing models. It might be particularly powerful to fine tune the model alongside popular plugins, so that it can better learn to utilise them. This could be done by including transient prompts in the dataset.
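One way to picture such a dataset entry: a transient field sitting alongside the usual prompt/completion pair. The field names and content here are purely illustrative assumptions, not a real dataset schema.

```python
# Illustrative (assumed) shape of a fine-tuning example that includes a
# transient prompt alongside the usual prompt/completion pair, so the
# model learns to ground its answer in the transient context.
example = {
    "prompt": "Who is the prime minister of the UK?",
    "transient": "The current prime minister of the UK is Rishi Sunak, "
                 "the leader of the Conservative Party.",
    "completion": "The current prime minister is Rishi Sunak.",
}
```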
Particularly for the code generation task, it might be interesting to allow plugins to use the transient prompt to immediately tell the model when it makes an error, such as using a nonexistent function in a library. However, actually using this information may be slightly harder for the model. The flow would have to be something like:
- Model outputs code using nonexistent function.
- Plugin provides error, perhaps suggests correction and relevant documentation.
- Plugin outputs some "we're in a bad state" signal.
- Output sampling backtracks, and retries generation, now with the additional information in the transient prompt. This is hard - how far do we backtrack?
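The flow above can be sketched as a backtrack-and-retry loop. Everything here is hypothetical: `model.generate_from`, `plugin.check`, and the choice to backtrack to the start of the flagged span (one simple answer to "how far?") are all assumptions for illustration.

```python
# Hypothetical backtrack-and-retry loop. `plugin.check` is assumed to
# return None on success, or (position, message) locating the error and
# describing it; `model.generate_from` continues a prefix given a
# transient prompt.

def generate_with_repair(model, prompt, plugin, max_retries: int = 3) -> str:
    generated = ""
    transient = ""
    for _ in range(max_retries + 1):
        generated = model.generate_from(prompt, generated, transient)
        error = plugin.check(generated)
        if error is None:
            return generated
        position, message = error
        # Backtrack to just before the flagged span, then retry with the
        # error message available in the transient prompt.
        generated = generated[:position]
        transient = message
    return generated  # best effort after max_retries
```

Backtracking to the start of the flagged span is only the simplest policy; backtracking further (say, to the start of the enclosing statement) might give the model more room to take a different path.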
This error mechanism could also be used for automatic fine tuning.
I believe that the basic idea of additional transient prompts is a powerful one. I expect I'm not the first to propose it. It would be very cool if the OA project were interested in exploring it as ideas around plugins/integrations solidify.