Use Case: Prompt & Parameter Exploration

Open daniel-richter opened this issue 2 years ago • 1 comments

I want to share some insights on a special use case: prompt and parameter exploration. By that I mean a pure txt2image workflow with 100% variation strength and no inpainting etc., used to explore how different prompts and parameters affect the outcome.

Krita AI Diffusion is fantastic both for trying things out and for seeing exactly what a workflow looks like. The second part is especially great because it makes it quite easy to build custom tools on top of it. I used mitmproxy to inspect the requests sent to ComfyUI (it's also great that this plugin comes with a ready-to-use Docker image and a corresponding RunPod template) and used them as a template in a small tool specialized for prompt exploration. It's just a small, image-gallery-like TypeScript/Node.js application that displays generation metadata and has an extended prompt field.

There are some things that currently (or formerly) make this use case difficult in Krita, and easier in the separate tool:

Pattern Expansion: ComfyUI supports wildcards/dynamic prompts with the syntax {wild|card|test}, but I don't want the options to be replaced at random; I want a separate prompt for each value. You could queue three different prompts, one for each replacement, but the strength of pattern expansion shows when you use multiple patterns. For example, A {blue|red} {car|truck} in the {city|country}. produces eight different prompts. Pattern expansion is also very useful if you try out LoRAs and want to see the effects of different weights, e.g. <lora:myawesomelora:{0|0.2|0.4|0.6|0.8|1}> in combination with a fixed seed.

Seeds: Speaking of seeds - a fixed (or random) seed is an integral part of prompt & parameter exploration, yet the seed value in Krita is quite hidden in a dropdown. I caught myself playing around in Krita with a fixed seed by mistake.

Prompt/Parameter Reuse: In this use case, you usually generate a rather large number of images with different parameters and then (re)use the best results. To do that efficiently, you need to see which parameters were used, and the Krita plugin currently does not show enough of them. There is only seed and prompt. You can't see, e.g., negative prompt, skip layers, or cfg scale. If you are using the Face tool, you can't see that this tool was used, nor the reference image or the values of its two parameters.

Delete Results: If you have a lot of generation results, you may want to delete some of them.

Queue Front: While you have a long queue (e.g., due to pattern expansion), you may want to enqueue some prompts at the front of the queue. ComfyUI provides the parameter front:true for /prompt to do that.
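To make the queue-front idea concrete, here is a minimal sketch of how the request body for /prompt could be built. It assumes the workflow graph has already been captured in ComfyUI's API format (e.g. from the proxied requests); buildQueueBody and the Workflow type are hypothetical names for illustration, not part of the plugin or of ComfyUI.

```typescript
// Assumption: a workflow is the node graph in ComfyUI's API format,
// keyed by node id. Its exact contents do not matter for this sketch.
type Workflow = Record<string, unknown>

interface QueueBody {
  prompt: Workflow
  front?: boolean
}

// Hypothetical helper: serializes the JSON body for a POST to /prompt.
// "front" is only included when the job should jump the queue.
function buildQueueBody(workflow: Workflow, atFront: boolean): string {
  const body: QueueBody = { prompt: workflow }
  if (atFront) {
    body.front = true
  }
  return JSON.stringify(body)
}
```

The resulting string would be sent as the body of a POST request to /prompt with a JSON content type. Leaving front out entirely for normal jobs keeps those requests identical to the ones the plugin sends.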

My code for prompt expansion:

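function expandPrompt(prompt: string): string[] {
  // Find the first unescaped {a|b|c} group that contains no nested braces.
  const match = prompt.match(/(?<!\\)\{([^{}]+)(?<!\\)\}/)
  if (!match) {
    return [prompt]
  }
  const result: string[] = []
  for (const option of match[1].split("|")) {
    // A replacer function keeps "$" sequences in the option literal.
    const subPrompt = prompt.replace(match[0], () => option)
    // Recurse to expand the remaining groups.
    result.push(...expandPrompt(subPrompt))
  }
  return result
}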

expandPrompt('A {blue|red} {car|truck} in the {city|country}.') creates the following eight (2×2×2) prompts: A blue car in the city, A blue car in the country, A blue truck in the city, A blue truck in the country, A red car in the city, A red car in the country, A red truck in the city, A red truck in the country.
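The 2×2×2 arithmetic generalizes: the number of prompts is the product of the option counts of all groups, which is worth checking before filling the queue. A minimal sketch, assuming groups are not nested; countExpansions is a hypothetical helper and not part of the tool described above.

```typescript
// Hypothetical helper: counts how many prompts a full expansion would
// produce, without building them. Assumes non-nested {a|b|c} groups.
function countExpansions(prompt: string): number {
  // Same group syntax as the expansion code, but global to visit every group.
  const pattern = /(?<!\\)\{([^{}]+)(?<!\\)\}/g
  let count = 1
  let match: RegExpExecArray | null
  while ((match = pattern.exec(prompt)) !== null) {
    // Each group multiplies the total by its number of options.
    count *= match[1].split("|").length
  }
  return count
}

countExpansions("A {blue|red} {car|truck} in the {city|country}.") // → 8
countExpansions("<lora:myawesomelora:{0|0.2|0.4|0.6|0.8|1}>") // → 6
```

Because the count grows multiplicatively, adding the six-value LoRA weight group to the car/truck prompt would already yield 48 queued generations.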

Feb 02 '24 14:02 daniel-richter

I understand the use case, but it's also something I have deliberately excluded so far. Editing the image through a mix of inpainting and traditional image tools is the focus, because it has the most synergy with Krita, and also because there is a lack of apps with comparable power/flexibility.

Meanwhile, there are already many tools for prompt exploration which expose and track all the parameters. That requires lots of UI, which is easier to develop in a JS frontend where it doesn't steal real estate from the canvas and other dockers. In Krita, SD is just one of many tools, and maybe not even an important one.

That is, if I try to come up with the ideal UI for txt2img prompt exploration, it would look very different.

That being said, there is always overlap, and better prompting tools have made it into the project and are useful, as long as they are relatively unobtrusive.

Prompt/Parameter Reuse: Negative prompt is restored together with prompt. I considered saving/restoring control layers, but they may no longer exist or may have changed content. It might make sense regardless though, especially for face/reference. The Style and its settings are something I consider relatively fixed while working on an image.

Seeds: I agree some indicator when it is set to fixed would be nice.

Feb 05 '24 09:02 Acly