ComfyUI icon indicating copy to clipboard operation
ComfyUI copied to clipboard

implemented generic type validation for all nodes.

Open shawnington opened this issue 1 year ago • 8 comments

This change started trying to transparently implement BHW3 compliance for IMAGE type inputs and outputs for all nodes, and has since started to expand into a general way to check type conformance for all types, with the goal of enabling a much more robust typing system. No changes need to be made to existing nodes, the conformance checking is done before data is passed into a node, and after it leaves the nodes.

Validation functions can be added to type checks in addition, to perform data transformations that allow custom type to parse multiple datatypes into a format that conforms with the type definition of that type.

The current implementation uses a cli-arg to set the level of desired type-conformance and warning.

--type-conformance=["all", "image", "error", "none"]

  • "error": will raise an error at the first node with a type error, with the name of the node, and the input/output variable name, expected type, and type received.
  • "image": will only convert BHW and HW mono channel images in to IMAGE conforming BHW3 images that will work with the base compositing nodes. This will not force a BHW4 image into BHW3
  • "all": will force conversion to the type expected, using either defaults such as int|float|string, or custom formatting functions supplied to the validation generator function.
  • "none": this is the default value, and will simply soft error the same information as the "error" flag to console. The name of the node, the name of the input/output, the expected type, and the received type.

This is what a raised error looks like when wrong types are passed, the same message default to console by default without interrupting workflow execution.

Screenshot 2024-08-03 at 9 39 45 AM Screenshot 2024-08-03 at 9 41 13 AM

The "image" option currently serves as en example of how custom validation can be done for any type by injecting a custom validation and error handling function.

These change were discussed on the #backend-development discord with @comfyanonymous @mcmonkey4eva @guill and several others, along with in this PR

More rigorous testing needs to be done to ensure that there are no egregious failure modes and that all formats are covered.

Changes that are being looked at include creation of a TypeValidator class to replace the dict that controls the generate_type_validator function and opens up the possibility for node packs to define their own types and validation functions that apply globally to any nodes using that type.

Replacing this:

    validation_funcs = { 
        "IMAGE": generate_type_validator((torch.Tensor,), validator=validate_image_shape),
        "INT": generate_type_validator((int,)), 
        "FLOAT": generate_type_validator((float,)), 
        "STRING": generate_type_validator((str,)), 
    }

With something like this:

class TypeValidator:
    validation_funcs = {
        "IMAGE": generate_type_validator((torch.Tensor,)),
        "INT": generate_type_validator((int,)),
        "FLOAT": generate_type_validator((float,)),
        "STRING": generate_type_validator((str,)),
    }

    @classmethod
    def add_type(cls, type_name, valid_types, validator=None):
        cls.validation_funcs[type_name] = generate_type_validator(valid_types, validator=validator)
    
    def has_type(self, type_name):
        return type_name in self.validation_funcs.keys()

    def __getitem__(self, type_name):
        return self.validation_funcs[type_name]

    def __call__(self, type_name, value):
       return self.validation_funcs[type_name](value) if self.has_type(type_name) else value

and called like:

TypeValidator.add_type("IMAGE", (torch.Tensor,), validator=validate_image_shape)

validation_funcs = TypeValidator()

value = validation_funcs("IMAGE", value)

#or the more verbose way 
value = validation_funcs["IMAGE"](value) if validation_func.has_type("IMAGE") else value

shawnington avatar Jul 31 '24 01:07 shawnington

Are we offering a image with alpha type to replace the BHW4 image usage now?

https://github.com/comfyanonymous/ComfyUI/blob/c24f897352238f040e162a81d253c290635d44fd/comfy_extras/nodes_compositing.py#L178-L201

huchenlei avatar Jul 31 '24 15:07 huchenlei

Are we offering a image with alpha type to replace the BHW4 image usage now?

https://github.com/comfyanonymous/ComfyUI/blob/c24f897352238f040e162a81d253c290635d44fd/comfy_extras/nodes_compositing.py#L178-L201

As it currently is written, it shouldn't touch images in BHW4 format, but it will only convert a BHW mask type to BHW3 per @comfyanonymous preferences.

I personally think BHW4 or another image format specifically for alpha inclusion would be a good use for this kind of validation as new function can be added for any type with the latest changes.

shawnington avatar Aug 02 '24 20:08 shawnington

BHW mask type to BHW3

What is the reasoning I don't get it, why not BHW1? Currently, it's BHW

I personally think BHW4 or another image format specifically for alpha inclusion would be a good use for this kind of validation as new function can be added for any type with the latest changes.

The issue with that is that it limits a bit integrations since you would now need an optional BHW4 and an optional BHW3 or separate nodes to handle IMAGE

melMass avatar Aug 02 '24 21:08 melMass

BHW mask type to BHW3

What is the reasoning I don't get it, why not BHW1? Currently, it's BHW

I personally think BHW4 or another image format specifically for alpha inclusion would be a good use for this kind of validation as new function can be added for any type with the latest changes.

The issue with that is that it limits a bit integrations since you would now need an optional BHW4 and an optional BHW3 or separate nodes to handle IMAGE

BHW will display properly in the preview node, but will not composite properly with out conversion to BHW3, and since the only real use for converting mask to image is to do composite operations, having it in the BHW3 format that IMAGE type is supposed to be in, make it work with all of the nodes that do compositing operations.

The issue is basically that the input and output type currently only limit what nodes can connect to each other, and have no bearing on what what type is actually being passed around, or if its in a standardized format.

The reasoning is that at least for base nodes, things should move around in a stand format that is expressed by the type in the node definition, so that custom nodes at least know what they will be getting. If output is specified as an INT, you shouldn't get a FLOAT, or a STRING or a tensor, but all are currently possible.

This started off just dealing with mask images not compositing like you would reasonably expect, and has expanded into a broad type conformance system.

shawnington avatar Aug 02 '24 22:08 shawnington

BHW will display properly in the preview node, but will not composite properly with out conversion to BHW3, and since the only real use for converting mask to image is to do composite operations

I can be slow on these kinds of things and I'm not a native English speaker, but I still don't understand the need to duplicate channels (BHW1 vs BHW3). I see now that it's under a flag so I will try this PR with my nodes to grasp it better

melMass avatar Aug 03 '24 15:08 melMass

BHW will display properly in the preview node, but will not composite properly with out conversion to BHW3, and since the only real use for converting mask to image is to do composite operations

I can be slow on these kinds of things and I'm not a native English speaker, but I still don't understand the need to duplicate channels (BHW1 vs BHW3). I see now that it's under a flag so I will try this PR with my nodes to grasp it better

So this all started because in comfy_extras/nodes_mask.py the composite function requires both images to be 3 channel

def composite(destination, source, x, y, mask = None, multiplier = 8, resize_source = False):
    print(f"destination: {destination.shape} source: {source.shape}")
    destination = tensor_to_rgb(destination)
    source = tensor_to_rgb(source)
    
    source = source.to(destination.device)
    if resize_source:
        source = torch.nn.functional.interpolate(source, size=(destination.shape[2], destination.shape[3]), mode="bilinear")

    source = comfy.utils.repeat_to_batch_size(source, destination.shape[0])

    x = max(-source.shape[3] * multiplier, min(x, destination.shape[3] * multiplier))
    y = max(-source.shape[2] * multiplier, min(y, destination.shape[2] * multiplier))

    left, top = (x // multiplier, y // multiplier)
    right, bottom = (left + source.shape[3], top + source.shape[2],)

    if mask is None:
        mask = torch.ones_like(source)
    else:
        mask = mask.to(destination.device, copy=True)
        mask = torch.nn.functional.interpolate(mask.reshape((-1, 1, mask.shape[-2], mask.shape[-1])), size=(source.shape[2], source.shape[3]), mode="bilinear")
        mask = comfy.utils.repeat_to_batch_size(mask, source.shape[0])

    # calculate the bounds of the source that will be overlapping the destination
    # this prevents the source trying to overwrite latent pixels that are out of bounds
    # of the destination
    visible_width, visible_height = (destination.shape[3] - left + min(0, x), destination.shape[2] - top + min(0, y),)

    mask = mask[:, :, :visible_height, :visible_width]
    inverse_mask = torch.ones_like(mask) - mask

    source_portion = mask * source[:, :, :visible_height, :visible_width]
    destination_portion = inverse_mask  * destination[:, :, top:bottom, left:right]

    destination[:, :, top:bottom, left:right] = source_portion + destination_portion
    return destination

This fails because it tries to access destination.shape[3], after doing a movedim(-1, 1), and iterate through the channels. This results in an index out of range error, because it cannot access channel 3 in a 1 channel image.

If you take a mask and convert it to an image with PIL (or even just output it from a variable that has RETURN_TYPE = {("IMAGE",),}, you will get a tensor(H,W) back after you convert it back to a tensor. This would display in the PreviewImage node perfectly fine, when coming from a node with "IMAGE" as the output type.

After discussion on discord, we agreed that if something outputs as an "IMAGE" and previews properly, it should behave properly when put into another node with an "IMAGE" input.

However, since the composite function used by multiple default nodes required both images to be 3 channel to work, it was decided that forcing HW to BHW3 so that it will work with default nodes that handle images

In the process of implementing this, I realized that there is not an actual type system in place to make sure that 'IMAGE' even outputs a torch.Tensor. It can output absolutely anything it wants, so this PR has gotten a bit broader and is now checking for type validation, allowing the injection of custom validation functions, and the creation of custom types, that conform to the expected type structure so that nodes that work with that type, will get what they are expecting as input, and outputting the type that is expected.

I edited the initial comment on the PR to reflect the current state of it.

shawnington avatar Aug 03 '24 16:08 shawnington

So this all started because in comfy_extras/nodes_mask.py the composite function requires both images to be 3 channel

which is why my argument is still let the node, itself, convert. You have a compound type that can be 1, 3 or 4... three "types". There should only exist a CONVERSION function for the API users to "convert this 'image input' to the target I want".

Node authors are the ones with the burden of types.

How else is there going to be support for ANYTYPE? Is the core going to ignore it and continue with the current hack -- which means I literally have to type cast it myself?

I don't feel that is a good design and at some point in order to support an "anytype" you would still need a CONVERSION function to be called to cast the input type into the appropriate thing -- why make everyone do this over and over.

All the work done in my parse_dynamic, parse_value and parse_parameter functions does the casting for the core types, mine and other types from authors like Kijai.

As always, don't take the sh!tty code I made, take the concept. The concept covers all current and future use-cases and doesn't blow things up every time someone has a bug about what a type should exist or be supported.

Amorano avatar Aug 03 '24 17:08 Amorano

So this all started because in comfy_extras/nodes_mask.py

I see, I didn't get it was about the "internal" methods, I don't use most of them, but maybe will after this change. I'm going to try the PR now

melMass avatar Aug 04 '24 13:08 melMass