Matagi1996 comments

Results: 11 comments of Matagi1996

Layers in Swin Transformer

To enlarge the receptive field, the window partition shifts in every second block inside a stage. This way, information can be exchanged gradually between the window-partitioned chunks of tokens with each...
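
As a rough illustration of the alternating partition described above, the sketch below labels every token of a small grid with its window index, once for a regular partition and once for a shifted one. The helper name `window_ids`, the 8x8 grid, and the window size of 4 are assumptions chosen for the example; the attention mask that real implementations apply to wrapped-around tokens is left out.

```python
def window_ids(height, width, window, shift=0):
    """Return a grid where each entry is the window index of that token.

    shift > 0 emulates a cyclic shift: token (r, c) is treated as if it
    sat at ((r - shift) % height, (c - shift) % width) before the grid
    is cut into non-overlapping windows.
    """
    windows_per_row = width // window
    grid = []
    for r in range(height):
        row = []
        for c in range(width):
            rr = (r - shift) % height
            cc = (c - shift) % width
            row.append((rr // window) * windows_per_row + cc // window)
        grid.append(row)
    return grid


# Block 1: regular partition. Block 2: partition shifted by half a window.
regular = window_ids(8, 8, window=4)
shifted = window_ids(8, 8, window=4, shift=2)

# Tokens (3, 3) and (4, 4) are neighbours but sit in different windows
# under the regular partition; the shifted partition groups them together.
print(regular[3][3], regular[4][4])
print(shifted[3][3], shifted[4][4])
```

Tokens that cannot attend to each other in one block end up in a shared window in the next, which is how information crosses window borders over successive blocks.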

GPU utilisation drop between epochs

You could log your actual disk read/write speed and see whether your DataLoaders are I/O bound. I had this issue as well when loading big images from a local drive while...
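
A related check, sketched here under assumptions, is to time how long the training loop waits for each batch versus how long it spends computing. This does not measure disk throughput directly; it only shows whether waiting on data dominates. The function name `measure_wait` and the `step` callback are hypothetical, and any iterable of batches can stand in for a DataLoader.

```python
import time


def measure_wait(batches, step, clock=time.perf_counter):
    """Run step(batch) for every batch and return (wait, compute) in seconds.

    wait    = time spent blocked on fetching the next batch
    compute = time spent inside step()
    """
    wait = 0.0
    compute = 0.0
    iterator = iter(batches)
    while True:
        t0 = clock()
        try:
            batch = next(iterator)
        except StopIteration:
            break
        t1 = clock()
        step(batch)
        t2 = clock()
        wait += t1 - t0
        compute += t2 - t1
    return wait, compute


if __name__ == "__main__":
    # Stand-in data and training step; replace with a real loader and step.
    wait, compute = measure_wait(range(1000), lambda batch: sum(range(100)))
    print(f"waiting on data: {wait:.4f}s, compute: {compute:.4f}s")
```

If the wait time is a large share of the total, the loop is likely starved by data loading. On a GPU, the compute timing is only meaningful if the step synchronizes before returning, since kernels are launched asynchronously.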

Is there only one input shape for ViT to correctly output a mask?

Hey, thank you very much for the tip about LayerNorms being the reason the results are not good in fp16. I wrote a little script using forward hooks to convert between...

torch.compile() throws exception when LigerKernel is used

I had the same problem of wanting to use Liger Kernels with torch.compile. I followed the flash attn repo / torch docs to register the Liger implementations of RMS_Norm / Swiglu (I needed those 2)...

How to get all masks directly?

The automatic mask predictor samples a grid of points and calls the decoder again and again. I have actually tried this with the ONNX model (not implemented here), but...
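
The grid sampling mentioned above can be sketched as follows. This is a minimal illustration, not the library's own code: `point_grid` and `batched` are hypothetical helpers, the points are normalized to [0, 1] and centred in their grid cells, and the decoder call itself is omitted.

```python
def point_grid(n_per_side):
    """Return n_per_side**2 (x, y) prompts, evenly spaced in [0, 1] x [0, 1]."""
    offset = 1.0 / (2 * n_per_side)  # centre each point in its grid cell
    coords = [offset + i / n_per_side for i in range(n_per_side)]
    return [(x, y) for y in coords for x in coords]


def batched(points, batch_size):
    """Yield successive chunks of points, one chunk per decoder call."""
    for start in range(0, len(points), batch_size):
        yield points[start:start + batch_size]


points = point_grid(4)
for chunk in batched(points, 6):
    # A real pipeline would scale the chunk to pixel coordinates and run
    # the mask decoder on it here; this sketch only reports the chunk size.
    print(len(chunk), "point prompts in this decoder call")
```

Each chunk corresponds to one decoder invocation on the same image embedding, which is why the number of decoder calls grows with the grid density.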

Should this Work for Hiera

The Hiera MAE decoder merges every block output with a 2D conv to the same (HW) and C of the decoder dimension while staying inside the mask units, and just sums them...

Licensing for the EU

Braindead AI regulation regarding data, safety, (you name it) is (most likely) responsible. You could vote out your governments, move to another country, or just use the models and Tencent...

Image Editor as Output Component

Thank you for the reply. Even when wrapping it in the Editor value, it will show the image but still have the "Upload Image" in the background and the cropping...

How to integrate Mask2Former with DinoV3 backbone?

Does it even make sense to use different layer outputs of the backbone for fine-tuning? My intuition for ViTs was that the feature maps are refined layer by layer instead...

Does the agent actually work? Regarding the limitations of the prompt.

I tested it on some sample tasks of my own with Qwen3VL-8B (as in the notebook). The implementation seems to work sometimes, but is really brittle in practice. The system prompt, a...