Multi-Modal-Comparators
Unified API to facilitate usage of pre-trained "perceptor" models, a la CLIP
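A minimal sketch of the kind of unified wrapper this aims at. The `Perceptor` and `load_perceptor` names below are illustrative only, not this package's actual API; the example backend wraps OpenAI CLIP.

```python
# Illustrative only: `Perceptor` and `load_perceptor` are hypothetical names,
# not this package's actual API.
from dataclasses import dataclass
from typing import Callable

import torch


@dataclass
class Perceptor:
    """Everything needed to use one pretrained image/text 'perceptor' uniformly."""
    name: str
    model: torch.nn.Module   # the underlying network
    preprocess: Callable     # PIL image -> tensor transform
    tokenize: Callable       # list[str] -> token tensor

    def encode_image(self, images: torch.Tensor) -> torch.Tensor:
        return self.model.encode_image(images)

    def encode_text(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.model.encode_text(tokens)


def load_perceptor(name: str = "ViT-B/32", device: str = "cpu") -> Perceptor:
    """Example backend: wrap OpenAI CLIP behind the uniform interface."""
    import clip  # pip install clip-anytorch
    model, preprocess = clip.load(name, device=device)
    return Perceptor(name=name, model=model, preprocess=preprocess, tokenize=clip.tokenize)
```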
## installable

- [ ] https://github.com/salesforce/LAVIS
  - https://github.com/salesforce/BLIP
  - https://github.com/salesforce/ALBEF
- [ ] https://github.com/facebookresearch/multimodal
  - FLAVA
  - LateFusion
  - ALBEF
  - MDETR
  - OMNIVORE
  - video-gpt
- [ ] https://github.com/ai-forever/ru-clip ...
How to:
* get the tokenizer and preprocessor for a given clip
* get the visual and textual encoder separately
https://github.com/archinetai/surgeon-pytorch
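For reference, here is how those pieces are reached with the upstream OpenAI CLIP package (open_clip is similar); note that the text tower is spread across several attributes, which is part of what a unified API would smooth over.

```python
import clip   # OpenAI CLIP, or the pip-installable fork clip-anytorch
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# load() returns the full model plus the image preprocessor (a torchvision transform)
model, preprocess = clip.load("ViT-B/32", device=device)

# the tokenizer is a module-level function, not an attribute of the model
tokens = clip.tokenize(["a photo of a dog"]).to(device)

# the visual tower is a single submodule...
visual_encoder = model.visual  # nn.Module for images

# ...but the text tower is spread across model.transformer, model.token_embedding,
# model.positional_embedding, model.ln_final and model.text_projection, so
# encode_text() is the practical entry point for it
with torch.no_grad():
    text_features = model.encode_text(tokens)
```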
I know hardcoding it came from me, but while gradient checkpointing saves a lot of VRAM (at some speed cost) and is very useful in some use-cases, it can break things on A100...
I suspect this is the issue with clip-fa too
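A minimal sketch of making checkpointing opt-in rather than hardcoded; the `MaybeCheckpointed` wrapper and `use_checkpoint` flag are illustrative names, not this package's actual option.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint


class MaybeCheckpointed(nn.Module):
    """Wrap a block so activation checkpointing is an opt-in flag
    instead of being hardcoded on (illustrative names)."""

    def __init__(self, block: nn.Module, use_checkpoint: bool = False):
        super().__init__()
        self.block = block
        self.use_checkpoint = use_checkpoint

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.use_checkpoint and self.training and x.requires_grad:
            # recompute activations in the backward pass to save VRAM
            return checkpoint(self.block, x, use_reentrant=False)  # recent PyTorch
        return self.block(x)
```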
Hi, as part of our package for easily evaluating CLIP models (https://github.com/LAION-AI/CLIP_benchmark/issues/1) and my inference lib (https://github.com/rom1504/clip-retrieval), I'm interested in having a package like this; however, here is what's missing...
`pip install clip-anytorch` https://github.com/rom1504/CLIP
Rather than an ambiguous `_model` attribute, let's just attach an attribute that returns the objects that would be returned from the native "load" function of the particular model implementation the...
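A minimal sketch of that idea; the attribute name (`native`) and the loader plumbing are illustrative, not the package's actual implementation.

```python
import clip


class ClipWrapper:
    """Wrap an OpenAI-CLIP-style model while keeping the native load() outputs reachable."""

    def __init__(self, name: str = "ViT-B/32", device: str = "cpu"):
        # keep exactly what the native loader returned, as a tuple
        self._native = clip.load(name, device=device)  # (model, preprocess)
        self.model, self.preprocess = self._native

    @property
    def native(self):
        """The objects the underlying implementation's own load() would return,
        in the same order, instead of an ambiguous _model attribute."""
        return self._native


# usage: recover the native objects exactly as the upstream loader hands them out
wrapper = ClipWrapper()
model, preprocess = wrapper.native
```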