retrieval-augmented-diffusion-models
Combining the CLIP embeddings of text and KNNs: conditional_retrieval_encoder
What is the idea behind the unimplemented conditional_retrieval_encoder here? Is it meant to be a separate encoder that combines the CLIP embeddings of the original text query with those of its k-nearest neighbors (KNNs)?
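To make the question concrete, here is a minimal sketch of the kind of combination I have in mind. It is only a guess, not the repository's implementation: the function name combine_query_and_neighbors is hypothetical, plain Python lists stand in for CLIP embeddings, and stacking the vectors into one conditioning sequence is just one possible design (a learned encoder could instead attend over or pool them).

```python
import math
from typing import List

Vector = List[float]


def l2_normalize(v: Vector) -> Vector:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def combine_query_and_neighbors(text_emb: Vector, knn_embs: List[Vector]) -> List[Vector]:
    """Hypothetical stand-in for a conditional retrieval encoder.

    Stacks the normalized text-query embedding and its k retrieved neighbor
    embeddings into one sequence of k + 1 vectors, which a diffusion model
    could then consume as conditioning (for example via cross-attention).
    """
    dim = len(text_emb)
    for nb in knn_embs:
        if len(nb) != dim:
            raise ValueError("all embeddings must share the same dimension")
    # The query comes first so that its position in the sequence is fixed.
    return [l2_normalize(text_emb)] + [l2_normalize(nb) for nb in knn_embs]


# Toy 2-dimensional vectors in place of real CLIP embeddings.
query = [3.0, 4.0]
neighbors = [[1.0, 0.0], [0.0, 2.0]]
sequence = combine_query_and_neighbors(query, neighbors)
print(len(sequence), len(sequence[0]))
```

With k = 2 neighbors this yields a sequence of 3 unit-length vectors. Is something along these lines the intended role of conditional_retrieval_encoder, or was a learned module planned?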