Feature: Different embedding models for vectorDB creation and queries.
Validations
- [X] I believe this is a way to improve. I'll try to join the Continue Discord for questions
- [X] I'm not able to find an open issue that requests the same enhancement
Problem
Currently, the nemo-retriever architecture proposes a method to generate different embeddings based on whether the input is a passage (during vectorDB creation) or a query (during RAG over a preexisting vectorDB). This is currently not supported by the BaseEmbeddingProvider.
Reference: https://docs.nvidia.com/nim/nemo-retriever/text-embedding/latest/overview.html
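For reference, the endpoint distinguishes the two cases via the input_type field of the request body; a minimal sketch, using the same endpoint payload and model as the workaround below (chunks and query are placeholders):

const chunks = ['document chunk one', 'document chunk two'];
const query = 'how is the index built?';

// During vectorDB creation: embed document chunks as passages.
const passageBody = JSON.stringify({
  input: chunks,
  input_type: 'passage',
  model: 'nvidia/nv-embedqa-mistral-7b-v2',
});

// During RAG over the existing vectorDB: embed the user query.
const queryBody = JSON.stringify({
  input: [query],
  input_type: 'query',
  model: 'nvidia/nv-embedqa-mistral-7b-v2',
});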
This is standard practice for most embedding models; LangChain, for example, implements two functions, one for documents (passages) and another for queries.
Reference: https://python.langchain.com/v0.1/docs/modules/data_connection/text_embedding/
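For illustration, a minimal sketch of the two functions in LangChain's JS/TS API (the model name here is only an example):

import { OpenAIEmbeddings } from '@langchain/openai';

const embeddings = new OpenAIEmbeddings({ model: 'text-embedding-3-small' });

// One function embeds documents/passages for indexing...
const docVectors = await embeddings.embedDocuments(['chunk one', 'chunk two']);

// ...and a separate one embeds a single query for retrieval.
const queryVector = await embeddings.embedQuery('what is in chunk one?');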
Workaround (WAR) to use NVIDIA NIM for embedding generation
The following can be added to ~/.continue/config.ts
export function modifyConfig(config: Config): Config {
  config.embeddingsProvider = {
    id: 'nvidia-embeddings-provider',
    providerName: 'openai',
    maxChunkSize: 2048,
    embed: async (chunks: string[]) => {
      if (chunks.length === 0) {
        console.log('No chunks to embed');
        return []; // or throw an error, depending on your requirements
      }
      const apiKey = '<YOUR API KEY>';
      const url = 'https://integrate.api.nvidia.com/v1/embeddings/';
      const data = JSON.stringify({
        input: chunks,
        input_type: 'query', // every chunk is embedded as a query; see the note below
        model: 'nvidia/nv-embedqa-mistral-7b-v2',
      });
      const options = {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${apiKey}`,
          'Content-Type': 'application/json',
          'api-key': apiKey,
        },
        body: data,
      };
      try {
        const response = await fetch(url, options);
        const responseData = await response.json();
        // The OpenAI-compatible response returns one embedding per input chunk.
        const embeddings = responseData.data.map(
          (item: { embedding: number[] }) => item.embedding,
        );
        console.log(embeddings);
        return embeddings;
      } catch (error) {
        console.error('Error:', error);
        throw error;
      }
    },
  };
  return config;
}
input_type: 'query',
The line above forces every embedding, including the passages embedded during vectorDB creation, to be of type query, which leads to suboptimal retrieval performance.
Solution
Whenever the embeddings provider's embed function is called, it needs to distinguish between a passage call (vectorDB creation) and a query call (retrieval); a sketch of what this could look like follows.
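A minimal sketch, assuming a hypothetical inputType parameter on embed (the names are illustrative, not Continue's current API):

type EmbedInputType = 'passage' | 'query';

interface EmbeddingsProviderWithInputType {
  id: string;
  providerName: string;
  maxChunkSize: number;
  // 'passage' during vectorDB creation, 'query' during RAG retrieval.
  embed(chunks: string[], inputType: EmbedInputType): Promise<number[][]>;
}

The indexer would then call embed(chunks, 'passage') while retrieval calls embed([query], 'query'), and the provider could forward the hint as the input_type field of the request.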
@adarshdotexe another workaround is to use the baai/bge-m3 model, which is symmetric and does not require the input_type=passage/query prompting
"embeddingsProvider": {
"provider": "openai",
"model": "baai/bge-m3",
"apiBase": "https://integrate.api.nvidia.com/v1",
"apiKey": "nvapi-..."
}
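Because bge-m3 is symmetric, the same request shape serves both indexing and querying; a minimal sketch against the OpenAI-compatible endpoint above (apiKey and the function name are placeholders):

const apiKey = 'nvapi-...';

// With a symmetric model there is no input_type field to get wrong:
// passages and queries go through the identical request.
async function embedSymmetric(texts: string[]): Promise<number[][]> {
  const response = await fetch('https://integrate.api.nvidia.com/v1/embeddings', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ model: 'baai/bge-m3', input: texts }),
  });
  const json = await response.json();
  return json.data.map((d: { embedding: number[] }) => d.embedding);
}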
which is better for code
Asymmetric models will have better accuracy when used in an asymmetric way.
As for the accuracy on code vs other datasets, that's a feature of the model itself.
Can you suggest a model that gets your attention that I can run locally? I hear voyage-code-2 is cool.
This issue hasn't been updated in 90 days and will be closed after an additional 10 days without activity. If it's still important, please leave a comment and share any new information that would help us address the issue.
This issue was closed because it wasn't updated for 10 days after being marked stale. If it's still important, please reopen + comment and we'll gladly take another look!