
refactor: Add `Request` and `Response` types for embeddings

Open cvauclair opened this issue 1 year ago • 0 comments

  • [x] I have looked for existing issues (including closed) about this

Feature Request

Bring the low-level embedding API closer to the completion API in terms of completeness and features.

Motivation

Unlike the completion API, which is neatly divided into low level types/traits (e.g. CompletionRequest and CompletionResponse types, CompletionModel trait, etc.) and high level types/traits (e.g. Chat and Prompt traits), the embeddings API has no such distinction.

Instead, the embeddings API is centered around the EmbeddingModel trait (which exposes a high level interface that is more analogous to the Prompt trait than to the low level CompletionModel trait) and the EmbeddingsBuilder, which is analogous to the CompletionRequestBuilder but has a higher level interface than its completion API counterpart.

This leads to major drawbacks:

  1. The types/traits become bloated as they need to implement both low and high level functionality
  2. Since the low level request and response types are missing, Rig gives users no way to track things like embedding model token usage, unlike the low level completion API

Proposal

  • Add EmbeddingRequest type
  • Add EmbeddingResponse type
  • Rename EmbeddingsBuilder to EmbeddingRequestBuilder
    • Change the build() method to return an EmbeddingRequest
    • Add the send() method
  • Change the EmbeddingModel to the following:
    pub trait EmbeddingModel: Clone + Send + Sync {
        /// The raw response type returned by the underlying embedding model.
        type Response: Send + Sync;
    
        /// Generates an embedding response for the given embedding request.
        fn embedding(
            &self,
            request: EmbeddingRequest,
        ) -> impl std::future::Future<Output = Result<EmbeddingResponse<Self::Response>, EmbeddingError>>
               + Send;
    
        /// Generates an embedding request builder.
        fn embedding_request(&self, prompt: &str) -> EmbeddingRequestBuilder<Self> {
            EmbeddingRequestBuilder::new(self.clone())
        }
    }
    
    • Note: This is analogous to the low level CompletionModel trait from the completion API
  • Add new Embedding trait (final name tbd)
    pub trait Embedding: Send + Sync {
        /// Generates embeddings for the given documents
        fn embedding<T: Embed + Send>(
            &self, 
            documents: impl IntoIterator<Item = T>,
        ) -> impl std::future::Future<Output = Result<Vec<(T, OneOrMany<Embedding>)>, EmbeddingError>>
               + Send;
    }
    
    • Note: This is analogous to the high level Prompt trait from the completion API
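
To make the proposed split more concrete, here is a minimal sketch of how the low level pieces could fit together. Everything in it is an assumption for illustration, not Rig's actual API: the fields of EmbeddingRequest and EmbeddingResponse, the Usage struct, the shape of EmbeddingError, the document() builder method, and the MockModel are all hypothetical. It is also synchronous so that it stays dependency-free, whereas the trait proposed above returns a future.

```rust
/// Hypothetical low level request: just the documents to embed.
#[derive(Debug, Clone, PartialEq)]
pub struct EmbeddingRequest {
    pub documents: Vec<String>,
}

/// Hypothetical token usage reported alongside the embeddings.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Usage {
    pub total_tokens: u64,
}

/// Hypothetical low level response, generic over the provider's raw response.
#[derive(Debug, Clone)]
pub struct EmbeddingResponse<R> {
    pub embeddings: Vec<Vec<f64>>,
    pub usage: Usage,
    pub raw_response: R,
}

/// Stand-in error type (assumed shape, for this sketch only).
#[derive(Debug)]
pub struct EmbeddingError(pub String);

/// Synchronous simplification of the proposed low level trait.
pub trait EmbeddingModel: Clone {
    type Response;

    fn embedding(
        &self,
        request: EmbeddingRequest,
    ) -> Result<EmbeddingResponse<Self::Response>, EmbeddingError>;

    fn embedding_request(&self) -> EmbeddingRequestBuilder<Self> {
        EmbeddingRequestBuilder::new(self.clone())
    }
}

/// Builder with both build() and send(), as listed in the proposal.
pub struct EmbeddingRequestBuilder<M: EmbeddingModel> {
    model: M,
    documents: Vec<String>,
}

impl<M: EmbeddingModel> EmbeddingRequestBuilder<M> {
    pub fn new(model: M) -> Self {
        Self { model, documents: Vec::new() }
    }

    /// Hypothetical helper that queues one document.
    pub fn document(mut self, text: &str) -> Self {
        self.documents.push(text.to_string());
        self
    }

    /// Returns the request without sending it.
    pub fn build(self) -> EmbeddingRequest {
        EmbeddingRequest { documents: self.documents }
    }

    /// Builds the request and sends it to the model.
    pub fn send(self) -> Result<EmbeddingResponse<M::Response>, EmbeddingError> {
        let model = self.model.clone();
        model.embedding(self.build())
    }
}

/// Mock model: whitespace-separated words stand in for tokens, and each
/// "embedding" is a single number (the document's byte length).
#[derive(Clone)]
pub struct MockModel;

impl EmbeddingModel for MockModel {
    type Response = String;

    fn embedding(
        &self,
        request: EmbeddingRequest,
    ) -> Result<EmbeddingResponse<String>, EmbeddingError> {
        if request.documents.is_empty() {
            return Err(EmbeddingError("no documents".to_string()));
        }
        let total_tokens: u64 = request
            .documents
            .iter()
            .map(|d| d.split_whitespace().count() as u64)
            .sum();
        let embeddings = request
            .documents
            .iter()
            .map(|d| vec![d.len() as f64])
            .collect();
        Ok(EmbeddingResponse {
            embeddings,
            usage: Usage { total_tokens },
            raw_response: format!("mock response for {} documents", request.documents.len()),
        })
    }
}
```

With something like this, a caller could write `MockModel.embedding_request().document("hello world").send()` and read `usage.total_tokens` off the response, which is the kind of tracking the current API does not allow.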

Alternatives

N/A

cvauclair avatar Oct 24 '24 22:10 cvauclair