Do you have plans to support AWS services?

Open tonyqus opened this issue 2 years ago • 3 comments

For example, content storage with AWS S3?

tonyqus avatar Oct 15 '23 19:10 tonyqus

Hi @tonyqus, we might not have any short-term plans, but the memory builder allows plugging in custom implementations, so I would suggest developing this and similar extensions as NuGet packages, adding the corresponding With...() extension methods.

dluc avatar Oct 19 '23 23:10 dluc

how about the TextGenerations? i was able to reuse my custom classes that i use with SK. that's nice that the interfaces are the same but i'm getting some weirdness. at each memory.ImportDocumentAsync, i see the embedding results from my GenerateEmbeddingsAsync implementation. i'm assuming the kernelMemory has some built-in storage behind the scenes because i'm not explicitly calling it here like i would in SK. so my questions are:

  1. why do i need to have .WithCustomTextGeneration(new AmazonTitanTextGeneration())? if i remove this line, i'll get an error telling me i don't have an ITextGeneration registered
  2. await memory.AskAsync seems to fire GenerateEmbeddingsAsync again. why? is this method also supposed to search the storage?
  3. why am i getting INFO NOT FOUND?

i implemented it like this:

    var memory = new KernelMemoryBuilder()
        .WithCustomTextGeneration(new AmazonClaudeV2TextGeneration())
        .WithCustomEmbeddingGeneration(new AmazonTitanTextEmbeddingGeneration())
        .Build();

    await memory.ImportDocumentAsync("context1.pdf");
    await memory.ImportDocumentAsync("context2.pdf");
    await memory.ImportDocumentAsync("context3.pdf");

    var answer1 = await memory.AskAsync("question about context");

but i get the error: INFO NOT FOUND

curlyfro avatar Nov 11 '23 15:11 curlyfro

hi @curlyfro

> at each memory.ImportDocumentAsync, i see the embedding results from my GenerateEmbeddingsAsync implementation

That's correct. When importing a document, the text is extracted, chunked, and for each chunk an embedding is generated with AmazonTitanTextEmbeddingGeneration and saved. Since you're not specifying a vector DB storage, the vectors are stored in RAM, hence they are lost after the program ends.

> why do i need to have .WithCustomTextGeneration(new AmazonTitanTextGeneration())?

The text generator is used in 2 places:

  1. To summarize the document, if you are using summaries. Summaries were enabled by default until a couple of versions ago. In the latest versions a document is summarized only if you pass the corresponding step, and it looks like you are not.
  2. To answer questions. An answer is generated using the RAG pattern, which involves a call to a text generator.

> await memory.AskAsync seems to fire GenerateEmbeddingsAsync again. why? is this method also supposed to search the storage?

That's correct. The system generates an embedding of the question, in order to find similar chunks. The question is passed to the embedding generator, and then the system searches the vector DB for similar embeddings.

> why am i getting INFO NOT FOUND?

It really depends on the content of your files and the question. Here are some possibilities:

  • The files may not contain an answer to the question. That's the usual reason.
  • There could also be a bug. What happens if you try with OpenAI?
  • How good are Amazon Titan embeddings at capturing the meaning of documents and the meaning of questions? Maybe Titan needs some special settings; I'm not familiar with it.
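
To make the flow above concrete, here is a minimal, self-contained sketch of the idea. It is not Kernel Memory's actual code. It shows import (embed each chunk, keep the vectors in RAM), ask (embed the question, search by cosine similarity), and how a "not found" result can arise when nothing is similar enough. Everything in it is an assumption made for illustration: the toy word-count Embed function (a stand-in, not Titan), the Cosine, Import and Ask helpers, the vocabulary, and the 0.5 relevance threshold. A real RAG pipeline would also pass the matching chunks to the text generator instead of returning the best chunk directly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical toy embedding: counts occurrences of a few fixed words.
// A real embedding generator (OpenAI, Titan, ...) would be called here instead.
string[] vocabulary = { "s3", "bucket", "storage", "embedding", "vector", "invoice" };

float[] Embed(string text)
{
    var words = text.ToLowerInvariant()
        .Split(new[] { ' ', '.', ',', '?', '!' }, StringSplitOptions.RemoveEmptyEntries);
    return vocabulary.Select(v => (float)words.Count(w => w == v)).ToArray();
}

// Cosine similarity; returns 0 when either vector is all zeros.
double Cosine(float[] a, float[] b)
{
    double dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
    }
    return (na == 0 || nb == 0) ? 0 : dot / (Math.Sqrt(na) * Math.Sqrt(nb));
}

// In-RAM "vector DB": its content is lost when the process exits.
var store = new List<(string Text, float[] Vector)>();

// Import: one embedding call per chunk, then save text + vector.
void Import(IEnumerable<string> chunks)
{
    foreach (var chunk in chunks) store.Add((chunk, Embed(chunk)));
}

// Ask: embed the question, then look for the most similar stored chunk.
string Ask(string question, double minRelevance = 0.5)
{
    var q = Embed(question);
    var best = store
        .Select(e => (e.Text, Score: Cosine(q, e.Vector)))
        .Where(e => e.Score >= minRelevance)
        .OrderByDescending(e => e.Score)
        .FirstOrDefault();
    // No chunk passed the threshold: nothing to ground an answer on.
    return best.Text ?? "INFO NOT FOUND";
}

Import(new[]
{
    "Files are kept in an S3 bucket for storage.",
    "Each chunk gets an embedding vector."
});

Console.WriteLine(Ask("Where is the storage bucket?"));
Console.WriteLine(Ask("What is the weather today?"));
```

In this sketch the second question shares no vocabulary with the stored chunks, so nothing passes the threshold. The same thing can happen in a real setup when the embeddings of the question and of the document chunks are not close enough.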

dluc avatar Nov 12 '23 21:11 dluc