
Chunking input text


I wanted to experiment with large-text summarization (around 7,800 tokens). However, since Ollama supports only a 2048-token context, the input needs to be processed in chunks. Currently, AITK does not support this functionality.

Feature Request:
Would it be possible to introduce a mechanism to automatically split and process large inputs in chunks? This would enable handling longer texts efficiently within the current token limit.
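For illustration, the splitting step could look roughly like the sketch below. It approximates token counts by word counts (a simplification; a real implementation would use the target model's tokenizer) and keeps a small overlap between chunks so context is not lost at chunk boundaries. The helper name and parameter values are just placeholders.

```python
def chunk_text(text: str, max_tokens: int = 1800, overlap: int = 100):
    """Split text into word-based chunks that fit a ~2048-token window.

    Word count is used as a rough proxy for token count; a real
    implementation would use the model's own tokenizer.
    """
    words = text.split()
    step = max_tokens - overlap
    for start in range(0, len(words), step):
        yield " ".join(words[start:start + max_tokens])
```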

Use Case:

  • Enables summarization of large documents beyond the 2048-token limit.
  • Improves usability for tasks requiring long-context understanding.

Possible Solutions:

  • Implement an internal chunking mechanism to process the text in smaller segments.
  • Provide an option to handle and merge the chunked outputs into a coherent summary (a rough map-reduce sketch follows this list).
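As a rough illustration of the second point, the sketch below summarizes each chunk against a local Ollama server and then merges the partial summaries in a final pass (a simple map-reduce). It reuses the `chunk_text` helper sketched above; the model name and prompts are placeholders, while `/api/generate` with `"stream": False` is Ollama's standard non-streaming generation endpoint. This is only one possible approach under those assumptions, not how AITK would necessarily implement it.

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "phi3"  # placeholder; any locally pulled Ollama model


def generate(prompt: str) -> str:
    """Call the local Ollama generate endpoint and return the full response text."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": MODEL, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]


def summarize_long_text(text: str) -> str:
    # Map: summarize each chunk independently.
    partial = [
        generate(f"Summarize the following text:\n\n{chunk}")
        for chunk in chunk_text(text)
    ]
    # Reduce: merge the partial summaries into one coherent summary.
    merged = "\n\n".join(partial)
    return generate(
        f"Combine these partial summaries into one coherent summary:\n\n{merged}"
    )
```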

Let me know if this aligns with AITK's roadmap or if there's an alternative approach for handling this. Thanks!

pramodazad avatar Feb 22 '25 14:02 pramodazad

Thank you for contacting us! Any issue or feedback from you is quite important to us. We will do our best to fully respond to your issue as soon as possible. Sometimes additional investigation may be needed; we will usually get back to you within 2 days by adding comments to this issue. Please stay tuned.

Hi @pramodazad, thank you for your input and for using AI Toolkit. We will consider this feature request in future iterations and keep you posted.

MuyangAmigo avatar Feb 25 '25 07:02 MuyangAmigo