
Reduce T5 model size and enhance performance

Open GabrielePicco opened this issue 3 years ago • 0 comments

Scenario summary

Inference with T5 models is currently slow.

Proposed solution

Investigate and implement a solution to reduce model size and speed up inference. Some ideas to consider:

  • fastT5, which reports reducing T5 model size by 3X and increasing inference speed by up to 5X: https://github.com/Ki6an/fastT5
  • How to adapt a multilingual T5 model for a single language: https://towardsdatascience.com/how-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90
  • Accelerated inference with Hugging Face Optimum: https://huggingface.co/blog/optimum-inference
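
One way to approach the single-language idea above is to prune the vocabulary: keep only the tokens that actually occur in the target-language data, then shrink the embedding matrix to those rows. The sketch below illustrates that idea in plain Python on a toy embedding table. It is not code from the linked article or from any library; `select_kept_ids` and `prune_embeddings` are hypothetical helper names, and a real T5 model would also need its tokenizer rebuilt and its output projection resized to match.

```python
from collections import Counter


def select_kept_ids(corpus_token_ids, special_ids, top_k):
    """Return the sorted old token ids to keep.

    Keeps every special id plus the top_k most frequent
    non-special token ids found in the tokenized corpus.
    """
    counts = Counter(tok for sent in corpus_token_ids for tok in sent)
    kept = set(special_ids)
    added = 0
    for tok, _ in counts.most_common():
        if added == top_k:
            break
        if tok not in kept:
            kept.add(tok)
            added += 1
    return sorted(kept)


def prune_embeddings(embeddings, kept_ids):
    """Slice the embedding rows for kept_ids and build an old->new id map."""
    old_to_new = {old: new for new, old in enumerate(kept_ids)}
    pruned = [embeddings[old] for old in kept_ids]
    return pruned, old_to_new


# Toy example (assumed data): 6-token vocabulary, 2-dimensional embeddings.
embeddings = [[float(i), float(i)] for i in range(6)]
corpus = [[3, 3, 5, 1], [3, 5, 0]]  # already-tokenized sentences
kept_ids = select_kept_ids(corpus, special_ids=[0, 1], top_k=2)
pruned, old_to_new = prune_embeddings(embeddings, kept_ids)
```

Because the embedding table is a large share of the parameters in multilingual T5 checkpoints, dropping unused rows in this way is where most of the size reduction would come from; the remapped ids in `old_to_new` are what a rebuilt tokenizer would have to emit.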

GabrielePicco • Oct 03 '22 14:10