FasterTransformer
Int4 Support
Are there plans to add Int4 support to FasterTransformer? This would be very useful in terms of speed and memory usage.
Thank you for the suggestion. We will consider it.
Upvoting this feature request.
FasterTransformer development has transitioned to TensorRT-LLM.
Int4 (AWQ) is supported in TensorRT-LLM; please give it a try.