FasterTransformer
MPT-7B model conversion?
Hello! I'd like to know how to convert the standard MPT-7B model weights into the format FasterTransformer expects so I can run inference with them.
https://github.com/mosaicml/llm-foundry/pull/169
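For context, the linked PR adds a conversion script to llm-foundry. Below is a rough sketch of what that kind of conversion involves: load the Hugging Face checkpoint and dump each tensor as a raw binary file in a FasterTransformer-style layout. The output directory naming, the tensor renaming, and the transpose rule are assumptions based on FasterTransformer's GPT checkpoint convention; for real use, run the script from the PR instead.

```python
# Sketch of an HF-MPT -> FasterTransformer weight dump (illustrative only).
import os

import numpy as np
import torch
from transformers import AutoModelForCausalLM

MODEL_NAME = "mosaicml/mpt-7b"     # HF model id
SAVE_DIR = "mpt-7b-ft/1-gpu"       # assumed FT convention: <out>/<tensor_parallel>-gpu

model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, torch_dtype=torch.float16, trust_remote_code=True
)
os.makedirs(SAVE_DIR, exist_ok=True)

for name, param in model.state_dict().items():
    tensor = param.detach().cpu().float().numpy()
    # FT stores GEMM weights transposed relative to PyTorch's [out, in] layout.
    if tensor.ndim == 2:
        tensor = tensor.T
    # Hypothetical renaming step: the real mapping (including splitting the fused
    # QKV weight per tensor-parallel rank) lives in the PR's conversion script.
    out_name = name.replace("transformer.", "model.") + ".bin"
    tensor.astype(np.float16).tofile(os.path.join(SAVE_DIR, out_name))
```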
FasterTransformer development has transitioned to TensorRT-LLM.
MPT is supported in TensorRT-LLM. Please give it a try.
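As a starting point, here is a minimal sketch of running MPT-7B through TensorRT-LLM's high-level `LLM` API. It assumes a recent `tensorrt_llm` release that ships this API and can build an engine directly from the Hugging Face checkpoint; older releases instead use the `convert_checkpoint.py` plus `trtllm-build` flow in the MPT example directory.

```python
# Minimal TensorRT-LLM inference sketch for MPT-7B (assumes the high-level LLM API).
from tensorrt_llm import LLM, SamplingParams


def main() -> None:
    # Accepts an HF model id or a local checkpoint directory.
    llm = LLM(model="mosaicml/mpt-7b")
    params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
    for output in llm.generate(["MosaicML MPT-7B is"], params):
        print(output.outputs[0].text)


if __name__ == "__main__":
    main()
```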