model_api
create_model improvements
- Implement .to(device: str), .half(), and .bf16() methods to control the execution device and the conversion to FP16 or BF16.
- These methods should work only for local inference.
- "precision" should be removed from the create_model() API. "device" can be kept.
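
A rough sketch of how the proposed interface might look from the caller's side. Everything here is hypothetical and only illustrates the shape of the API: the `LocalModel` and `RemoteModel` classes, the `precision` attribute, the "FP32" default, the "CPU"/"GPU" device strings, and the `create_model(path, device)` signature are assumptions, not the existing implementation.

```python
class LocalModel:
    """Hypothetical stand-in for a model that runs locally."""

    def __init__(self, device: str = "CPU") -> None:
        self.device = device
        self.precision = "FP32"  # assumed default, not confirmed by the source

    def to(self, device: str) -> "LocalModel":
        # Select the execution device; return self so calls can be chained.
        self.device = device
        return self

    def half(self) -> "LocalModel":
        # Request conversion to FP16.
        self.precision = "FP16"
        return self

    def bf16(self) -> "LocalModel":
        # Request conversion to BF16.
        self.precision = "BF16"
        return self


class RemoteModel:
    """Hypothetical stand-in for a model served remotely."""

    def __init__(self, url: str) -> None:
        self.url = url

    def _unsupported(self, name: str) -> None:
        raise NotImplementedError(f"{name}() is only available for local inference")

    def to(self, device: str) -> "RemoteModel":
        self._unsupported("to")

    def half(self) -> "RemoteModel":
        self._unsupported("half")

    def bf16(self) -> "RemoteModel":
        self._unsupported("bf16")


def create_model(path: str, device: str = "CPU") -> LocalModel:
    # Assumed signature: "device" is kept, "precision" is no longer a parameter.
    return LocalModel(device=device)


# Precision is chosen after creation instead of through create_model().
model = create_model("model.xml", device="CPU").to("GPU").half()
print(model.device, model.precision)
```

Returning `self` from each method keeps the calls chainable, and raising on the remote variant is one possible way to express the "local inference only" constraint.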