Anderson

Results 2 issues of Anderson

Clear cuda cache so user don't need to restart program when running out of vram.

效果真的又快又好,打算日常使用,所以增加一个提供 OpenAI 兼容的推理服务接口。