swiftLLM
Question about performance
Could you please share the code you used for performance testing? In my own tests, performance appears to be worse than vLLM's.
I disabled CUDA Graph (--enforce-eager) and multi-step scheduling when benchmarking vLLM. Could you please test this configuration?
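For anyone comparing numbers, here is a minimal sketch of a throughput harness that could be pointed at either engine. It is not the benchmark script used for this project; `measure_throughput` and `fake_generate` are hypothetical names made up for illustration, and the `generate` callable is assumed to take a batch of prompts and return the number of output tokens produced for each one.

```python
import time
from typing import Callable, List, Sequence

# Assumption: `generate` wraps whichever engine is under test and returns the
# number of output tokens per prompt. For vLLM, eager mode (no CUDA Graph)
# corresponds to passing enforce_eager=True when constructing vllm.LLM.
GenerateFn = Callable[[Sequence[str]], Sequence[int]]


def measure_throughput(
    generate: GenerateFn,
    prompts: Sequence[str],
    warmup: int = 1,
    repeats: int = 3,
) -> float:
    """Return the best observed output tokens per second over `repeats` runs."""
    # Warm-up runs are excluded from timing (kernel compilation, cache fill).
    for _ in range(warmup):
        generate(prompts)

    best = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        token_counts = generate(prompts)
        # Guard against a zero interval on very coarse clocks.
        elapsed = max(time.perf_counter() - start, 1e-9)
        best = max(best, sum(token_counts) / elapsed)
    return best


def fake_generate(prompts: Sequence[str]) -> List[int]:
    """Stand-in engine: pretends to emit 8 tokens per prompt after a short delay."""
    time.sleep(0.01)
    return [8 for _ in prompts]
```

Calling `measure_throughput(fake_generate, ["a", "b"])` exercises the harness without a GPU; swapping in a real wrapper for each engine, with the same prompts and sampling settings, keeps the comparison like for like.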