We are considering official support for speeding up CodeLlama inference.
It will be released after testing.
A CodeLlama example can be found here.