transformer-deploy

[Question] Documentation for generative model API and parameters?

Open • tanmayb123 opened this issue 3 years ago • 1 comment

I can't find any documentation on how to specify parameters such as max generation length, stop tokens, temperature, etc., for decoder-based models like GPT-2. Currently my API requests generate only a single token, and I'd like to generate more (preferably up until a specified stop token).

tanmayb123, Aug 20 '22 20:08
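
As a possible client-side workaround for the single-token behaviour described above, the request can be repeated in a loop until a stop token or a length limit is reached. The sketch below is not part of transformer-deploy: `generate`, `next_token`, and `fake_next_token` are hypothetical names, and `next_token` stands in for whatever call sends the current token ids to the server and returns one new token id.

```python
from typing import Callable, List, Optional


def generate(
    prompt_ids: List[int],
    next_token: Callable[[List[int]], int],
    max_new_tokens: int = 20,
    stop_token_id: Optional[int] = None,
) -> List[int]:
    """Repeatedly call a single-token predictor until a stop token or the length limit."""
    ids = list(prompt_ids)  # copy so the caller's list is not mutated
    for _ in range(max_new_tokens):
        # Hypothetical: one API request that returns exactly one token id.
        token = next_token(ids)
        if stop_token_id is not None and token == stop_token_id:
            break  # stop token reached; it is not appended to the output
        ids.append(token)
    return ids


def fake_next_token(ids: List[int]) -> int:
    """Stand-in predictor for illustration only: returns the last id plus one."""
    return ids[-1] + 1
```

For example, `generate([1], fake_next_token, max_new_tokens=5, stop_token_id=4)` stops as soon as the stand-in predictor produces `4`. Sampling parameters such as temperature cannot be handled this way unless the server returns logits rather than a token id.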

@tanmayb123 Currently, we are not planning to expose those parameters. You can try either adding the parameters through Triton or passing the desired parameters as JSON.

ayoub-louati, Aug 31 '22 15:08
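
One way to read "passing the desired parameters as JSON" is to place them in the request body sent to Triton's HTTP inference endpoint, which accepts an optional top-level `parameters` object alongside `inputs`. The sketch below only builds such a body. The helper name `build_infer_request`, the input name `TEXT`, the shape `[1]`, and the parameter names are assumptions for illustration, and the deployed model would still need code that reads these parameters for them to have any effect.

```python
import json
from typing import Any, Dict


def build_infer_request(
    text: str,
    parameters: Dict[str, Any],
    input_name: str = "TEXT",  # assumption: depends on the deployed model's config
) -> str:
    """Build a JSON inference request body carrying custom generation parameters."""
    body = {
        "inputs": [
            {
                "name": input_name,
                "shape": [1],          # assumption: a single string per request
                "datatype": "BYTES",   # Triton's datatype for string inputs
                "data": [text],
            }
        ],
        # Request-level parameters; values should be strings, numbers, or booleans.
        "parameters": parameters,
    }
    return json.dumps(body)


# Hypothetical parameter names; the server side must be written to honour them.
payload = build_infer_request(
    "Once upon a time",
    {"max_length": 64, "temperature": 0.8, "stop_token": "\n"},
)
```

The resulting string would be sent as the body of a POST to the model's inference endpoint (`/v2/models/<model_name>/infer` on the Triton HTTP port).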