Results: 2 issues of here4dadata
After an engine is built with `--gather_all_token_logits` and a call is made through the backend with `return_context_logits: True` and `return_generation_logits: True`, it seems as though to piece together the full...
question
triaged
When following the steps outlined in [the examples](https://github.com/triton-inference-server/tensorrtllm_backend/blob/v0.15.0/docs/multimodal.md) for `mllama`, we run into two issues. 1. The `cross_kv_cache_fraction` parameter is expected to be set in `tensorrt_llm/config.pbtxt`, whereas it is not...