here4dadata

Results 2 issues of here4dadata

After an engine is built with the `--gather_all_token_logits` flag and a call is made through the backend with `return_context_logits: True` and `return_generation_logits: True`, it seems as though to piece together the full...

question
triaged

When following the steps highlighted in [the examples](https://github.com/triton-inference-server/tensorrtllm_backend/blob/v0.15.0/docs/multimodal.md) for `mllama`, we run into two issues. 1. The `cross_kv_cache_fraction` parameter is expected to be set in `tensorrt_llm/config.pbtxt`, whereas it is not...