Results 18 comments of dolpher

Maybe we should consider a stable version for milestone releases? See also #625

I'm using helm install to test this chart: https://github.com/opea-project/GenAIInfra/tree/main/helm-charts/common/tgi with a command like this: helm install tgi tgi --set LLM_MODEL_ID=ise-uiuc/Magicoder-S-DS-6.7B

Error message/pod logs: {"timestamp":"2024-08-19T05:38:39.361300Z","level":"INFO","fields":{"message":"Args {\n model_id: \"ise-uiuc/Magicoder-S-DS-6.7B\",\n revision: None,\n validation_workers: 2,\n sharded: None,\n num_shard: None,\n quantize: None,\n speculate: None,\n dtype: None,\n trust_remote_code: false,\n max_concurrent_requests: 128,\n max_best_of: 2,\n max_stop_sequences: 4,\n max_top_n_tokens:...
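Each pod log line above is a single JSON object with the newlines of the Args dump escaped as \n, which is hard to read. A minimal sketch for unpacking such a line, assuming only the timestamp/level/fields.message shape visible above; format_log_line is a hypothetical helper, and the sample is deliberately abbreviated, not the full log:

```python
import json

# Abbreviated sample in the same shape as the pod log line above.
# The Args text is cut short on purpose; it is NOT the complete output.
sample_line = (
    r'{"timestamp":"2024-08-19T05:38:39.361300Z","level":"INFO",'
    r'"fields":{"message":"Args {\n model_id: \"ise-uiuc/Magicoder-S-DS-6.7B\",'
    r'\n revision: None,\n"}}'
)


def format_log_line(line: str) -> str:
    """Decode one JSON log line and return 'timestamp [level]' plus the message.

    json.loads turns the escaped \\n sequences into real newlines, so the
    Args dump prints one field per line.
    """
    record = json.loads(line)
    message = record.get("fields", {}).get("message", "")
    return f'{record.get("timestamp", "?")} [{record.get("level", "?")}]\n{message}'


if __name__ == "__main__":
    print(format_log_line(sample_line))
```

Lines that are not valid JSON (plain-text tracebacks, for example) would raise json.JSONDecodeError here, so a real script would probably catch that and print those lines unchanged.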

We now use Helm charts to deploy and test. Manifests can be generated with "helm template" and will not be maintained.

I'll look at this issue. Could you provide more information about what problem the empty securityContexts cause? I've verified this runs fine with Gaudi-device-plugin without any special privilege settings, will...

The llm.py link should be this one, not the one in the vllm-ray directory: https://github.com/opea-project/GenAIComps/blob/main/comps/llms/text-generation/vllm/langchain/llm.py This looks like a Docker image issue, so I'm reassigning it to the owner, [email protected]

k8s deployment merged to helm charts.

@amberjain1 Which file do you mean? This link has a reference for installing the Gaudi software, but I assume you're expecting to see it elsewhere. https://github.com/opea-project/GenAIInfra/blob/main/README.md#setup-kubernetes-cluster

> @yongfengdu Is this something you will complete in October? If not, let's try to assign someone from the Hackathon

I'm not covering the benchmarking part, feel free to assign...

Looks like tgi is trying to access more locations after TDX was enabled. Could you try commenting out these lines for tgi (lines 1289-1298)? https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/kubernetes/intel/cpu/xeon/manifest/chatqna.yaml#L1289 securityContext: allowPrivilegeEscalation: false capabilities: drop: -...
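For anyone who wants to script that experiment rather than edit the manifest by hand, here is a rough sketch of commenting out a line range in a local copy of the file. comment_out_lines is a hypothetical helper, and the demo fragment is made up for illustration; it is not the real chatqna.yaml content:

```python
def comment_out_lines(text: str, first: int, last: int, prefix: str = "# ") -> str:
    """Prefix lines first..last (1-indexed, inclusive) with a YAML comment marker."""
    lines = text.splitlines(keepends=True)
    if first < 1 or last < first or last > len(lines):
        raise ValueError(f"line range {first}-{last} is outside 1-{len(lines)}")
    for i in range(first - 1, last):
        stripped = lines[i].lstrip(" ")
        indent = lines[i][: len(lines[i]) - len(stripped)]
        # Skip blank lines and lines that are already comments.
        if stripped.strip() and not stripped.startswith("#"):
            # Keep the original indentation in front of the marker.
            lines[i] = f"{indent}{prefix}{stripped}"
    return "".join(lines)


# Made-up fragment, only to show the effect on a securityContext block.
demo = (
    "containers:\n"
    "  - name: tgi\n"
    "    securityContext:\n"
    "      allowPrivilegeEscalation: false\n"
    "    resources: {}\n"
)

if __name__ == "__main__":
    print(comment_out_lines(demo, 3, 4))
```

Against the real manifest the call would be something like comment_out_lines(text, 1289, 1298), but the file on main changes over time, so check that those lines still cover the tgi securityContext block before applying it.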