Bruce D'Amora

14 comments by Bruce D'Amora

In the cluster-wide proxy object:
spec:
  httpProxy: http://10.0.5.50:3128
  httpsProxy: http://10.0.5.50:3128
  noProxy: foccluster.com,foc.foccluster.com
  trustedCA:
    name: ""
It is not set correctly in the driver DaemonSet. We added 172.30.0.1 since it was trying...
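To illustrate why an address such as 172.30.0.1 might need its own entry, here is a minimal Python sketch of checking a host against the `noProxy` value quoted above. The matching rule (exact match, or domain suffix on a label boundary) is a simplifying assumption and not the exact logic used by OpenShift, CRI-O, or any particular HTTP client. The function name `is_bypassed` and the hostname `api.foc.foccluster.com` are hypothetical.

```python
def is_bypassed(host: str, no_proxy: str) -> bool:
    """Return True if host matches an entry in a comma-separated noProxy list.

    Simplified rule (an assumption, not a real client's implementation):
    an entry matches the host exactly, or as a domain suffix on a label
    boundary. CIDR ranges and wildcards are not handled.
    """
    host = host.strip().lower().rstrip(".")
    for entry in no_proxy.split(","):
        entry = entry.strip().lower().lstrip(".")
        if not entry:
            continue  # skip empty entries such as a trailing comma
        if host == entry or host.endswith("." + entry):
            return True
    return False


# noProxy value taken from the proxy object quoted in the comment.
no_proxy = "foccluster.com,foc.foccluster.com"

# Hypothetical in-cluster hostname: covered by the domain suffix.
print(is_bypassed("api.foc.foccluster.com", no_proxy))      # True
# The IP address is not covered by either domain entry.
print(is_bypassed("172.30.0.1", no_proxy))                  # False
# After adding the address explicitly, it bypasses the proxy.
print(is_bypassed("172.30.0.1", no_proxy + ",172.30.0.1"))  # True
```

Under this simplified rule, a bare IP address never matches a domain entry, so traffic to it would go through the proxy unless the address is listed explicitly.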

@kpouget kind: ClusterServiceVersion name: gpu-operator-certified.v1.7.0 namespace: openshift-operators - apiVersion: operators.coreos.com/v1 kind: OperatorCondition name: gpu-operator-certified.v1.7.0 namespace: openshift-operators

The pod failed to pull the image:
Events:
  Type    Reason          Age   From               Message
  ----    ------          ----  ----               -------
  Normal  Scheduled       92s   default-scheduler  Successfully assigned gpu-operator-resources/cluster-entitled-build-pod to smicro06
  Normal  AddedInterface  91s   multus             Add...

Same issue as before:
Events:
  Type    Reason          Age   From               Message
  ----    ------          ----  ----               -------
  Normal  Scheduled       51s   default-scheduler  Successfully assigned gpu-operator-resources/cluster-entitled-build-pod to smicro06
  Normal  AddedInterface  49s   multus             Add eth0 [10.129.2.224/23]...

oc logs cluster-entitled-build-pod
Updating Subscription Management repositories.
Unable to read consumer identity
Subscription Manager is operating in container mode.
Red Hat Enterprise Linux 8 for x86_64 - BaseOS 0.0 B/s...

Why did this start with 4.7.9? The GPU install was working before that, and nothing has changed regarding the cluster DNS, DHCP, or HTTP servers.

> @damora any further update on this? Were you able to verify entitlements and get driver install working?
Still trying to debug this. It appears that the running container cannot...

The error we are seeing is:
state:
  waiting:
    message: 'rpc error: code = Unknown desc = error pinging docker registry nvcr.io: Get "https://nvcr.io/v2/": Method Not Allowed'
    reason: ErrImagePull
Seems to...

> @damora image pull will happen before the container is started, with this error looks like kubelet/CRI-O is not able to pull the image in this case to launch driver...

@shivamerla the error is occurring in this container:
nvidia-driver-daemonset-2t6fr   0/1   ImagePullBackOff   0   55m
So are you saying that it is not trying to pull the image to start this container?