Installations stuck in Unknown phase
This is a continuation of the issue: https://github.com/getporter/operator/issues/285
I tried adding `installationServiceAccount: porter-agent` to my custom agent config (customagent).
After that, all my installations go into the Unknown phase and no pods are created. I rolled back the change, but pods are still not being created, and I no longer even see the unauthorized error.
What could be the issue, and where can I find logs that would help me troubleshoot this? Here is one of the affected installations:
Name:         azuredep-10
Namespace:    unifieddeployment
Labels:       <none>
Annotations:  <none>
API Version:  getporter.org/v1
Kind:         Installation
Metadata:
  Creation Timestamp:  2023-10-29T11:29:41Z
  Finalizers:
    getporter.org/finalizer
  Generation:        1
  Resource Version:  571851
  UID:               af593a0f-ae60-47dc-86b1-7ead0f329975
Spec:
  Agent Config:
    Name:  customagent
  Bundle:
    Repository:  crporterpoc.azurecr.io/porter-hello
    Version:     v0.1.0
  Credential Sets:
    azurecredsetnew5
  Name:       azuredep-10
  Namespace:  unifieddeployment
  Parameters:
    Location:                EastUs2
    storage_account_name:    porterdemo10
    storage_container_name:  container001
    storage_rg:              porterdemo10
  Schema Version:  1.0.2
Status:
  Action:
    Name:  azuredep-10-bq26w
  Observed Generation:  1
  Phase:                Unknown
Events:                 <none>
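Since the describe output shows no events, one way to dig further might be to look at the AgentAction named under Status and at the AgentConfig the installation references. The following is only a minimal sketch: it assumes the operator's CRDs are registered as `agentactions.getporter.org` and `agentconfigs.getporter.org` (not confirmed in this thread), and the helper name `inspect_agent_resources` is hypothetical.

```shell
# Namespace taken from the describe output above.
NS=unifieddeployment

# Hypothetical helper: show the AgentAction an Installation dispatched,
# the AgentConfig it resolved, and recent events in the namespace.
# The resource names below are assumptions about how the CRDs are registered.
inspect_agent_resources() {
  action="$1"       # e.g. the name under Status > Action
  agentconfig="$2"  # e.g. the name under Spec > Agent Config
  kubectl describe agentactions.getporter.org "$action" -n "$NS"
  kubectl get agentconfigs.getporter.org "$agentconfig" -n "$NS" -o yaml
  kubectl get events -n "$NS" --sort-by=.lastTimestamp
}
```

For the installation above this would be called as `inspect_agent_resources azuredep-10-bq26w customagent`.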
I just noticed that the porter controller has gone into the CrashLoopBackOff state. How do I bring it back to a running state? I am not finding any meaningful logs that would help.
`kubectl logs porter-operator-controller-manager-744d4cc48f-92466 -n porter-operator-system
Defaulted container "kube-rbac-proxy" out of: kube-rbac-proxy, manager
I1029 10:35:58.833293 1 main.go:186] Valid token audiences:
I1029 10:35:58.833604 1 main.go:232] Generating self signed cert as no cert is provided
I1029 10:35:59.422029 1 main.go:281] Starting TCP socket on 0.0.0.0:8443
I1029 10:35:59.422728 1 main.go:288] Listening securely on 0.0.0.0:8443
Some logs
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d 2023-10-29T16:38:25Z ERROR Reconciler error {"controller": "agentaction", "controllerGroup": "getporter.org", "controllerKind": "AgentAction", "AgentAction": {"name":"hello-llama5-wwds4","namespace":"unifieddeployment"}, "namespace": "unifieddeployment", "name": "hello-llama5-wwds4", "reconcileID": "aeabfc95-5d36-4b31-8018-fb8d0f7b8631", "error": "resolved agent configuration is not ready to be used. Waiting for the next retry", "errorVerbose": "resolved agent configuration is not ready to be used. Waiting for the next retry\nget.porter.sh/operator/controllers.(*AgentActionReconciler).resolveAgentConfig\n\t/workspace/controllers/agentaction_controller.go:535\nget.porter.sh/operator/controllers.(*AgentActionReconciler).runPorter\n\t/workspace/controllers/agentaction_controller.go:185\nget.porter.sh/operator/controllers.(*AgentActionReconciler).Reconcile\n\t/workspace/controllers/agentaction_controller.go:95\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594"}
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d 2023-10-29T16:38:25Z INFO controllers.Installation Copied status from agent action {"installation": "azuredep-66", "namespace": "unifieddeployment", "resourceVersion": "562522", "generation": 1, "observedGeneration": 1, "agentaction": "azuredep-66-ngwtn", "action": "azuredep-66-ngwtn", "phase": "Failed", "conditions": ["Scheduled", "Started", "Failed"]}
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d 2023-10-29T16:38:25Z DEBUG controllers.Installation Reconciliation complete: A porter agent has already been dispatched. {"installation": "azuredep-66", "namespace": "unifieddeployment", "resourceVersion": "562522", "generation": 1, "observedGeneration": 1, "agentaction": "azuredep-66-ngwtn"}
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d 2023-10-29T16:38:25Z INFO controllers.Installation Reconciling installation {"installation": "azuredep-36", "namespace": "unifieddeployment", "resourceVersion": "623510", "generation": 1, "observedGeneration": 1}
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d 2023-10-29T16:38:25Z DEBUG controllers.Installation Found existing agent action {"installation": "azuredep-36", "namespace": "unifieddeployment", "resourceVersion": "623510", "generation": 1, "observedGeneration": 1, "agentaction": "azuredep-36-zv9lf", "namespace": "unifieddeployment"}
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d 2023-10-29T16:38:25Z INFO controllers.Installation Syncing AgentAction status with Installation {"installation": "azuredep-36", "namespace": "unifieddeployment", "resourceVersion": "623510", "generation": 1, "observedGeneration": 1, "agentaction": "azuredep-36-zv9lf"}
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d 2023-10-29T16:38:25Z INFO controllers.Installation Copied status from agent action {"installation": "azuredep-36", "namespace": "unifieddeployment", "resourceVersion": "623510", "generation": 1, "observedGeneration": 1, "agentaction": "azuredep-36-zv9lf", "action": "azuredep-36-zv9lf", "phase": "Unknown", "conditions": []}
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d 2023-10-29T16:38:25Z INFO controllers.Installation Patching installation status {"installation": "azuredep-36", "namespace": "unifieddeployment", "resourceVersion": "623510", "generation": 1, "observedGeneration": 1, "agentaction": "azuredep-36-zv9lf"}
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d 2023-10-29T16:38:25Z DEBUG controllers.AgentConfig Applied patch {"agent config": "customagent", "namespace": "unifieddeployment", "resourceVersion": "573420", "generation": 3, "observedGeneration": 3, "status": false, "data": "{}"}
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d 2023-10-29T16:38:25Z INFO controllers.AgentConfig Creating porter agent action {"agent config": "customagent", "namespace": "unifieddeployment", "resourceVersion": "573420", "generation": 3, "observedGeneration": 3, "status": false}
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d 2023-10-29T16:38:25Z INFO Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference {"controller": "agentconfig", "controllerGroup": "getporter.org", "controllerKind": "AgentConfig", "AgentConfig": {"name":"customagent","namespace":"unifieddeployment"}, "namespace": "unifieddeployment", "name": "customagent", "reconcileID": "c9f68e27-7be1-4d4a-b0f3-e2280646fa0d"}
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d panic: runtime error: invalid memory address or nil pointer dereference [recovered]
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d panic: runtime error: invalid memory address or nil pointer dereference
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x146bdf5]
10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d
`
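Note the `Defaulted container "kube-rbac-proxy"` line in the first command's output: without `-c`, `kubectl logs` picked the proxy sidecar rather than the `manager` container where the panic occurs. A minimal sketch of how the manager logs could be pulled directly, assuming the pod name shown above is still current (it changes whenever the pod is recreated); the helper name `manager_logs` is hypothetical.

```shell
# Namespace, pod, and container names are taken from the output above.
NS=porter-operator-system
POD=porter-operator-controller-manager-744d4cc48f-92466

# Hypothetical helper: print logs from the "manager" container only.
# Extra arguments are passed through to kubectl logs, e.g. --previous
# to read the container instance that crashed before the last restart.
manager_logs() {
  kubectl logs "$POD" -n "$NS" -c manager "$@"
}
```

With a pod in CrashLoopBackOff, something like `manager_logs --previous | grep -A 20 'panic:'` might surface the stack trace that follows the nil pointer dereference.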
Any guidance here would be a great help. @troy0820 @sgettys @schristoff
Hi @hemantkathuria - apologies that it's taken me a bit to look at this. I want you to know I'm going to be reviewing this and will get back to you soon with either follow-up questions or a path forward. Thank you for your patience and your amazingly in-depth issues :)