opencensus-go-exporter-stackdriver

Stats: Flaky errors when exporting to Stackdriver

Open yanweiguo opened this issue 7 years ago • 6 comments

I'm using the OpenCensus Stackdriver exporter in a container running on GKE. I use cloud.google.com/go/compute/metadata to get the ProjectID and pass it to the exporter. Sometimes I get the following errors when my container starts running in a pod.

2019/02/14 21:10:04 Failed to export to Stackdriver: context deadline exceeded
2019/02/14 21:10:04 Failed to export to Stackdriver: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: authentication handshake failed: read tcp 10.32.4.155:48788->74.125.129.95:443: read: connection reset by peer"

If I delete the pod and let GKE recreate it, without changing anything else, it sometimes works again.

Why would I get these errors on some pod starts but not on others?

yanweiguo avatar Feb 14 '19 21:02 yanweiguo

@yanweiguo it seems the underlying gRPC layer is not able to establish a connection with the server. There could be a number of reasons why it cannot establish the connection, but I highly doubt that it is related to go-exporter.

rghetia avatar Feb 15 '19 23:02 rghetia

> @yanweiguo it seems the underlying gRPC layer is not able to establish a connection with the server. There could be a number of reasons why it cannot establish the connection, but I highly doubt that it is related to go-exporter.

If it is related to go-exporter, any ideas to debug and fix it?

yanweiguo avatar Feb 16 '19 01:02 yanweiguo

You could turn on debug logging for gRPC by setting the environment variables GRPC_GO_LOG_VERBOSITY_LEVEL=99 and GRPC_GO_LOG_SEVERITY_LEVEL=info.
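
As far as I know, grpc-go reads these two variables when its logging package initializes, so they need to be in the process environment before the binary starts; calling os.Setenv from inside main is probably too late. On GKE the usual place is the container's env section in the pod spec. For a quick local check, here is a minimal sketch of a wrapper that launches a binary with both variables set. The helper name grpcDebugEnv and the wrapper itself are made up for illustration and are not part of the exporter or of grpc-go.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// grpcDebugEnv (hypothetical helper) returns a copy of base with the two
// gRPC logging variables from this thread appended. When exec.Cmd.Env has
// duplicate keys, the last value wins, so these override inherited values.
func grpcDebugEnv(base []string) []string {
	env := make([]string, 0, len(base)+2)
	env = append(env, base...)
	return append(env,
		"GRPC_GO_LOG_VERBOSITY_LEVEL=99",
		"GRPC_GO_LOG_SEVERITY_LEVEL=info",
	)
}

func main() {
	if len(os.Args) < 2 {
		// Nothing to launch: just explain how to call the wrapper.
		fmt.Println("usage: grpcdebug <binary> [args...]")
		return
	}

	// Start the target binary with the debug variables already in place.
	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.Env = grpcDebugEnv(os.Environ())
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr

	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "child exited with error:", err)
		os.Exit(1)
	}
}
```

Build it and run it with the path to your own binary as the first argument. The child's gRPC logs should then include connection-level detail around the "Failed to export to Stackdriver" lines.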

rghetia avatar Feb 16 '19 15:02 rghetia

I also get these pretty frequently...

2019-04-10 14:26:20.583 WEST 2019/04/10 13:26:20 Failed to export to Stackdriver: rpc error: code = DeadlineExceeded desc = context deadline exceeded

edevil avatar Apr 10 '19 13:04 edevil

I'm also getting these in my logs:

2019/06/13 14:46:53 stackdriver.go:420: Failed to export to Stackdriver: rpc error: code = DeadlineExceeded desc = context deadline exceeded

Based on this post, it seems to be a code issue:

https://discourse.drone.io/t/fix-grpc-deadlineexceeded-error/2884

subwiz avatar Jun 13 '19 16:06 subwiz

Has anyone managed to figure out what the underlying issue is? I have been seeing these every couple of days. The link above no longer works.

hammadzz avatar Jan 10 '23 10:01 hammadzz