introduce tracing for client-java

Open iosmanthus opened this issue 4 years ago • 1 comments

Feature Request

Is your feature request related to a problem? Please describe:

Currently, it is hard to debug a request sent from a client to a TiKV server. We can only use region/store ids and log timestamps to trace the life of a request, which is inefficient. It is also hard to tell whether two logs with the same region/store ids were produced by the same request, because clients may retry a gRPC request several times. We need a mechanism to track the lifetime of a request.

Describe the feature you'd like:

Introduce tracing for client-java and TiKV. TiKV currently has a pull request that adds the basic infrastructure for this feature: https://github.com/tikv/tikv/pull/11824.

From the client perspective, adding a tracing context to our request should be enough to track a request: https://github.com/pingcap/kvproto/blob/3fa8fa04f898c8b2f5f9b7b8576f38016cb2d5e6/proto/trace.proto#L14-L19

message TraceContext {
    // The id that is able to identify a unique request. It's usually a UUID.
    uint64 trace_id = 1;
    // The span that represents the caller's calling procedural.
    uint64 parent_id = 2;
}

The tracing could be introduced in two phases:

  1. Tracing by logs. While constructing a request in client-java, we should attach a generated UUID as the trace_id, together with parent_id = 0 (indicating that the client is the root caller of the request), to the TiKV gRPC context. In addition, the trace_id should be attached to the slow log of the request (on both the client and the server) as a property, like the region id and store id. While debugging requests, we could grep all client and server logs using the trace_id as the primary key. This process might be a bit tedious, but it works for most scenarios.

  2. Tracing by Jaeger or a similar system. Instrument client-java according to the OpenTracing standard, and collect the spans with a component such as Jaeger. This phase might require us to build an efficient tracing library in Java, similar to https://github.com/tikv/minitrace-rust or https://github.com/tikv/minitrace-go
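
As a rough illustration of phase 1, the sketch below creates a root trace context and formats a slow-log line that carries the trace_id. It is not the client-java or kvproto API: `TraceSketch`, `TraceContext`, `newRoot`, and `slowLogLine` are hypothetical names, and the log format is made up. Because trace_id is a uint64 in the proto while a UUID is 128 bits, the sketch folds the UUID into 64 bits by XOR-ing its two halves, which is an assumption about how the id could be derived.

```java
import java.util.UUID;

public class TraceSketch {
    // Hypothetical stand-in for the kvproto TraceContext message (trace_id, parent_id).
    static final class TraceContext {
        final long traceId;  // interpreted as an unsigned 64-bit value
        final long parentId; // 0 marks the root caller of the request

        TraceContext(long traceId, long parentId) {
            this.traceId = traceId;
            this.parentId = parentId;
        }
    }

    // Create the context for a new client request.
    // Assumption: the 128-bit UUID is folded into 64 bits to fit the uint64 field.
    static TraceContext newRoot() {
        UUID uuid = UUID.randomUUID();
        long traceId = uuid.getMostSignificantBits() ^ uuid.getLeastSignificantBits();
        return new TraceContext(traceId, 0L);
    }

    // Format a slow-log line that exposes trace_id next to the region and store ids,
    // so that client and server logs can be grepped by the same key.
    static String slowLogLine(TraceContext ctx, long regionId, long storeId, long elapsedMs) {
        return String.format(
            "slow request: trace_id=%s region_id=%d store_id=%d elapsed_ms=%d",
            Long.toUnsignedString(ctx.traceId), regionId, storeId, elapsedMs);
    }

    public static void main(String[] args) {
        // The same context would be reused for every retry of this request.
        TraceContext ctx = newRoot();
        System.out.println(slowLogLine(ctx, 2L, 1L, 530L));
    }
}
```

Reusing one context across gRPC retries is what would let logs from different attempts be matched to the same logical request, which region/store ids alone cannot do.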

iosmanthus, Jan 11 '22 07:01

This issue is stale because it has been open 30 days with no activity.

github-actions[bot], Feb 22 '22 00:02