gRPC-c probes

Open orishuss opened this issue 3 years ago • 2 comments

  1. Initialize the per-CPU variables of the gRPC-c eBPF module.
  2. Look for new processes that use the gRPC-c library, and determine the library's version from its MD5 hash. Currently, only 4 library hashes are known. In the future, we will need either a mechanism that finds hashes automatically or a way to determine the library's version dynamically.
  3. When a process using the gRPC-c library is found, attach the 6 required probes.
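
As a rough illustration of step 2, the hash-to-version lookup could look like the sketch below. This is not the code from the PR: the enum, the function names, and the digests are all hypothetical placeholders, and computing the MD5 of the library file is assumed to happen elsewhere.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical version tags; the real set of supported versions is not listed here.
enum class GrpcCVersion { kV1_19, kOther };

// Placeholder digests (NOT real library hashes), keyed by lowercase hex MD5.
inline const std::unordered_map<std::string, GrpcCVersion>& KnownHashes() {
  static const std::unordered_map<std::string, GrpcCVersion> kTable = {
      {"00000000000000000000000000000001", GrpcCVersion::kV1_19},
      {"00000000000000000000000000000002", GrpcCVersion::kOther},
  };
  return kTable;
}

// Returns the version for a known digest, or std::nullopt so that the caller
// can log "unknown MD5" and skip attaching probes to that process.
inline std::optional<GrpcCVersion> VersionFromMd5(const std::string& md5_hex) {
  const auto& table = KnownHashes();
  const auto it = table.find(md5_hex);
  if (it == table.end()) {
    return std::nullopt;
  }
  return it->second;
}
```

A table like this has to be extended by hand for every new library build, which is why the description mentions finding hashes automatically or detecting the version dynamically as future work.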

Where to start

I strongly encourage seeing the solution work first (see below). That way we can make sure everything works, including the changes from the last 2 PRs.

  • First PR: https://github.com/pixie-io/pixie/pull/415
  • Second PR: https://github.com/pixie-io/pixie/pull/432

This PR alters only 2 files, both part of the uprobe manager.

Seeing the entire gRPC-c solution work

With this code, gRPC-c library data should be visible to Stirling. To see it work, I used 2 Docker containers (client and server) running the simple "route guide" gRPC example project from the gRPC-c GitHub repository.

  1. Download the tar file and extract it. It contains a folder named grpc-python.
  2. Run the build-dockers.sh script.
  3. Run the server: docker run -it -p 50051:50051 python-grpc-poc-server-v1-19. Stirling will attach probes to this server.
  4. Run Stirling in the px-dev-docker (this can also be done before the steps above; the order does not matter). You can also run Stirling in any other environment where the server is visible to it; for me, the px-dev-docker was the simplest option.
  5. Run the client: docker run -it --network host python-grpc-poc-client. Stirling will not attach to the client, because the client's library version is unknown to Stirling (the Stirling log will also report that the MD5 of the gRPC-c library is unknown).
  6. Traffic between client and server should now start, and Stirling will output it (provided the http_events table is enabled).

Let me know what you want to do with this tar example; perhaps we should add it to the project's scripts directory.

Example stdout: [screenshot: Screen Shot 2022-08-01 at 14 19 48]

orishuss avatar Aug 01 '22 08:08 orishuss

Can one of the admins verify this patch?

pixie-io-buildbot avatar Aug 01 '22 08:08 pixie-io-buildbot

@yzhao1012 I addressed all current comments.

orishuss avatar Aug 07 '22 13:08 orishuss

@yzhao1012 thanks, good comments! I addressed all current comments; if you have more, please feel free to add them.

orishuss avatar Aug 16 '22 10:08 orishuss

Thanks for the quick action. I am testing this PR internally and ran into some hiccups (a few odd build failures, and a lengthy performance regression test that is still pending). I expect these to be cleared soon, and they should not be related to this PR. I'll approve soon after fixing them.

yzhao1012 avatar Aug 17 '22 17:08 yzhao1012

Patch in the following diff to fix a GCC build failure. We have a GCC build pipeline that validates the C++ code with the GCC compiler (in addition to the default Clang compiler).

diff --git a/src/stirling/source_connectors/socket_tracer/uprobe_manager.cc b/src/stirling/source_connectors/socket_tracer/uprobe_manager.cc
index aeffedbda..297344a7e 100644
--- a/src/stirling/source_connectors/socket_tracer/uprobe_manager.cc
+++ b/src/stirling/source_connectors/socket_tracer/uprobe_manager.cc
@@ -604,11 +604,8 @@ bool UProbeManager::InitiateGrpcCPercpuMetadataHeap() {
 bool UProbeManager::InitiateGrpcCPercpuEventDataHeap() {
   struct grpc_c_data_slice_t empty_slice = {.length = 0, .bytes = {0}};
   auto array = bcc_->GetPerCPUArrayTable<struct grpc_c_event_data_t>(kGrpcCEventDataHeapName);
-  struct grpc_c_event_data_t empty_value = {.stream_id = 0,
-                                            .timestamp = 0,
-                                            .direction = kEgress,
-                                            .position_in_stream = 0,
-                                            .slice = empty_slice};
+  struct grpc_c_event_data_t empty_value = {};
+  empty_value.slice = empty_slice;
   std::vector<struct grpc_c_event_data_t> empty_values(bpf_tools::BCCWrapper::kCPUCount,
                                                        empty_value);
   auto update_result = array.update_value(0, empty_values);
@@ -624,8 +621,10 @@ bool UProbeManager::InitiateGrpcCPercpuHeaderEventDataHeap() {
   struct grpc_c_metadata_item_t empty_metadata_item = {.key = {0}, .value = {0}};
   auto array =
       bcc_->GetPerCPUArrayTable<struct grpc_c_header_event_data_t>(kGrpcCHeaderEventDataHeapName);
-  struct grpc_c_header_event_data_t empty_value = {
-      .stream_id = 0, .timestamp = 0, .direction = kEgress, .header = empty_metadata_item};
+  struct grpc_c_header_event_data_t empty_value = {};
+  empty_value.direction = kEgress;
+  empty_value.header = empty_metadata_item;
+
   std::vector<struct grpc_c_header_event_data_t> empty_values(bpf_tools::BCCWrapper::kCPUCount,
                                                               empty_value);
   auto update_result = array.update_value(0, empty_values);
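
The pattern used in this diff can be shown in isolation. The sketch below is only an illustration: the structs are simplified stand-ins for grpc_c_data_slice_t and grpc_c_event_data_t (field types, sizes, and enum values are assumptions), and the helper name is hypothetical. The thread does not state the exact GCC error; the general background is that designated initializers only became standard in C++20, and compilers differ in which forms they accept, so value-initializing the struct and then assigning members is the more portable choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-ins for the structs in the diff; the layouts are assumptions.
enum Direction { kEgress = 0, kIngress = 1 };  // Numeric values are assumed.

struct DataSlice {
  uint32_t length;
  char bytes[16];
};

struct EventData {
  uint32_t stream_id;
  uint64_t timestamp;
  Direction direction;
  uint32_t position_in_stream;
  DataSlice slice;
};

// Builds one zeroed value per CPU, mirroring what the diff prepares before
// passing the vector to update_value().
inline std::vector<EventData> MakeEmptyPerCpuValues(std::size_t cpu_count) {
  EventData empty_value = {};       // Value-initialization zeroes every member.
  empty_value.direction = kEgress;  // Set members explicitly, without designators.
  empty_value.slice = DataSlice{};  // A zeroed slice, also built without designators.
  return std::vector<EventData>(cpu_count, empty_value);
}
```

Calling MakeEmptyPerCpuValues with the CPU count yields a vector that could be passed to a per-CPU array update, in the same way the diff passes empty_values.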

yzhao1012 avatar Aug 17 '22 19:08 yzhao1012

@yzhao1012 I addressed all the new comments except the one with the diff you added. I think you were running an old version when you posted it, since the diff touches code that no longer exists (the InitiateGrpcCPercpuMetadataHeap function has been removed).

orishuss avatar Aug 18 '22 09:08 orishuss

Thanks @yzhao1012 and @orishuss. Will pull this in soon.

oazizi000 avatar Aug 18 '22 22:08 oazizi000

@oazizi000 merged this PR in https://github.com/pixie-io/pixie/commit/cf3f844d8f6d7460517ca895243226ff804d1247. Closing.

I am working on converting the Docker images produced by the Dockerfile into native Bazel py3_image targets. After that, @orishuss can add BPF tests that test this feature end to end.

yzhao1012 avatar Aug 19 '22 21:08 yzhao1012