Pixie Stirling occasionally segfaults on GKE

I'm running Stirling in standalone mode, deployed on GKE, and it occasionally segfaults. I have only seen this on nodes that use the default Container-Optimized OS image that GKE deploys. My other GKE cluster, which uses an Ubuntu-based node image, does not seem to have this issue.

It might be related to this issue that I filed a few weeks back: https://github.com/pixie-io/pixie/issues/314

In my cluster, I enabled only Cassandra and PostgreSQL tracing (stirling_enable_cass_tracing and stirling_enable_pgsql_tracing set to true).

The Postgres workload that I run is similar to the one I posted in #314. For the Cassandra workload, I installed the Temporal Helm chart, which internally installs a Cassandra cluster: https://github.com/temporalio/helm-charts

I am unable to symbolize the backtrace, but the unresolved frames seem to point into a dynamically linked library.

E1008 16:52:52.605655 402931 signal_action.cc:63] Caught Segmentation fault, suspect faulting address 0x249300. Trace:

PC: @ 0x7fde97440f8b (unknown) (unknown)

PC: @ 0x5641abf (unknown) threadstacks::StackTraceCollector::Collect()
    @ 0x561095e (unknown) px::SignalAction::SigHandler()
    @ 0x7fdebec2b110 (unknown) __restore_rt
    @ 0x7fde975d9225 (unknown) (unknown)
    @ 0x7fde974aad71 (unknown) (unknown)
    @ 0x7fde97440f8b (unknown) (unknown)
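
One way to narrow down which shared library the unsymbolized frames belong to is to look up the reported PC in /proc/<pid>/maps of the running process (captured before the crash, since the addresses change on restart). Below is a minimal C++17 sketch of that lookup. FindMapping, FileOffset, and any maps text fed to it are hypothetical and not part of Stirling; the sketch also assumes well-formed input and paths without spaces.

```cpp
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>

// One line of /proc/<pid>/maps, reduced to the fields needed here.
struct Mapping {
  uint64_t start = 0;
  uint64_t end = 0;
  uint64_t file_offset = 0;
  std::string path;  // Empty for anonymous mappings.
};

// Scans text in /proc/<pid>/maps format and returns the mapping that
// contains addr, if any. Hypothetical helper, not a Pixie API.
std::optional<Mapping> FindMapping(const std::string& maps_text, uint64_t addr) {
  std::istringstream lines(maps_text);
  std::string line;
  while (std::getline(lines, line)) {
    std::istringstream fields(line);
    std::string range, perms, offset, dev, inode;
    if (!(fields >> range >> perms >> offset >> dev >> inode)) continue;
    const size_t dash = range.find('-');
    if (dash == std::string::npos) continue;
    Mapping m;
    m.start = std::stoull(range.substr(0, dash), nullptr, 16);
    m.end = std::stoull(range.substr(dash + 1), nullptr, 16);
    m.file_offset = std::stoull(offset, nullptr, 16);
    fields >> m.path;  // Leaves path empty when the mapping has no file.
    if (addr >= m.start && addr < m.end) return m;
  }
  return std::nullopt;
}

// Offset of addr within the mapped file.
uint64_t FileOffset(const Mapping& m, uint64_t addr) {
  return addr - m.start + m.file_offset;
}
```

If the lookup returns a path, that is the library to inspect; the offset is a starting point for a symbolizer, provided the library ships symbols.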

Oct 25 '21 17:10 harold-kfuse

It looks like you are running stirling_wrapper? Could you confirm exactly how you ran it (YAML file, exact kubectl command, etc.), ideally in enough detail that someone else can reproduce it without guesswork?

Also, could you provide the full crash log? What you provided does not match what we usually see from a stirling_wrapper crash, which usually includes hundreds of lines of stack traces (from multiple threads).

Oct 25 '21 17:10 yzhao1012

It is based on stirling_wrapper, but more stripped down. It registers a table callback with Stirling and logs the records.

E1025 05:31:15.667604 2956193 signal_action.cc:63] Caught Segmentation fault, suspect faulting address 0x249300. Trace:

PC: @ 0x7f4327411f8b (unknown) (unknown)

Threads: 275947
Stack trace:
PC: @ 0x5ca0ae1 (unknown) master_thread
    @ 0x7f434ebf0f27 (unknown) start_thread
    @ 0x7f434eab831f (unknown) __clone

Threads: 275945, 275946
Stack trace:
PC: @ 0x5ca1c94 (unknown) worker_thread_run
    @ 0x5ca0a61 (unknown) worker_thread
    @ 0x7f434ebf0f27 (unknown) start_thread
    @ 0x7f434eab831f (unknown) __clone

Threads: 2956193
Stack trace:
PC: @ 0x5ced88f (unknown) threadstacks::StackTraceCollector::Collect()
    @ 0x5cbb1ee (unknown) px::SignalAction::SigHandler()
    @ 0x7f434ebfc110 (unknown) __restore_rt
    @ 0x7f43275aa225 (unknown) (unknown)
    @ 0x7f432747bd71 (unknown) (unknown)
    @ 0x7f4327411f8b (unknown) (unknown)

Threads: 273010
Stack trace:
PC: @ 0x1a1fe1d (unknown) uv_run
    @ 0x1a0f4f9 (unknown) px::event::LibuvScheduler::Run()
    @ 0x1a10339 (unknown) px::event::LibuvDispatcher::Run()
    @ 0x19aef0c (unknown) Binary::Run()
    @ 0x19a4066 (unknown) main
    @ 0x7f434e9e1e0b (unknown) __libc_start_main
    @ 0x19a3eae (unknown) _start

Threads: 273043
Stack trace:
PC: @ 0x5cf6f72 (unknown) std::__invoke_impl<>()
    @ 0x5cf6ec2 (unknown) std::__invoke<>()
    @ 0x5cf6e85 (unknown) std::thread::_Invoker<>::_M_invoke<>()
    @ 0x5cf6e35 (unknown) std::thread::_Invoker<>::operator()()
    @ 0x5cf6cfe (unknown) std::thread::_State_impl<>::_M_run()
    @ 0x60706c4 (unknown) execute_native_thread_routine
    @ 0x7f434ebf0f27 (unknown) start_thread
    @ 0x7f434eab831f (unknown) __clone

Threads: 275955
Stack trace:
PC: @ 0x1f4d742 (unknown) px::stirling::(anonymous namespace)::SleepForDuration()
    @ 0x1f4d125 (unknown) px::stirling::StirlingImpl::RunCore()
    @ 0x1fadb07 (unknown) std::__invoke_impl<>()
    @ 0x1fada12 (unknown) std::__invoke<>()
    @ 0x1fad9d5 (unknown) std::thread::_Invoker<>::_M_invoke<>()
    @ 0x1fad985 (unknown) std::thread::_Invoker<>::operator()()
    @ 0x1fad86e (unknown) std::thread::_State_impl<>::_M_run()
    @ 0x60706c4 (unknown) execute_native_thread_routine
    @ 0x7f434ebf0f27 (unknown) start_thread
    @ 0x7f434eab831f (unknown) __clone

Oct 25 '21 17:10 harold-kfuse

I'm using the following code to create Stirling in my binary:

Binary::Binary()
    : stirling_(px::stirling::Stirling::Create(
          px::stirling::CreateSourceRegistry(
              px::stirling::GetSourceNamesForGroup(
                  px::stirling::SourceConnectorGroup::kTracers))
              .ConsumeValueOrDie())),
      time_system_(std::make_unique<px::event::RealTimeSystem>()),
      api_(std::make_unique<px::event::APIImpl>(time_system_.get())),
      dispatcher_(api_->AllocateDispatcher("binary")) {
 
  px::stirling::stirlingpb::Publish publication;
  stirling_->GetPublishProto(&publication);
  px::stirling::IndexPublication(publication, &table_info_map_);
  stirling_->RegisterDataPushCallback(
      std::bind(&Binary::Callback, this,
                std::placeholders::_1,
                std::placeholders::_2,
                std::placeholders::_3));
}

void Binary::Run() {
  PL_EXIT_IF_ERROR(stirling_->RunAsThread());
  dispatcher_->Run(px::event::Dispatcher::RunType::Block);
}

px::Status Binary::Callback(uint64_t table_id, px::types::TabletID /* tablet_id */,
    std::unique_ptr<px::types::ColumnWrapperRecordBatch> record_batch) {
  auto iter = table_info_map_.find(table_id);
  if (iter == table_info_map_.end()) {
    return px::error::Internal("Encountered unknown table id $0", table_id);
  }
  const px::stirling::stirlingpb::InfoClass& table_info = iter->second;
  LOG(INFO) << px::stirling::ToString(table_info.schema().name(),
                                      table_info.schema(), *record_batch);
  // Every path of a function returning px::Status must return a value.
  return px::Status::OK();
}

main:

class FatalErrorHandler : public px::FatalErrorHandlerInterface {
 public:
  FatalErrorHandler() = default;
  void OnFatalError() const override {
    // Stack trace will print automatically; any additional state dumps can be done here.
    // Note that actions here must be async-signal-safe and must not allocate memory.
  }
};

int main(int argc, char** argv) {
  px::EnvironmentGuard env_guard(&argc, argv);

  FatalErrorHandler err_handler;
  // This covers signals such as SIGSEGV and other fatal errors.
  // We print the stack trace and die.
  auto signal_action = std::make_unique<px::SignalAction>();
  signal_action->RegisterFatalErrorHandler(err_handler);

  Binary binary;
  binary.Run();
}

Oct 25 '21 17:10 harold-kfuse

To deploy it, I created a DaemonSet YAML based on pem_daemonset.yaml, using my own image.

Oct 25 '21 17:10 harold-kfuse

@yzhao1012 @oazizi000

I've created a container image of a binary that just registers a callback with Stirling and logs the records, and uploaded it to Docker Hub: index.docker.io/haroldkf/simple_binary:0.1.0-142d720

This is the DaemonSet YAML I used to deploy the binary on my GKE cluster:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: simple-binary
  namespace: simple
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: simple-binary
      app.kubernetes.io/name: simple-binary
  template:
    metadata:
      labels:
        app.kubernetes.io/instance: simple-binary
        app.kubernetes.io/name: simple-binary
    spec:
      containers:
      - args:
        - --stirling_enable_cass_tracing=true
        - --stirling_enable_dns_tracing=false
        - --stirling_enable_http_tracing=true
        - --stirling_enable_http2_tracing=true
        - --stirling_enable_kafka_tracing=true
        - --stirling_enable_mysql_tracing=false
        - --stirling_enable_nats_tracing=false
        - --stirling_enable_pgsql_tracing=true
        - --stirling_enable_redis_tracing=false
        env:
        - name: TCMALLOC_SAMPLE_PARAMETER
          value: "1048576"
        - name: PL_HOST_PATH
          value: /host
        image: index.docker.io/haroldkf/simple_binary:0.1.0-142d720
        imagePullPolicy: Always
        name: binary
        resources: {}
        securityContext:
          capabilities:
            add:
            - SYS_PTRACE
            - SYS_ADMIN
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /host
          name: host-root
          readOnly: true
        - mountPath: /sys
          name: sys
          readOnly: true
      dnsPolicy: ClusterFirstWithHostNet
      hostNetwork: true
      hostPID: true
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 10
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
      - effect: NoExecute
        operator: Exists
      - effect: NoSchedule
        operator: Exists
      volumes:
      - hostPath:
          path: /
          type: Directory
        name: host-root
      - hostPath:
          path: /sys
          type: Directory
        name: sys
  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 20
    type: RollingUpdate

I've also attached the source and BUILD file of the binary. source.tar.gz

On my 3-node GKE cluster, one pod currently shows 1 restart:

kubectl get pods -n simple
NAME                  READY   STATUS    RESTARTS   AGE
simple-binary-4jtxv   1/1     Running   0          7h3m
simple-binary-9wq68   1/1     Running   0          7h3m
simple-binary-xsnft   1/1     Running   1          7h3m

This is the previous log of the pod that restarted:

kubectl logs -n simple simple-binary-xsnft -p

...
E1121 19:48:33.400875 632218 signal_action.cc:63] Caught Segmentation fault, suspect faulting address 0x249300. Trace:
**************************
PC: @     0x7f113ad70f8b  (unknown)  (unknown)
**************************
Threads: 632218
Stack trace:
PC: @          0x4ed25bf  (unknown)  threadstacks::StackTraceCollector::Collect()
    @          0x4ea157e  (unknown)  px::SignalAction::SigHandler()
    @     0x7f116000a110  (unknown)  __restore_rt
    @     0x7f113af09225  (unknown)  (unknown)
    @     0x7f113addad71  (unknown)  (unknown)
    @     0x7f113ad70f8b  (unknown)  (unknown)

Threads: 607945
Stack trace:
PC: @          0x14d93f8  (unknown)  uv_run
    @          0x14cc429  (unknown)  px::event::LibuvScheduler::Run()
    @          0x14cd269  (unknown)  px::event::LibuvDispatcher::Run()
    @          0x14c0eb2  (unknown)  binary::Binary::Run()
    @          0x14bce75  (unknown)  main
    @     0x7f115fdefe0b  (unknown)  __libc_start_main
    @          0x14bccee  (unknown)  _start

Threads: 607961
Stack trace:
PC: @          0x4edbd22  (unknown)  std::__invoke_impl<>()
    @          0x4edbc72  (unknown)  std::__invoke<>()
    @          0x4edbc35  (unknown)  std::thread::_Invoker<>::_M_invoke<>()
    @          0x4edbbe5  (unknown)  std::thread::_Invoker<>::operator()()
    @          0x4edbaae  (unknown)  std::thread::_State_impl<>::_M_run()
    @          0x526a504  (unknown)  execute_native_thread_routine
    @     0x7f115fffef27  (unknown)  start_thread
    @     0x7f115fec631f  (unknown)  __clone

Threads: 608399
Stack trace:
PC: @          0x150d992  (unknown)  px::stirling::(anonymous namespace)::SleepForDuration()
    @          0x150d375  (unknown)  px::stirling::StirlingImpl::RunCore()
    @          0x1577937  (unknown)  std::__invoke_impl<>()
    @          0x1577842  (unknown)  std::__invoke<>()
    @          0x1577805  (unknown)  std::thread::_Invoker<>::_M_invoke<>()
    @          0x15777b5  (unknown)  std::thread::_Invoker<>::operator()()
    @          0x157769e  (unknown)  std::thread::_State_impl<>::_M_run()
    @          0x526a504  (unknown)  execute_native_thread_routine
    @     0x7f115fffef27  (unknown)  start_thread
    @     0x7f115fec631f  (unknown)  __clone
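
The absolute addresses of the unsymbolized frames differ between the three crash logs in this thread, which is expected if the library is loaded at a different base each time. What can be compared is the distance between those frames. The sketch below copies the reported PC and the two other unsymbolized frames from each log and checks that the distances match; the type and function names are hypothetical. Matching distances and page offsets suggest, but do not prove, that all three crashes hit the same code in the same library build.

```cpp
#include <array>
#include <cstdint>

// {reported PC, second unsymbolized frame, third unsymbolized frame},
// copied from the crashing thread in each log (glog date prefix in the name).
using Frames = std::array<uint64_t, 3>;

const Frames kCrash1008 = {0x7fde97440f8b, 0x7fde974aad71, 0x7fde975d9225};
const Frames kCrash1025 = {0x7f4327411f8b, 0x7f432747bd71, 0x7f43275aa225};
const Frames kCrash1121 = {0x7f113ad70f8b, 0x7f113addad71, 0x7f113af09225};

// Distances from the reported PC to the other two frames. These do not
// depend on where the library happened to be loaded.
std::array<uint64_t, 2> FrameDeltas(const Frames& f) {
  return {f[1] - f[0], f[2] - f[0]};
}

// True when two crashes have the same relative frame layout.
bool SameRelativeLayout(const Frames& a, const Frames& b) {
  return FrameDeltas(a) == FrameDeltas(b);
}

// Offset within a 4 KiB page, which survives page-aligned relocation.
uint64_t PageOffset(uint64_t addr) { return addr & 0xfff; }
```

With the addresses above, SameRelativeLayout holds for every pair of crashes and the reported PC always has page offset 0xf8b, so a single library-relative location is a reasonable thing to look for once the library is identified.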

Nov 22 '21 02:11 harold-kfuse

@harold-kfuse @zasgar I want to build a standalone version of Stirling and deploy it on Ubuntu-based local systems. These systems are not on Kubernetes, and we specifically want HTTP/HTTP2/Kafka tracing.

Is it possible to do this? If so, how should I build the binary?

Jan 26 '23 17:01 amit2103