Adding support of SFlow drop packets

Open akarneliuk opened this issue 1 year ago • 12 comments

Use Case

Hey team,

I'm evaluating the use of sFlow to collect data from internet devices, where BGP information is crucial. sFlow v5 supports this information per its specification: https://sflow.org/SFLOW-STRUCTS5.txt

/* Extended Gateway Data */
/* opaque = flow_data; enterprise = 0; format = 1003 */

struct extended_gateway {
   next_hop nexthop;           /* Address of the border router that should
                                  be used for the destination network */
   unsigned int as;            /* Autonomous system number of router */
   unsigned int src_as;        /* Autonomous system number of source */
   unsigned int src_peer_as;   /* Autonomous system number of source peer */
   as_path_type dst_as_path<>; /* Autonomous system path to the destination */
   unsigned int communities<>; /* Communities associated with this route */
   unsigned int localpref;     /* LocalPref associated with this route */
}

However, this information is missing from the inputs.sflow plugin.

Expected behavior

Telegraf parses struct extended_gateway and this information is available within tags along with already parsed structs extended_router and extended_switch. SFlow plugin configuration

Actual behavior

This struct is currently not parsed and therefore information isn't available.

Additional info

No response

akarneliuk avatar May 17 '24 20:05 akarneliuk

Hi,

Can you please look at using the inputs.netflow plugin instead. It does look to already support the extended gateway.

Thanks

powersj avatar May 17 '24 20:05 powersj

Hey @powersj ,

Thanks for the prompt response. I did look into that plug-in, but sadly I was getting errors for every packet I received; hence, I switched to the sflow plug-in, which worked quite nicely apart from missing this struct.

What is the long-term goal at InfluxData? Are you supporting both plug-ins, or are you going to sunset one in favour of the other?

Thanks, Anton

akarneliuk avatar May 17 '24 20:05 akarneliuk

sflow is deprecated in favor of netflow. If you are getting errors please do let us know and we can take a look.

powersj avatar May 17 '24 20:05 powersj

Hey @powersj ,

Understood. So, here is my issue with the netflow plug-in.

Telegraf version: 1.30.2

Configuration

[[inputs.netflow]]
  service_address = "udp://:6343"
  protocol = "sflow v5"

When I send sFlow data to Telegraf, I see the following errors in the Telegraf log:

2024-05-18T09:08:13Z E! [inputs.netflow] Error in plugin: sFlow sample [[format:5 seq: 71837279] unknown format 5]; raw data ...

Thanks, Anton

akarneliuk avatar May 18 '24 09:05 akarneliuk

format:5

Looking at the upstream goflow2 code, format 5 is not defined:

FORMAT_RAW_PKT = 1
FORMAT_ETH = 2
FORMAT_IPV4 = 3
FORMAT_IPV6 = 4

From sflow.go. Is this some sort of extended data? @srebhan thoughts?

powersj avatar May 20 '24 13:05 powersj

@powersj will look into it...

@akarneliuk could you please post the data after the raw data part of the log message so I can reproduce the issue locally!?

srebhan avatar May 21 '24 09:05 srebhan

Hey @srebhan ,

Thanks for looking into that one. I'm looking into how I can anonymise the packet for compliance reasons. What I was able to detect by digging into the pcap with sflowtool is that the packets causing problems are drop notifications: https://sflow.org/sflow_drops.txt

Which I believe raises an interesting difference in behaviour between the sflow and netflow plugins in Telegraf: the sflow plugin ignores things it cannot decode and decodes the rest, while the netflow plugin throws an error. Perhaps the latter is preferable, though it would be nice to have a way (a flag or similar) to ignore such problems.

Going back to the original issue, do you think you could look into implementing drop notifications? Also, could you DM me your email address so I can share an anonymised pcap?

Best, Anton

akarneliuk avatar May 21 '24 18:05 akarneliuk

@akarneliuk it would be nice if you could post an anonymized dump of such a packet in this issue so I can create a unit test from it. Alternatively, you can drop the non-redacted dump into a personal message on Slack (@ Sven Rebhan)...

srebhan avatar May 22 '24 07:05 srebhan

@akarneliuk please test the binary in PR #15396, available as soon as CI has finished the tests, and let me know if this works for you!

srebhan avatar May 23 '24 14:05 srebhan

Hey @srebhan, I have tested it and it worked nicely for me! Thank you so much for doing that so quickly. I have another request for sflow as well, but I will open a separate issue for that.

akarneliuk avatar May 28 '24 11:05 akarneliuk

@akarneliuk please be aware that merging my required changes upstream (into the goflow2 library) might take some time, so do not expect this feature to land in v1.31.0 yet!

srebhan avatar May 31 '24 08:05 srebhan

@akarneliuk rebased the PR against the latest master to include additional fields for extended-gateway etc...

srebhan avatar Jun 25 '24 19:06 srebhan

@akarneliuk please test the binary in PR #15396 again after CI has finished the build. Upstream has merged the required PRs and I would like to make sure that everything still works as intended.

srebhan avatar Jul 18 '24 14:07 srebhan