Replace pynids with something better

Open nicomazz opened this issue 5 years ago • 4 comments

nicomazz avatar Sep 06 '20 10:09 nicomazz

Hey, I'm interested in replacing pynids. May I ask why you need pynids? What does pynids have that scapy doesn't?

cothan avatar Nov 07 '20 05:11 cothan

Hi cothan! Thanks for the interest. If I remember correctly, when flower was written there wasn't a nice API to reassemble TCP streams with scapy, while pynids provided exactly what we needed in only a few lines of Python. From a quick search, it seems it is now possible. Your PR is very welcome! Feel free to write to me if you need any help.

nicomazz avatar Nov 07 '20 12:11 nicomazz

Hi @nicomazz ,

Similar question here: https://stackoverflow.com/questions/2259458/how-to-reassemble-tcp-segment

I have some reasons to think that it cannot easily be done in pure Python, or at least that it needs a Python binding wrapper. Some promising libraries I found that support TCP reassembly are:

  1. https://github.com/seladb/PcapPlusPlus
  2. https://github.com/mfontanini/libtins

So basically libpcap, libtins, and PcapPlusPlus are good: https://pcapplusplus.github.io/docs/benchmark

All of these solutions seem a bit complicated. From my limited understanding of the code base, a simple tcpflow -r capture.pcap will do the job, and the Python code will then parse the output from tcpflow. Looking at some examples from https://github.com/simsong/tcpflow, the file name looks like: 128.129.130.131.02345-010.011.012.013.45103. We use Python to parse it and read the IPs and ports. The high-level steps are as follows:

  1. Read the report.xml output from tcpflow.
  2. Parse report.xml based on the tcpflow startime, and reorder packets based on port, IP and time. This approach keeps the conversation between IP1:port1 <=> IP2:port2 displayed in the same box.
  3. Feed it to the database.

What do you think? This will require writing to disk, but it is a straightforward way to parse and do TCP reassembly.
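To make the file name parsing concrete, here is a minimal sketch in Python. It assumes the plain IPv4 form shown above (zero-padded octets, a five-digit port, and a dash between the two endpoints) with no extra suffixes. The names FlowEndpoints, parse_tcpflow_name and conversation_key are hypothetical helpers for illustration, not part of tcpflow or flower.

```python
import ipaddress
from typing import NamedTuple, Tuple


class FlowEndpoints(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def parse_tcpflow_name(name: str) -> FlowEndpoints:
    """Split 'A.B.C.D.PORT-A.B.C.D.PORT' into its endpoints (IPv4 only)."""
    src, dst = name.split("-")

    def split_endpoint(part: str) -> Tuple[str, int]:
        # The last dot-separated field is the port, the rest is the address.
        ip_text, port_text = part.rsplit(".", 1)
        # Octets are zero-padded in the file name; strip the padding.
        ip = ".".join(str(int(octet)) for octet in ip_text.split("."))
        ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
        return ip, int(port_text)

    src_ip, src_port = split_endpoint(src)
    dst_ip, dst_port = split_endpoint(dst)
    return FlowEndpoints(src_ip, src_port, dst_ip, dst_port)


def conversation_key(flow: FlowEndpoints):
    """Direction-independent key, so both halves of a connection group together."""
    a = (flow.src_ip, flow.src_port)
    b = (flow.dst_ip, flow.dst_port)
    return (a, b) if a <= b else (b, a)
```

Grouping the per-direction files by conversation_key would keep IP1:port1 <=> IP2:port2 in the same box, whichever side sent the data.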

cothan avatar Nov 08 '20 03:11 cothan

Tbh I'm not a great fan of tcpflow for several reasons:

  1. First, the tests in the repo are not passing. Also, compared to the others, the codebase seems a bit neglected.
  2. Going through the disk seems to add a lot of useless overhead: during a real CTF there is an incredible number of packets (potentially more than 100GB of traffic), and doing a filesystem I/O operation for each of them is too expensive, both in the time needed and possibly in the lifespan of your SSD.

The two options you've proposed seem great! Both support IPv6, which is a requirement for the replacement, and they have ready-to-use examples:

  1. PcapPlusPlus has an example that generates output in the tcpflow format. Maybe you can change that to avoid writing files to the filesystem? The great thing here is that both IPv4 and IPv6 are supported.
  2. libtins also already has a pretty nice interface for what we need. It is basically the same as what we are using now in Python.
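As a rough idea of what keeping everything in memory means, here is a small pure-Python sketch of a per-direction stream buffer. It is only an illustration, not the PcapPlusPlus or libtins API: StreamBuffer is a hypothetical name, it assumes the initial sequence number is already known, and it ignores 32-bit sequence number wraparound.

```python
class StreamBuffer:
    """Collects TCP payloads for one direction of one connection, in memory."""

    def __init__(self, initial_seq: int):
        self.next_seq = initial_seq  # sequence number of the next expected byte
        self.pending = {}            # seq -> payload that has not been consumed yet
        self.data = bytearray()      # reassembled bytes so far

    def add(self, seq: int, payload: bytes) -> None:
        if not payload:
            return
        # Keep the first copy if the same segment is seen twice while pending.
        self.pending.setdefault(seq, payload)
        # Drain every segment that is now contiguous with the stream.
        while True:
            ready = sorted(s for s in self.pending if s <= self.next_seq)
            if not ready:
                break
            for s in ready:
                chunk = self.pending.pop(s)
                skip = self.next_seq - s  # bytes already seen (retransmission or overlap)
                if skip < len(chunk):
                    self.data += chunk[skip:]
                    self.next_seq = s + len(chunk)


# Example: the second segment arrives before the first one.
buf = StreamBuffer(initial_seq=1000)
buf.add(1005, b"world")
buf.add(1000, b"hello")
```

One such buffer per direction, stored in a dict keyed by (source IP, source port, destination IP, destination port), would let the reassembled data go straight to the database without touching the filesystem.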

Let me know if you need any help!

nicomazz avatar Nov 08 '20 10:11 nicomazz