Replace pynids with something better
Hey, I'm interested in replacing pynids. May I ask why you need pynids? What does pynids have that scapy doesn't?
Hi cothan! Thanks for the interest. If I remember correctly, when flower was written there wasn't a nice API to reassemble TCP streams with scapy, while pynids provided exactly what we needed in only a few lines of Python. From a quick search, it seems it is now possible. Your PR is very welcome! Feel free to write to me if you need any help.
Hi @nicomazz,
Similar question here: https://stackoverflow.com/questions/2259458/how-to-reassemble-tcp-segment
I have some reasons to think it cannot easily be done in pure Python, or at least that it needs a Python binding wrapper. Some promising libraries I found that support TCP reassembly are:
- https://github.com/seladb/PcapPlusPlus
- https://github.com/mfontanini/libtins
So basically libpcap, libtins, and PcapPlusPlus are good: https://pcapplusplus.github.io/docs/benchmark
All of these solutions seem a bit complicated. From my limited understanding of the codebase, a simple `tcpflow -r capture.pcap` would do the job, and the Python code would then parse the output from tcpflow.
Looking at the examples from https://github.com/simsong/tcpflow, an output file name looks like this: `128.129.130.131.02345-010.011.012.013.45103`
We can use Python to parse it and read the IPs and ports. The high-level steps are as follows:
- Read the `report.xml` output from `tcpflow`.
- Parse `report.xml` based on `tcpflow startime`, and reorder packets based on port, IP and time. This approach keeps the conversation between `IP1:port1 <=> IP2:port2` displayed in the same box.
- Feed the result to the database.

What do you think? This requires writing to disk, but it is a straightforward way to parse the output and do TCP reassembly.
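To make the file name step concrete, here is a rough sketch of how the name above could be split into endpoints. It assumes the plain IPv4 `src.port-dst.port` form shown in the example (no extra suffixes), and `parse_flow_name` and `conversation_key` are just names I made up for illustration, not anything from tcpflow or flower:

```python
def _endpoint(text):
    # "010.011.012.013.45103" -> ("10.11.12.13", 45103)
    # The last dot-separated field is the port, the first four are the IPv4 octets.
    *octets, port = text.split(".")
    ip = ".".join(str(int(octet)) for octet in octets)  # drop zero padding
    return ip, int(port)


def parse_flow_name(name):
    # Assumes the plain "src-dst" layout from the example above.
    src, dst = name.split("-")
    return _endpoint(src), _endpoint(dst)


def conversation_key(a, b):
    # Same key for both directions, so IP1:port1 <=> IP2:port2
    # ends up grouped in one conversation.
    return tuple(sorted((a, b)))


src, dst = parse_flow_name("128.129.130.131.02345-010.011.012.013.45103")
key = conversation_key(src, dst)
```

Sorting the two endpoints is only one possible way to get a direction-independent key; anything that maps both directions to the same value would do.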
Tbh I'm not a great fan of tcpflow, for several reasons:
- First, the tests in the repo are not passing. Also, compared to the others, the codebase seems a bit neglected.
- Going through the disk adds a lot of unnecessary overhead: during a real CTF there is an incredible number of packets (potentially more than 100GB), and doing a filesystem I/O operation for each of them is too expensive, both in time and possibly in the lifespan of your SSD.
The two options you've proposed seem great! Both support IPv6, which is a requirement for the replacement, and they have ready-to-use examples:
- `PcapPlusPlus` has an example that generates the output in the `tcpflow` format. Maybe you can change that to avoid writing a file to the filesystem? The great thing here is that both IPv4 and IPv6 are supported.
- Also, `libtins` already has a pretty nice interface for what we need. It is basically the same as what we are using now in Python.
Let me know if you need any help!
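
For reference, the core of keeping reassembly in memory rather than on disk is small. This is only a toy sketch in plain Python, not the PcapPlusPlus or libtins API: `DirectionBuffer` is a hypothetical name, it handles one direction of one flow, and it deliberately ignores sequence number wraparound and partially overlapping retransmissions.

```python
class DirectionBuffer:
    """Reorders the segments of one direction of one TCP flow in memory."""

    def __init__(self, initial_seq):
        self.next_seq = initial_seq  # sequence number of the next expected byte
        self.pending = {}            # out-of-order segments, keyed by sequence number
        self.data = bytearray()      # contiguous payload reassembled so far

    def add(self, seq, payload):
        # Skip empty segments and segments that start before the next
        # expected byte (treated here as plain retransmissions).
        if not payload or seq < self.next_seq:
            return
        self.pending[seq] = payload
        # Flush every segment that is now contiguous with the stream.
        while self.next_seq in self.pending:
            chunk = self.pending.pop(self.next_seq)
            self.data += chunk
            self.next_seq += len(chunk)


buf = DirectionBuffer(initial_seq=1000)
buf.add(1005, b"world")  # arrives early, held in pending
buf.add(1000, b"hello")  # fills the gap, both segments are flushed
```

A real replacement would keep one such buffer per direction per flow and hand the data to the database when the connection closes, which is roughly what the library callbacks give you for free.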