featherduster
Add automatic crib identification
It would be amazingly useful to let people identify common substrings in some corpus of plaintexts. Cryptanalib's frequency section already has a list of common English words, but that list is pulled from publicly available data, whereas our character and multigraph frequency data was calculated from Charles Dickens' A Tale of Two Cities.
It should be possible to automatically recognize cribs in some provided data, and this should boil down to the longest repeated substring problem.
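For a sense of what this could look like, here is a rough sketch, not a proposed implementation. The function names `longest_repeated_substring` and `candidate_cribs` are made up for illustration and are not part of Cryptanalib. The first uses a naive sorted-suffix comparison rather than a real suffix tree or suffix array, so it is only suitable for small inputs. The second swaps in a simpler technique, counting fixed-length substrings across samples, on the assumption that a crib is something that shows up in several plaintexts.

```python
from collections import Counter


def longest_repeated_substring(text):
    """Return the longest substring that occurs at least twice in text.

    Naive approach: sort all suffixes, then compare neighbours. Overlapping
    occurrences are allowed. Roughly O(n^2 log n), fine for small inputs only.
    """
    suffixes = sorted(text[i:] for i in range(len(text)))
    best = ""
    for a, b in zip(suffixes, suffixes[1:]):
        # Length of the common prefix of two adjacent suffixes.
        n = 0
        limit = min(len(a), len(b))
        while n < limit and a[n] == b[n]:
            n += 1
        if n > len(best):
            best = a[:n]
    return best


def candidate_cribs(samples, min_len=4, max_len=12, top=10):
    """Rank substrings by how many samples contain them (hypothetical helper).

    Each substring is counted at most once per sample, so the count is the
    number of samples it appears in. Substrings seen in only one sample are
    dropped.
    """
    counts = Counter()
    for sample in samples:
        seen = set()
        for length in range(min_len, max_len + 1):
            for i in range(len(sample) - length + 1):
                seen.add(sample[i:i + length])
        counts.update(seen)
    return [(s, c) for s, c in counts.most_common(top) if c > 1]
```

The per-sample counting here uses memory proportional to the number of distinct substrings, which gets expensive quickly on a large corpus. A generalized suffix tree or suffix array would scale better, at the cost of a lot more code.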
Okay, yeah, it looks like this is a complete pain in the ass, actually.