Add automatic crib identification

Open unicornsasfuel opened this issue 9 years ago • 1 comment

It would be amazingly useful to let people identify common substrings in a corpus of plaintexts. The frequency section of Cryptanalib already has a list of common English words, but that list is pulled from publicly available data, unlike our character and multigraph frequency data, which we calculated from Charles Dickens' A Tale of Two Cities.

It should be possible to automatically recognize cribs in some provided data, and this should boil down to the Longest repeated substring problem.
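
To make the idea concrete, here is a rough sketch of a naive longest-repeated-substring search using sorted suffixes. The function name `longest_repeated_substring` is hypothetical and not part of Cryptanalib. This version copies suffixes while sorting, so it is only suitable for small inputs; a real implementation would probably want a proper suffix array or suffix tree.

```python
def longest_repeated_substring(text: str) -> str:
    """Return the longest substring that occurs at least twice in text.

    Hypothetical helper for illustration; occurrences may overlap.
    """
    # Sort suffix start positions lexicographically by the suffix they begin.
    # Any repeated substring is a common prefix of two adjacent sorted suffixes.
    suffixes = sorted(range(len(text)), key=lambda i: text[i:])

    best = ""
    for a, b in zip(suffixes, suffixes[1:]):
        # Measure the common prefix of the two neighbouring suffixes.
        limit = len(text) - max(a, b)
        n = 0
        while n < limit and text[a + n] == text[b + n]:
            n += 1
        if n > len(best):
            best = text[a:a + n]
    return best


if __name__ == "__main__":
    print(repr(longest_repeated_substring("the cat sat on the mat")))
```

To search a whole corpus rather than a single string, one option (again an assumption, not an existing feature) would be to join the plaintexts with a separator that never appears in them and run the same search over the result.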

Nov 02 '16 18:11 unicornsasfuel

Okay, yeah, it looks like this is a complete pain in the ass, actually.

Nov 02 '16 19:11 unicornsasfuel