Dramatically optimize the algorithm in the common case by excluding matching heads and tails before using LCS. For example, in the case of a single insert, the algorithm goes from O(m*n) to O(m+n). With arrays of 1,000 entries, this change reduces the number of comparisons from ~1,000,000 to ~2,000 and the size of the table used by the algorithm from ~1,000,000 entries to 2.
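A minimal sketch of the trimming idea (a hypothetical standalone helper, not the code in this change): count the matching head and tail up front, and only the mismatched middle slices would be handed to the O(m*n) LCS step.

func matchingHeadAndTailCounts<T: Equatable>(_ lhs: [T], _ rhs: [T]) -> (head: Int, tail: Int) {
    let minTotalCount = min(lhs.count, rhs.count)
    // Count equal elements from the front.
    let head = zip(lhs, rhs).prefix { $0.0 == $0.1 }.count
    // Count equal elements from the back, without overlapping the head.
    let maxPossibleTail = minTotalCount - head
    let tail = zip(lhs.reversed(), rhs.reversed())
        .prefix(maxPossibleTail)
        .prefix { $0.0 == $0.1 }
        .count
    return (head, tail)
}

For a single insert such as [1, 2, 3, 4] vs. [1, 2, 9, 3, 4], this gives head = 2 and tail = 2, leaving a 0 x 1 region for the LCS table instead of 4 x 5.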
I haven't had a chance to actually measure the perf difference, but you could replace
let matchingHeadCount = zip(lhs, rhs).prefix { $0.0 == $0.1 }.map { $0.0 }.count
let matchingTailCount = matchingHeadCount == minTotalCount
    ? 0 // if the matching head consumed all of either of the arrays, there's no tail
    : zip(lhs.reversed(), rhs.reversed()).prefix(minTotalCount - matchingHeadCount).prefix { $0.0 == $0.1 }.reversed().map { $0.0 }.count
with something like:
let (matchingHeadCount, matchingTailCount) = lhs.withUnsafeBufferPointer { unsafeLHS in
    return rhs.withUnsafeBufferPointer { unsafeRHS -> (Int, Int) in
        // Walk forward until the first mismatch to find the matching head.
        var mhc = minTotalCount
        for i in 0..<minTotalCount {
            if unsafeLHS[i] != unsafeRHS[i] {
                mhc = i
                break
            }
        }
        let maxPossibleTail = minTotalCount - mhc
        if maxPossibleTail < 1 {
            return (mhc, 0) // the head consumed all of the shorter array
        }
        // Walk backward from both ends to find the matching tail.
        var mtc = maxPossibleTail
        for i in 0..<maxPossibleTail {
            if unsafeLHS[unsafeLHS.endIndex - i - 1] != unsafeRHS[unsafeRHS.endIndex - i - 1] {
                mtc = i
                break
            }
        }
        return (mhc, mtc)
    }
}
if it makes a perf difference.
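A throwaway script along these lines would be enough to check (a rough sketch, not the actual test mentioned below; it uses the single-insert case from the commit message and Foundation's Date for timing, and the buffer-pointer version would be timed the same way):

import Foundation

// Build the single-insert case: 1,000 matching entries with one extra element in the middle.
let lhs = Array(0..<1_000)
var rhs = lhs
rhs.insert(-1, at: 500)
let minTotalCount = min(lhs.count, rhs.count)

let start = Date()
for _ in 0..<1_000 {
    let matchingHeadCount = zip(lhs, rhs).prefix { $0.0 == $0.1 }.count
    let matchingTailCount = matchingHeadCount == minTotalCount
        ? 0
        : zip(lhs.reversed(), rhs.reversed())
            .prefix(minTotalCount - matchingHeadCount)
            .prefix { $0.0 == $0.1 }
            .count
    precondition(matchingHeadCount == 500 && matchingTailCount == 500)
}
print("sequence-oriented:", Date().timeIntervalSince(start))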
So I did the perf test. After I added the proper .lazy calls to the sequences, the perf difference between the sequence-oriented approach and the unsafeBufferPointer approach is pretty small, and the sequence approach is way more readable/concise.
I still don't have a good instinct for which operations drop you out of the lazy domain.
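For what it's worth, a rough illustration of where that boundary seems to sit, based on which overloads Sequence and LazySequenceProtocol provide (worth double-checking against the stdlib docs):

let lhs = [1, 2, 3, 4]
let rhs = [1, 2, 9, 4]

// Eager: the plain Sequence.prefix(while:) returns [Element], so the head
// pairs are copied into a temporary Array before count ever runs.
let eagerHead: [(Int, Int)] = zip(lhs, rhs).prefix { $0.0 == $0.1 }
let eagerHeadCount = eagerHead.count

// Lazy: the LazySequenceProtocol overload returns a wrapper instead; nothing
// is stored, and pairs are only compared as something consumes them.
let lazyHead = zip(lhs, rhs).lazy.prefix { $0.0 == $0.1 }
let lazyHeadCount = lazyHead.reduce(0) { count, _ in count + 1 }

// But reversed() on a lazy *sequence* (zip isn't a BidirectionalCollection)
// still has to return a plain Array, which drops the chain back to eager.
let eagerAgain: [(Int, Int)] = zip(lhs, rhs).lazy.prefix { $0.0 == $0.1 }.reversed()

In other words, the plain Sequence overloads of prefix(while:) and map, and reversed() on anything that isn't a BidirectionalCollection, all hand back an Array and end the lazy part of the chain.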