Fixed side effect from invocation of cleaner in unfluff.lazy
I was sure I had checked this for #16, but it seems I missed it.
The cleaner mutates the original doc object, so doc needs to be recalculated. As it stands, once the cleaner has been applied, every later call on the same object suffers from that side effect. Consider the following example:
[fs, unfluff] = ['fs', 'unfluff'].map require
html = fs.readFileSync('test_tags_kexp.html', 'utf8')
doc1 = unfluff.lazy html
doc2 = unfluff.lazy html
console.log 'tags1: ', doc1.tags() # ['Dennis Morton', 'film', 'kusp film review', 'Stand Up Guys']
console.log 'text1: ', doc1.text()
console.log 'text2: ', doc2.text()
console.log 'tags2: ', doc2.tags() # [ ]
Running this code against the test_tags_kexp.html fixture gives different results for tags(), because the cleaner is called inside text().
So whenever the cleaner is called, we need to reload the document. I also did some refactoring along the way.
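To make the mechanism easier to see in isolation, here is a minimal sketch in plain JavaScript that does not use unfluff at all. The names `parse`, `cleaner`, `lazyShared`, and `lazyReloaded` are hypothetical stand-ins made up for this illustration, not the library's real internals. It only assumes what is described above: a cleaner that mutates the doc it receives, and a lazy object whose text() runs that cleaner.

```javascript
// Hypothetical stand-in parser: returns a NEW mutable doc on every call,
// the way re-parsing the HTML would.
function parse(html) {
  const tags = [];
  const re = /<a rel="tag">([^<]*)<\/a>/g;
  let m;
  while ((m = re.exec(html)) !== null) tags.push(m[1]);
  const body = html
    .replace(/<a rel="tag">[^<]*<\/a>/g, '') // drop tag links
    .replace(/<[^>]+>/g, '')                 // drop remaining markup
    .trim();
  return { tags, body };
}

// Hypothetical stand-in cleaner: mutates the doc it is given
// (here it throws away the tag links as "noise").
function cleaner(doc) {
  doc.tags.length = 0;
  return doc;
}

// Buggy variant: text() cleans the shared doc, so a later tags() sees the damage.
function lazyShared(html) {
  const doc = parse(html);
  return {
    tags: () => doc.tags.slice(),
    text: () => cleaner(doc).body,
  };
}

// Fixed variant: the doc is reloaded, so the cleaner only ever touches
// a private copy and tags() is independent of call order.
function lazyReloaded(html) {
  return {
    tags: () => parse(html).tags.slice(),
    text: () => cleaner(parse(html)).body,
  };
}

const html = '<p>Review</p><a rel="tag">film</a><a rel="tag">Stand Up Guys</a>';

const shared = lazyShared(html);
shared.text();
console.log(shared.tags()); // []

const reloaded = lazyReloaded(html);
reloaded.text();
console.log(reloaded.tags()); // [ 'film', 'Stand Up Guys' ]
```

The cost of the fixed variant is an extra parse per call, which is the trade-off this change accepts in exchange for order-independent results.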
Thanks for catching this! I'll take a look in detail when I have some time this weekend.
Sure. If you have ideas for how we can avoid reloading the document, please bring them up.
Sorry, I've been lax on reviewing this. Still plan to get to this very soon. Thanks!