Manuel Bickel

Results: 13 comments by Manuel Bickel

We might add some aspects regarding downstream analysis (and maybe visualization depending on the target audience or format of publication). Regarding downstream analysis we might do (feel free to change/adapt/add):...

Thank you for your reply. So I was halfway on the right track by introducing dashes into the ngrams of the dictionary. Something like `cc_model$collocation_stat

That's fine, I guess you have some more important/complex problems to solve than some dictionary lookups. For the time being I think I can use my workaround, but as soon...

Just realized that I had forgotten to insert the helper function that finds trailing ngrams into the code, sorry for that. I have updated my last code comment accordingly so...

This is an update on this issue, though not yet a solution. As per your first comment in this thread, I have created a `collocation_stat` from a cc_dictionary (here only...

As a side note / hint on spell checking: I just stumbled over the [ropensci/hunspell](https://github.com/ropensci/hunspell) package. I have not dug into the details of the implementation, but the basic idea is that...

Thank you for your question. As in all tasks regarding the selection of the right number of clusters, topics, etc., there is no single correct answer. Each selection criterion has...

With respect to the Jensen-Shannon divergence, I think that the fix proposed by [Maren-Eckhoff](https://github.com/maren-eckhoff) and pending as an [open pull request](https://github.com/cpsievert/LDAvis/pull/77) already solves the problem. See adapted function and test...

Maybe my comment was misleading, sorry. I agree that LDAvis will have to be reimplemented, just wanted to confirm that the fix works for this purpose. Hence, in the first...

I have not worked with short texts. Therefore, I have no good sources at hand, unfortunately. Maybe Japanese haiku, to make text mining more philosophical ;-)? Side note: sorry for...