Stefan Müller issues

Results: 14 issues of Stefan Müller

Update the quanteda cheatsheet

The [quanteda cheatsheet](https://github.com/quanteda/quanteda/blob/master/tests/cheatsheet/quanteda-cheatsheet.pdf) has not been updated since 11/2018. Required changes:

- [ ] Check and update printed code outputs (esp. from `corpus()` and `dfm()`)
- [ ] Include **quanteda.textmodels**...

documentation

Raise error for invalid filtering using corpus_subset()/tokens_subset()/dfm_subset()

## Describe the bug

When a single `=` (instead of `==`) is used to subset a corpus with `corpus_subset()`, only a warning is raised. I think an error (including a comprehensible message) would be more appropriate...

bug

robustness

Add example on changing settings for corpus object

I find it a bit confusing how to change the settings for `corpus()`. A clarification in `?settings` or `?corpus` would be helpful. For instance, how do I specify...

documentation

URL to file for encoding() example invalid

The man page for `encoding()` lists the following example, but the file doesn't seem to be available anymore. ```r library(readtext) myreadtext Error in cache_remote(url, ignore_missing, cache, name, verbosity): Not Found (HTTP...

Attempt to solve issue #59

I tried to solve issue #59, but the code does not run yet, so please don't merge it for now. Could you create a new branch for this issue? Let...

Update quanteda cheatsheet

## Contributed Cheatsheet Information

Cheatsheet Name: *quanteda*

1 sentence description of the contents: Cheatsheet for *quanteda*, an R package for managing and analyzing text

Your Name (as you want to...

Instructions on how to enable parallelisation

As discussed in #2387, this PR adds information on parallelisation to the README and a new article on parallelisation in **quanteda** >v4.0.0 for [quanteda.io](https://quanteda.io). @kbenoit: thanks for reviewing, editing, and rebuilding...

documentation

Add more explicit information on enabling parallelization in quanteda >v4.0.0

Parallelisation on Linux, macOS, and Windows now works, but users need to have TBB installed. Currently, the [README](https://github.com/quanteda/quanteda) lists these installation instructions under **Compile from source**, but I think we...

textmodel_svm() fails when number of documents identified to train exceeds 66,000

`textmodel_svm()` does not work when the number of documents used to train the classifier exceeds 66,000 on a MacBook Pro with 32GB RAM. ```r library(quanteda) #> Package version: 2.0.1 #>...

bug

Rescale the support vectors in textmodel_svmlin() into probabilities for multiple classes

As discussed in #23, `textmodel_svm()` fails with a large number of training documents. However, it works when using the more powerful `textmodel_svmlin()` instead. Yet, currently, `textmodel_svmlin()` does not rescale the...