Kenneth Benoit

104 issues by Kenneth Benoit

This comes from https://github.com/quanteda/quanteda.sentiment/issues/11, which is a more general question about how a function can return the set of original tokens matching a dictionary lookup, not just using `tokens_select()`, but...

question
dictionary
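
To make the distinction in the issue above concrete, here is a minimal sketch, assuming a current **quanteda** installation. It contrasts `tokens_select()`, which keeps the original tokens that match a dictionary, with `tokens_lookup()`, which replaces matches with dictionary keys. The text and dictionary are invented for illustration and do not come from the linked issue.

```r
library(quanteda)

# Illustrative input and dictionary (assumptions, not from the issue)
toks <- tokens("The good, the bad and the ugly.")
dict <- dictionary(list(positive = "good",
                        negative = c("bad", "ugly")))

# Keeps the original tokens that match any dictionary value
matched <- tokens_select(toks, pattern = dict)

# Replaces each matching token with the key it belongs to
keys <- tokens_lookup(toks, dictionary = dict)

print(matched)
print(keys)
```

The first call discards which key each token matched, and the second discards the original token; returning both together is the gap the issue describes.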

- [ ] Add an article about extending **quanteda**
  - suggestions on when to load which packages
  - how to use Imports, how to extend generics
- [ ] Update...

documentation
modularisation

Following the discussion in #1138, we are thinking of extending `convert()` so that we could go from a dfm using:

```r
convert(anydfm, to = "kerasR")
```

Problem is we haven't...

question
dfm
compatibility
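
For context, this is a small sketch of how `convert()` works today with a target that already exists, assuming a current **quanteda** installation. The `"kerasR"` target in the issue above is a proposal, so it is not used here; a dense matrix is shown only as one plausible intermediate format, which is an assumption rather than the design discussed in the issue.

```r
library(quanteda)

# Tiny illustrative dfm (document names and texts are made up)
dfmat <- dfm(tokens(c(d1 = "a b b", d2 = "b c")))

# Convert to a dense base R matrix, one of the existing convert() targets
mat <- convert(dfmat, to = "matrix")

print(mat)
```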

This looks awesome: https://rstudio.github.io/learnr/ Would be nice to integrate this with https://tutorials.quanteda.io.

documentation

Add the ability to extract parts of speech (using OpenNLP) as features, as an option to dfm. This means we should think of modularising the objects that define dfm "features"....

enhancement
tokens
design

What would be reasonable limits on what we allow a user to request from the pattern-matching functions? It appears to be an issue mainly in the number of patterns....

question
performance

This is a restart of #536, following on two use cases I've encountered in the past two days.

### Idea

Provide a way for a `tokens` object to store the...

I've started a Request for Comment to serve as an ongoing discussion board, rather than a string of issues. See https://github.com/quanteda/quanteda/wiki/Proposal-for-changing-docvars. This will affect or resolve the following issues: -...

tokens
infrastructure
design
dfm
corpus

Right-to-left languages pose special challenges for **quanteda** (and R in general) for tokenising and indexing, although this may depend on locale issues that are hard for us to test (since...

enhancement
tokens

Exceptions such as Mr., Dr., Prof., etc., are currently hard-wired into `tokenize.character()`. These could be listed for each language and made user-accessible through `settings()`.

enhancement
tokens
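
The idea of per-language exception lists can be sketched in base R. Everything named below (`abbrev_exceptions`, `split_sentences()`, the `<DOT>` placeholder, and the German entries) is hypothetical and only illustrates the shape of such a setting; it is not how `tokenize.character()` or `settings()` is implemented.

```r
# Hypothetical per-language exception lists that a user could edit
abbrev_exceptions <- list(
  en = c("Mr.", "Dr.", "Prof."),
  de = c("Hr.", "Fr.", "Dr.")
)

# Split text into sentences without breaking after a listed abbreviation
split_sentences <- function(x, language = "en",
                            exceptions = abbrev_exceptions) {
  protected <- x
  for (abbr in exceptions[[language]]) {
    # Mask the period in each exception so it is not seen as a sentence end
    masked <- sub(".", "<DOT>", abbr, fixed = TRUE)
    protected <- gsub(abbr, masked, protected, fixed = TRUE)
  }
  # Split on whitespace that follows sentence-final punctuation
  parts <- strsplit(protected, "(?<=[.!?])\\s+", perl = TRUE)[[1]]
  # Restore the masked periods
  gsub("<DOT>", ".", parts, fixed = TRUE)
}

sentences <- split_sentences("Dr. Smith met Prof. Jones. They talked.")
print(sentences)
```

Keeping the exceptions in a plain named list means a user could add a language or an abbreviation without touching the splitting logic.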