Kenneth Benoit
### Error
This should parse out the filepaths, not filepaths _and_ filenames.
```r
> (rt3 list.files(path = paste0(DATA_DIR, "txt/movie_reviews/"), recursive = TRUE)
[1] "neg/neg_cv000_29416.txt" "neg/neg_cv001_19502.txt" "neg/neg_cv002_17424.txt"
[4] "neg/neg_cv003_12683.txt" "neg/neg_cv004_12641.txt" "pos/pos_cv000_29590.txt"...
```
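
To make the distinction between a file's path and its name concrete, here is a minimal sketch using only base R and the bundled `tools` package. The three relative paths are copied from the listing in this issue; the variable names are illustrative, and this is not meant to describe how the package itself splits paths.

```r
# Relative paths as returned by list.files(..., recursive = TRUE) in the issue
paths <- c(
  "neg/neg_cv000_29416.txt",
  "neg/neg_cv001_19502.txt",
  "pos/pos_cv000_29590.txt"
)

# Directory component only (the "filepath" part)
dirs <- dirname(paths)

# File name only, with its extension
files <- basename(paths)

# File name with the extension removed
stems <- tools::file_path_sans_ext(files)

print(data.frame(dir = dirs, file = files, stem = stems))
```

Splitting with `dirname()` and `basename()` keeps the directory and the file name in separate vectors, so either one can be parsed without the other leaking in.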
Apparently `tm::readPDF()` can do this...
> R currently only emits a warning when if/while statement is used with a condition of length greater than one, e.g.
>
> ```r
> if (c(1,2)>0) TRUE
> [1]...
> ```
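
A small sketch of how code can avoid depending on this behaviour: reduce the comparison to a single logical value with `all()` or `any()` before branching. The variable names are illustrative. Note that from R 4.2.0 onward a condition of length greater than one is an error rather than a warning, so the last step only records which of the two the running R version signals.

```r
x <- c(1, 2)

# all() collapses the element-wise comparison to one TRUE/FALSE
if (all(x > 0)) {
  result_all <- "every element is positive"
} else {
  result_all <- "at least one element is not positive"
}

# any() asks the weaker question: does at least one element qualify?
result_any <- if (any(x > 1)) "some element exceeds 1" else "no element exceeds 1"

# Record what the unreduced condition signals on this R version:
# a warning in older releases, an error in R 4.2.0 and later.
outcome <- tryCatch(
  {
    if (x > 0) TRUE
    "no condition signalled"
  },
  warning = function(w) "warning",
  error = function(e) "error"
)

print(c(result_all, result_any, outcome))
```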
`text_field` is not working correctly.
```r
> readtext("tests/data/xls/test3.xlsx", text_field = "other")
readtext object consisting of 4 documents and 4 docvars.
# data.frame [4 x 6]
doc_id other colour text number...
```
Related to https://github.com/kbenoit/readtext/issues/13 but that did not solve it. We need to be able to import the files in Chapter 10 of _Text Analysis with R for Students of Literature_...
From [**quanteda** issue #380](https://github.com/kbenoit/quanteda/issues/380):
> Apache Tika (https://tika.apache.org/) might be useful.
> The KNIME folks just added that to their text mining nodes.

Thanks @BobMuenchen.
Write a package to wrap around http://www.gnu.org/software/unrtf/?
Our README states:
> (All encoding functions are handled by the stringi package.)

But this is hardly true, since we use the base `iconv()` that happens through `file()` in `get-functions.R`,...
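
For comparison, here is a minimal sketch of the same Latin-1 to UTF-8 conversion done once with **stringi** and once with base `iconv()`. It assumes the **stringi** package is installed; the byte vector and variable names are made up for illustration and are not taken from `get-functions.R`.

```r
library(stringi)

# "café" as Latin-1 bytes (0xE9 is "é" in ISO-8859-1)
latin1_bytes <- as.raw(c(0x63, 0x61, 0x66, 0xe9))

# stringi route: stri_encode() accepts a raw vector and returns a UTF-8 string
via_stringi <- stri_encode(latin1_bytes, from = "ISO-8859-1", to = "UTF-8")

# base route: turn the bytes into a string, then convert with iconv()
via_base <- iconv(rawToChar(latin1_bytes), from = "latin1", to = "UTF-8")

# Both routes should yield the same code points
print(utf8ToInt(via_stringi))
print(utf8ToInt(via_base))
```

Routing every conversion through one of the two would make the README statement easier to keep accurate.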