ffbase

error in ffdfrbind.fill()

Open JakubKomarek opened this issue 6 years ago • 7 comments

I am trying to rbind two ffdf objects, following the example from the CRAN documentation. However, I always get this error:

Error in if (by < 1) stop("'by' must be > 0") : missing value where TRUE/FALSE needed
In addition: Warning message:
In chunk.default(from = 1L, to = 150L, by = c(logical = 46116860184273880), : NAs introduced by coercion to integer range

I have also tried using ffbase2, creating tbl.ffdf objects and then joining both data frames with dplyr, but the same error occurs.

Any advice will be appreciated.

x <- ffdfrbind.fill(as.ffdf(iris), as.ffdf(iris[, c("Sepal.Length", "Sepal.Width", "Petal.Length")]))

JakubKomarek (Aug 21 '19 08:08)

Thanks for filing the issue:

x <- ffdfrbind.fill(as.ffdf(iris),
                    as.ffdf(iris[, c("Sepal.Length", "Sepal.Width",
                                     "Petal.Length")]))

works on the machines I tested on (Linux and Windows).

What happens if you manually set the missing columns to NA and do an ffdfappend?

x1 <- as.ffdf(iris)
x2 <- as.ffdf(iris[, c("Sepal.Length", "Sepal.Width"
, "Petal.Length")])
x2$Petal.Width <- ff(NA, vmode = "logical", length = nrow(x2))
x2$Species <- ff(NA, vmode = "logical", length = nrow(x2))

x <- ffdfappend(x1, x2)

Still not working?

edwindj (Aug 22 '19 09:08)

Hi,

Thank you for your swift answer! I am using Windows and I still get:

Error in if (by < 1) stop("'by' must be > 0") : missing value where TRUE/FALSE needed
In addition: Warning message:
In chunk.default(from = 1L, to = 150L, by = c(logical = 46116860184273880), : NAs introduced by coercion to integer range

Best wishes,

Jakub Komárek

JakubKomarek (Aug 22 '19 17:08)

Hi,

I tried the example in RStudio Cloud and it worked. Do you have any idea why it does not work in my local RStudio?

Thank you

Jakub

JakubKomarek (Aug 26 '19 07:08)

Not at the moment. Could you post the output of

sessionInfo()

?

edwindj (Aug 26 '19 07:08)

R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=Czech_Czechia.1250 LC_CTYPE=Czech_Czechia.1250 LC_MONETARY=Czech_Czechia.1250 LC_NUMERIC=C LC_TIME=Czech_Czechia.1250

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] ffbase_0.12.7 ffbase2_0.2 dplyr_0.8.3 ff_2.2-14 bit_1.1-14

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.2 rstudioapi_0.10 magrittr_1.5 usethis_1.5.1 devtools_2.1.0 tidyselect_0.2.5 pkgload_1.0.2 R6_2.4.0
 [9] rlang_0.4.0 fastmatch_1.1-0 tools_3.6.1 pkgbuild_1.0.4 sessioninfo_1.1.1 cli_1.1.0 withr_2.1.2 remotes_2.1.0
[17] lazyeval_0.2.2 assertthat_0.2.1 digest_0.6.20 rprojroot_1.3-2 tibble_2.1.3 crayon_1.3.4 processx_3.4.1 purrr_0.3.2
[25] callr_3.3.1 fs_1.3.1 ps_1.3.0 testthat_2.2.1 memoise_1.1.0 glue_1.3.1 pillar_1.4.2 compiler_3.6.1
[33] desc_1.2.0 backports_1.1.4 prettyunits_1.0.2 pkgconfig_2.0.2

JakubKomarek (Aug 26 '19 07:08)

I just wanted to say that I am really grateful for your help!

Jakub

JakubKomarek (Aug 26 '19 11:08)

I cannot reproduce the bug on Rhub (which runs on Windows 2008 SP2), but don't despair...

Technically this is in the realm of ff (not ffbase), but I do have a hunch about what the problem might be, based on the error message and a look at the ff code (which is not mine).

ff uses chunking to process large vectors and data.frames. The size of a chunk is determined by the option "ffbatchbytes". It seems that on your Windows 10 machine(s) the value of this option isn't set correctly. Maybe that is because you are using 32-bit R (so one option is to switch to 64-bit).
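
To see why an oversized chunk size would produce exactly this pair of messages, here is a minimal base-R sketch. It assumes, based only on the warning text, that chunk.default coerces `by` to integer before the `by < 1` check runs. It does not call ff at all, and the variable names are illustrative.

```r
# The 'by' value reported in the warning message in this thread.
by <- 46116860184273880

# It is far larger than the largest value an R integer can hold.
by > .Machine$integer.max

# Coercion to integer therefore yields NA (normally with the warning
# "NAs introduced by coercion to integer range", silenced here).
by_int <- suppressWarnings(as.integer(by))
is.na(by_int)

# A comparison with NA is itself NA, and if() cannot branch on NA.
# tryCatch() captures the resulting error message instead of stopping.
msg <- tryCatch(
  if (by_int < 1) stop("'by' must be > 0"),
  error = function(e) conditionMessage(e)
)
msg
```

If this reading is right, any setting that brings the computed chunk size back into integer range should make both the warning and the error disappear.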

ff sets its options automatically when library(ff) is called (see the following code):

copied from ff:::.onLoad()

    if (is.null(getOption("ffmaxbytes"))) {
        if (.Platform$OS.type == "windows") {
            if (getRversion() >= "2.6.0")
                options(ffmaxbytes = 0.5 * memory.limit() * (1024^2))
            else options(ffmaxbytes = 0.5 * memory.limit())
        }
        else {
            options(ffmaxbytes = 0.5 * 1024^3)
        }
    }

I suggest you set the ffmaxbytes option manually and then run the examples again.

# e.g. 500MB
options(ffmaxbytes = 500 * (1024^2))
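
The paragraph above names "ffbatchbytes" as the option that determines the chunk size, while the copied snippet and the suggestion use "ffmaxbytes". Which of the two is at fault on the affected machine is not established in this thread, so a cautious sketch is to inspect both and set both by hand after library(ff). The 16 MB value for ffbatchbytes is an arbitrary illustrative choice, not a value taken from ff.

```r
# Inspect the two ff options mentioned in this thread. NULL means the
# option is unset (for example because ff has not been loaded yet).
getOption("ffbatchbytes")
getOption("ffmaxbytes")

# Set both by hand; options() returns the previous values so they can
# be restored later with options(old).
old <- options(
  ffbatchbytes = 16 * 1024^2,   # 16 MB per chunk (assumed value)
  ffmaxbytes   = 500 * 1024^2   # 500 MB, as suggested above
)

# Both values are now small enough to be representable as R integers.
getOption("ffbatchbytes") <= .Machine$integer.max
getOption("ffmaxbytes") <= .Machine$integer.max
```

Printing the two options before changing them also gives a concrete value to report back if the error persists.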

edwindj (Aug 28 '19 07:08)