
Loading a small dataset takes a long time

Open: davidbp opened this issue 7 years ago • 1 comment

Hello,

I tried to load a small dataset, and it takes a long time. Running using Queryverse also takes a significant amount of time.

@time using Queryverse
 17.643986 seconds (43.33 M allocations: 2.193 GiB, 6.44% gc time)
@time df = DataFrame(load("iris.csv"))
17.341491 seconds (55.15 M allocations: 2.607 GiB, 11.01% gc time)

The second time I load the data, it is much faster, though.

@time df = DataFrame(load("iris.csv"))
 0.001057 seconds (5.01 k allocations: 210.609 KiB)

Is there a command to precompile or another way to make this faster?

davidbp, Oct 21 '18 20:10

I'm afraid this is a general problem with Julia right now: precompilation doesn't actually save machine code, so even with precompilation, a lot of code needs to be recompiled in every new Julia session...
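The first-call cost described above can be reproduced without any packages. The following is only a minimal sketch: `sumsq` is a hypothetical function made up for illustration, and the timings will differ by machine and Julia version.

```julia
# Hypothetical function, used only to illustrate first-call latency.
sumsq(xs) = sum(x -> x^2, xs)

data = rand(1_000)

# The first call includes the time Julia spends compiling
# sumsq for a Vector{Float64} argument.
first_call = @elapsed sumsq(data)

# The second call reuses the machine code compiled during the first call.
second_call = @elapsed sumsq(data)

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

In a fresh session the first number is typically much larger than the second, which is the same pattern as the two `load` timings in the question, just on a smaller scale.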

I think there are only two options that could work right now: 1) You could try to compile these packages into your sysimage. I've never done that, and it might be very complicated... 2) Load only what you need with using CSVFiles, DataFrames instead of using Queryverse. That cuts down the number of packages being loaded, so it should reduce the time the using statement takes. It won't help with the second issue, the slow first call to load...
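A minimal sketch of option 2, assuming CSVFiles and DataFrames are installed. The temporary `sample.csv` file and its contents are made up for illustration and stand in for `iris.csv`.

```julia
# Load only the two packages needed, instead of all of Queryverse.
using CSVFiles, DataFrames

# Write a tiny, made-up CSV file so the example is self-contained.
path = joinpath(mktempdir(), "sample.csv")
write(path, "a,b\n1,2\n3,4\n")

# The first load still pays the compilation cost in a fresh session.
t_first = @elapsed DataFrame(load(path))

# The second load reuses the code compiled during the first one.
t_second = @elapsed DataFrame(load(path))

df = DataFrame(load(path))
println("first load:  ", t_first, " s")
println("second load: ", t_second, " s")
```

This only shortens the `using` step. For option 1, the PackageCompiler.jl package provides a `create_sysimage` function; that route is not shown here.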

davidanthoff, Oct 28 '18 01:10