xlsx files not loading
I just tried to load an xlsx file using the load function.
Evidently this no longer works, because under the hood it depends on the Python xlrd package, which no longer supports xlsx files.
There's a disclaimer on the website.
I did notice that #26 will obviously fix this.
As a workaround, you can downgrade to the last 1.x release of xlrd:
using Conda
Conda.add("xlrd==1.2.0")
The xlrd version could probably be pinned here: https://github.com/queryverse/ExcelReaders.jl/blob/master/src/ExcelReaders.jl#L12
If speed is your concern with large data files (as it is for me), you can gain roughly a factor of 2 by using pandas via PyCall:
EDIT: There is still an error in this function, sorry
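using PyCall, DataFrames
pd = pyimport("pandas")
# Read an Excel file with pandas and convert the result to a Julia DataFrame.
function read_excel(f; kwargs...)
    pdf = pd.read_excel(f; kwargs...)
    vals = pdf.values  # fetch the underlying array once, not once per column
    DataFrame(Any[vals[:, i] for i in 1:size(vals, 2)], Symbol.(pdf.columns))
end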
Forcing the openpyxl engine, as recommended by xlrd, makes performance worse again:
julia> @time DataFrame(load(f, "Tabelle1"));
0.149713 seconds (221.37 k allocations: 6.085 MiB, 12.28% gc time)
julia> @time read_excel(f);
0.077299 seconds (1.12 k allocations: 2.093 MiB)
julia> @time read_excel(f, engine = "openpyxl");
0.135302 seconds (1.13 k allocations: 2.094 MiB)
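Single @time calls at this scale can be noisy, so it may be worth repeating each measurement before reading much into the differences. Below is a minimal sketch of one way to do that; `min_time` is a hypothetical helper written for illustration (it is not part of ExcelReaders.jl, PyCall or pandas), and the stand-in workload only demonstrates that the helper runs.

```julia
# Hypothetical helper (not from any package): smallest wall-clock time of
# `fn` over `n` runs, after one warm-up call so compilation is not measured.
function min_time(fn; n = 5)
    fn()  # warm-up run, result discarded
    return minimum(@elapsed(fn()) for _ in 1:n)
end

# Self-contained demonstration with a cheap stand-in workload.
t = min_time(() -> sum(sqrt, 1:10^6))
println("fastest of 5 runs: ", t, " s")

# With the definitions from this thread it would be called as, for example:
# min_time(() -> DataFrame(load(f, "Tabelle1")))
# min_time(() -> read_excel(f))
# min_time(() -> read_excel(f, engine = "openpyxl"))
```

Taking the minimum rather than the mean is a design choice here: it tends to filter out GC pauses and other one-off interruptions. The BenchmarkTools package offers a more thorough version of the same idea.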