Add support for sparse data
Hey. Do you have plans to support sparse matrices as input? The requirement for dense input rules out a lot of real-world scenarios like text classification.
I would love to add support for sparse data!
How big is your dataset? Are you able to proceed with the current package version?
Hi Piotr. No, I cannot proceed. My dataset has 800k documents, converted with bag of words into 200k-dimensional vectors of TF-IDF scores. On a machine with 120 GB of RAM it doesn't fit in memory as a dense array. Many sklearn algorithms support sparse input, and so do xgboost and LightGBM, so it would be nice if your tool at least allowed sparse data for those algorithms.
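For a sense of scale, here is a rough back-of-the-envelope estimate of the memory involved. The document and feature counts come from the numbers above; the average of 100 nonzero terms per document and the float64/int32 storage widths are assumptions made only for illustration, so treat the sparse figure as a sketch rather than a measurement.

```python
# Rough memory estimate: dense array vs. CSR sparse matrix.
n_docs = 800_000        # from the dataset described above
n_features = 200_000    # from the dataset described above
bytes_per_float64 = 8   # ASSUMPTION: values stored as float64

# Dense: every cell is stored, zero or not.
dense_bytes = n_docs * n_features * bytes_per_float64
dense_gib = dense_bytes / 2**30

# ASSUMPTION (hypothetical): about 100 nonzero TF-IDF terms per document.
avg_nnz_per_doc = 100
nnz = n_docs * avg_nnz_per_doc

# CSR layout: one float64 value and one int32 column index per nonzero,
# plus an int32 row-pointer array of length n_docs + 1.
csr_bytes = nnz * (8 + 4) + (n_docs + 1) * 4
csr_gib = csr_bytes / 2**30

print(f"dense:  {dense_gib:,.0f} GiB")
print(f"sparse: {csr_gib:.2f} GiB (under the assumed sparsity)")
```

Under these assumptions the dense array needs over a terabyte, roughly ten times the RAM of the machine, while the sparse representation stays around one gigabyte.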