Sandy4321

Results 306 comments of Sandy4321

Great news, thank you so much. One-hot data is sparse data whose values are only 0 or 1. The example I asked about is https://scikit-learn.org/stable/auto_examples/text/plot_document_classification_20newsgroups.html only you do need to use...

Cast to what? To a binary type, with only zeros and ones, i.e. one-bit values? My guess is that, with the code as it is now, it cannot be done... As you wrote...

Great. Then let's at least use byte-sized data? And, of course, a sparse data format. Huge RAM savings!
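To illustrate the RAM saving, here is a minimal sketch that assumes NumPy and SciPy are available. The matrix sizes and the helper `csr_nbytes` are made up for illustration and are not from the project discussed. It stores the same one-hot matrix once as dense float64 and once as a sparse CSR matrix with one-byte values.

```python
import numpy as np
from scipy import sparse

# Hypothetical sizes, chosen only for illustration.
n_rows, n_categories = 2_000, 200
rng = np.random.default_rng(0)
codes = rng.integers(0, n_categories, size=n_rows)  # one category per row

# Dense float64 one-hot matrix: 8 bytes per cell, almost all of them zeros.
dense = np.zeros((n_rows, n_categories), dtype=np.float64)
dense[np.arange(n_rows), codes] = 1.0

# Sparse CSR one-hot matrix with int8 values: only the ones are stored.
values = np.ones(n_rows, dtype=np.int8)
one_hot = sparse.csr_matrix(
    (values, (np.arange(n_rows), codes)), shape=(n_rows, n_categories)
)

def csr_nbytes(m):
    """Bytes used by the three arrays that back a CSR matrix."""
    return m.data.nbytes + m.indices.nbytes + m.indptr.nbytes

print("dense float64 bytes:", dense.nbytes)
print("sparse int8 bytes:  ", csr_nbytes(one_hot))
```

Most of the saving comes from the sparse layout. The one-byte values help too, but each stored entry still carries a column index that is wider than the value itself.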

Some ideas you can try with 8-bit numbers: https://arxiv.org/abs/2208.07339 https://huggingface.co/blog/hf-bitsandbytes-integration

https://github.com/solegalli it is better to do this explicitly; then you know exactly what is happening. CatBoost has an option for combining features, but then it becomes slow, and we also do not know what...

For example, scikit-learn's polynomial features were created for continuous variables, so we need to do the same for categorical ones.

https://github.com/pierrepita/categorical-data-generator

https://github.com/GLevV full code is needed. For example, you have a data frame, mydf, and you need to create a new data frame, newdf, with all possible combinations of the categorical features.
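A minimal sketch of what such code could look like, assuming pandas and reading "all possible combinations" as crossing the categorical columns with each other (every pair, triple, and so on). The helper `cross_categoricals`, its parameters, and the toy columns are hypothetical; only the names mydf and newdf come from the request above.

```python
from itertools import combinations

import pandas as pd

# Hypothetical example data.
mydf = pd.DataFrame(
    {
        "color": ["red", "blue", "red"],
        "size": ["S", "M", "L"],
        "shape": ["round", "square", "round"],
    }
)

def cross_categoricals(df, columns, max_order=None, name_sep="_x_", value_sep="|"):
    """Return df plus one new column per combination of `columns`.

    Each new column joins the string values of the combined columns,
    so every distinct tuple of categories becomes its own category.
    """
    max_order = max_order or len(columns)
    new_cols = {}
    for order in range(2, max_order + 1):
        for combo in combinations(columns, order):
            joined = df[combo[0]].astype(str)
            for col in combo[1:]:
                joined = joined + value_sep + df[col].astype(str)
            new_cols[name_sep.join(combo)] = joined
    return pd.concat([df, pd.DataFrame(new_cols, index=df.index)], axis=1)

newdf = cross_categoricals(mydf, ["color", "size", "shape"])
print(newdf.columns.tolist())
```

The number of new columns grows quickly with the number of features, so `max_order=2` (pairs only) may be a more practical setting for wide data.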