dask-ml

Inclusion of databases?

Open CodingLappen opened this issue 4 years ago • 0 comments

Howdy Folks,

I see that parallel processing and linear regression are supported in a distributed fashion when the input is a CSV file. I want to run a clustering algorithm on data that is stored in a database. Yes, I know this has a big downside compared with simply loading the data from a file: the queries take time (even though they are simple), and the transfer from server to client takes even longer. But is it possible to use database rows as the input, while caching data (vectors, for example) in working memory and/or storing the algorithm's important metrics in a database table?

CodingLappen, Dec 22 '21 08:12