Sort huge CSV by 7th column

Open baerbock opened this issue 7 years ago • 1 comments

I would like to sort a 300 MB CSV with a header (https://data.open-power-system-data.org/renewable_power_plants/2018-03-08/renewable_power_plants_DE.csv) by its 7th column, electrical_capacity, which contains numerical values:

0.075
0.02937
0.4
0.303

How could I do this with csvtools? Thank you very much for any guidance.

baerbock, Jan 08 '19 01:01

There is no built-in tool for sorting yet, primarily because sorting requires keeping all the output somewhere in memory (or on disk): the last input line might be the first line of the output.
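To illustrate the "in memory or on disk" point, here is a minimal sketch of an external merge sort in Python. It is not part of csvtools. The function name, the `chunk_rows` parameter, and the default `key_index=6` (the 7th column) are assumptions made for this example, and it assumes every data row has a parseable number in that column.

```python
import csv
import heapq
import itertools
import tempfile


def _key(row, key_index):
    # Assumption: the key column always holds a parseable number.
    return float(row[key_index])


def _rewind(handle):
    # Rewind a finished run so it can be read back from the start.
    handle.seek(0)
    return csv.reader(handle)


def external_sort_csv(src_path, dst_path, key_index=6, chunk_rows=100_000):
    """Sort a CSV with a header by one numeric column, using sorted runs on disk."""
    runs = []
    try:
        with open(src_path, newline="", encoding="utf-8") as src:
            reader = csv.reader(src)
            header = next(reader)
            while True:
                # Only chunk_rows rows are held in memory at any time.
                chunk = list(itertools.islice(reader, chunk_rows))
                if not chunk:
                    break
                chunk.sort(key=lambda r: _key(r, key_index))
                run = tempfile.TemporaryFile("w+", newline="", encoding="utf-8")
                csv.writer(run).writerows(chunk)
                runs.append(run)
        with open(dst_path, "w", newline="", encoding="utf-8") as dst:
            writer = csv.writer(dst)
            writer.writerow(header)
            # Merge the sorted runs lazily, one row per run in memory.
            merged = heapq.merge(
                *(_rewind(r) for r in runs),
                key=lambda r: _key(r, key_index),
            )
            writer.writerows(merged)
    finally:
        for run in runs:
            run.close()
```

Something like `external_sort_csv("renewable_power_plants_DE.csv", "sorted.csv")` would then write the sorted file. No output line can be written until the whole input has been read, which is the constraint described above.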

I guess I would say: import it into a database (SQLite comes to mind) and have fun there.
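A rough sketch of the database route, using Python's standard `sqlite3` and `csv` modules. The function name, the table name `rows`, and the generic column names `c0`, `c1`, ... are made up for this example. It assumes every row has the same number of fields as the header. Empty or non-numeric values in the key column are cast to 0.0 by SQLite and therefore sort as zero.

```python
import csv
import sqlite3


def sort_csv_with_sqlite(src_path, dst_path, key_index=6, db_path=":memory:"):
    """Load a CSV with a header into SQLite and write it back sorted by one numeric column."""
    with open(src_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)
        # Generic column names avoid quoting issues with the real header text.
        cols = [f"c{i}" for i in range(len(header))]
        conn = sqlite3.connect(db_path)
        try:
            conn.execute(f"CREATE TABLE rows ({', '.join(cols)})")
            placeholders = ", ".join("?" for _ in cols)
            conn.executemany(f"INSERT INTO rows VALUES ({placeholders})", reader)
            conn.commit()
            # CAST makes the comparison numeric; rowid keeps ties in input order.
            query = f"SELECT * FROM rows ORDER BY CAST(c{key_index} AS REAL), rowid"
            with open(dst_path, "w", newline="", encoding="utf-8") as dst:
                writer = csv.writer(dst)
                writer.writerow(header)
                writer.writerows(conn.execute(query))
        finally:
            conn.close()
```

With the default `db_path=":memory:"` the whole table lives in RAM. Passing a file path instead keeps the table on disk, which may be preferable for a 300 MB input.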

Otherwise, you could use csvawk to reorder the columns so that the 7th column comes first, and then pipe that through the regular GNU sort. If you want, you could then pipe it through csvawk again to restore the original column order.

DavyLandman, Jan 08 '19 08:01