Pivot table
Pivot table support for DataFrame
Any updates on this? Seems like it would be a compelling feature to make this the default pandas go-to in the javascript ecosystem.
We will work on this, but it's not on the roadmap at the moment, unless someone decides to pick it up. Would you be interested?
Point me to the relevant code & I'll take a look!
tbh this is currently a bit of a blocker to me even doing a spike with danfo - but if we do it, then I'd like to know how hard it'd be to implement a pivot.
Here's a gist I wrote to do a pivot on an array of melted objects with lodash: doubt it's this easy! https://gist.github.com/nite/6ffda3d61278dccfb2152f8565492009
@nite I don't think it will be that hard to implement. To implement `pivot_table` fully, I think we need a way to access and display multi-index tables.
However, we can get started without that.
With my limited knowledge of pivot tables, we can build the main functionality of `pivot_table` without including some of the more complicated functionality found in pandas.
The main functionality of `pivot_table` from the pandas API, `pivot_table(data, values=None, index=None, columns=None, aggfunc='mean')`, can be implemented as follows:
- If `index` is given, it will be a list of column names. We need to group the DataFrame by each of the columns in `index`. Hence we can have an object containing each column and its groupby DataFrame, e.g. `{col1: df.groupby(['col1']), col2: df.groupby(['col2'])}`.
- If `values` is not given, then `df.groupby([col])` for each column in `index` simply groups the whole DataFrame by `col`. If `values` is given, then we group the DataFrame column in `values` by `col` from `index`, e.g. `{col1: groupby(['col1']).col(values), ...}`.
- If `columns` is given, that means we want to group the DataFrame by more than one column, e.g. `{col1: groupby(['col1', ...columns]), ...}`. But I think instead of doing this all at once like `groupby(['col1', ...columns])`, we will need to loop through the columns like this:

      for (const column of columns) {
        pivotTableGraph['col1'][column] = groupby(['col1', column])
      }

- If `aggfunc` is given as a single function name, then the operation will look like `groupby(['col']).mean()`, assuming `aggfunc` is `mean`. But if `aggfunc` is given as an object like `{col1: 'mean', col2: 'sum'}`, then we will use `groupby(['col']).agg(aggfunc)`.
At the end of these steps, we would have one large nested object containing the results. This object can be thought of as a graph.
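To make the nested-object idea concrete, here is a minimal sketch in plain JavaScript. It does not use danfo.js at all: `pivotTable`, `AGGREGATORS`, and the sample `sales` rows are hypothetical names made up for illustration, and it assumes a single `index` column, a single `columns` column, and a single `values` column.

```javascript
// Hypothetical helper, NOT part of danfo.js: pivots an array of row objects.
// Assumes one index column, one columns column and one values column.
const AGGREGATORS = {
  sum: (xs) => xs.reduce((a, b) => a + b, 0),
  mean: (xs) => xs.reduce((a, b) => a + b, 0) / xs.length,
  count: (xs) => xs.length,
};

function pivotTable(rows, { index, columns, values, aggfunc = "mean" }) {
  const aggregate = AGGREGATORS[aggfunc];
  if (!aggregate) throw new Error(`Unsupported aggfunc: ${aggfunc}`);

  // Step 1: bucket the raw values by (index value, column value).
  const buckets = {};
  for (const row of rows) {
    const rowKey = String(row[index]);
    const colKey = String(row[columns]);
    buckets[rowKey] = buckets[rowKey] || {};
    buckets[rowKey][colKey] = buckets[rowKey][colKey] || [];
    buckets[rowKey][colKey].push(row[values]);
  }

  // Step 2: reduce every bucket to a single aggregated number.
  const result = {};
  for (const [rowKey, cols] of Object.entries(buckets)) {
    result[rowKey] = {};
    for (const [colKey, xs] of Object.entries(cols)) {
      result[rowKey][colKey] = aggregate(xs);
    }
  }
  return result;
}

// Made-up sample data, for illustration only.
const sales = [
  { region: "north", product: "apple", amount: 10 },
  { region: "north", product: "apple", amount: 20 },
  { region: "north", product: "pear", amount: 5 },
  { region: "south", product: "apple", amount: 7 },
];

const pivoted = pivotTable(sales, {
  index: "region",
  columns: "product",
  values: "amount",
  aggfunc: "mean",
});
console.log(pivoted);
// { north: { apple: 15, pear: 5 }, south: { apple: 7 } }
```

The outer keys play the role of the pivot index and the inner keys the pivot columns. Supporting several index columns or a per-column `aggfunc` object would need extra nesting, which is where the multi-index question comes in.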
To have a concrete view of the above implementation steps, you can check out pivot_table examples here: https://www.analyticsvidhya.com/blog/2020/03/pivot-table-pandas-python/ and compare them with the above implementation details.
@nite I think this is all we need to implement the main functionality of pivot_table
Cc: @risenW
Also, to add to the above, @nite: you would implement this in the DataFrame class here. Your output is going to be a DataFrame, so something with this signature:
/**
 * Some doc here
 * @return DataFrame
 */
pivot() {
  const data = this.values; // get the inner array representing the DataFrame
  // your pivot code to manipulate the data
  // ...
  // ...
  // return a new DataFrame with the pivoted values
  // (pivoted_data and indx are produced by the pivot code above)
  const df = new DataFrame(pivoted_data, { columns: this.column_names, index: indx });
  return df;
}
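Since the constructor call above takes a 2D array plus column and index labels, a nested pivot result has to be flattened first. The following is only a sketch of that step in plain JavaScript: `toTable` is a hypothetical helper, not a danfo.js function, and filling missing cells with `null` is an assumption (pandas would show `NaN` there).

```javascript
// Hypothetical helper, NOT part of danfo.js: flattens a nested pivot result
// of the shape { rowLabel: { colLabel: value } } into a 2D array with labels.
function toTable(nested, fillValue = null) {
  const index = Object.keys(nested);

  // Collect every column label that appears in any row, in sorted order.
  const columns = [
    ...new Set(index.flatMap((rowKey) => Object.keys(nested[rowKey]))),
  ].sort();

  // Build one array per row, using fillValue where a combination is missing.
  const data = index.map((rowKey) =>
    columns.map((colKey) =>
      Object.prototype.hasOwnProperty.call(nested[rowKey], colKey)
        ? nested[rowKey][colKey]
        : fillValue
    )
  );

  return { data, columns, index };
}

// Made-up nested result, for illustration only.
const table = toTable({
  north: { apple: 15, pear: 5 },
  south: { apple: 7 },
});
console.log(table.data);    // [ [ 15, 5 ], [ 7, null ] ]
console.log(table.columns); // [ 'apple', 'pear' ]
console.log(table.index);   // [ 'north', 'south' ]
```

The three returned arrays line up with the `pivoted_data`, `columns`, and `index` arguments in the skeleton above, so they could be handed to the constructor in place of `this.column_names` and `indx`.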
Hi everyone, are there any updates on adding pivot functionality to a dataframe? I looked at the documentation and could not find anything on this matter. It would be fantastic to have pivoting in danfo.js.