Dataframe class name
I think there is consensus (correct me if I'm wrong) on having a 2-D structure where (at least) columns are labelled, and where a whole column shares a type. More specific discussions about this structure can be had in #2.
In this issue, I'd like to discuss how we should name the class representing this structure. We've been using dataframe for the concept so far, and it's how the class is named in pandas, vaex, Modin, R and others. But in https://github.com/pydata-apis/dataframe-api/issues/14#issuecomment-644234095 it was proposed that we consider other names. I list here the options proposed in that comment and a couple more. I propose that people write their username next to their preferred option, and use the comments to expand on why if needed.
- DataFrame
- @datapythonista
- @devin-petersohn
- @TomAugspurger
- @maartenbreddels
- Frame
- Table
- Grid
- Dataset
- DataGrid
- DataMatrix
- DataSheet
- DataPage
- RowCol
- Screen
- Slate
- Panel
- Lattice
- Board
- DataBoard
Also, I think we should decide on capitalization. I guess these are the only options (using dataframe as the example, but applying to whichever option is preferred from the list above):
- DataFrame (Class capitalization, 2 words)
- @TomAugspurger
- @maartenbreddels
- Dataframe (Class capitalization, single word)
- dataframe (Type capitalization, like in int, datetime.datetime, numpy.array)
- @datapythonista
- @devin-petersohn
I personally think that in Python there is some consistency in using lowercase capitalization for types and data structures: int, str, list, datetime, dict, tuple, array,... and it feels like dataframe belongs to that group, more than a general class.
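To make the capitalization point concrete, here is a small sketch that checks the names of a few built-in and standard-library types. The helper `is_lowercase_name` and the two lists are only illustrative choices, not part of any proposal; the sketch just shows that both conventions coexist in Python itself, so it does not settle the question either way.

```python
import collections
import datetime

# Built-in and standard-library data types whose names are all lowercase.
lowercase_types = [int, str, list, dict, tuple, datetime.datetime, collections.deque]

# Standard-library classes that use the CapWords convention instead.
capwords_types = [collections.OrderedDict, collections.Counter]


def is_lowercase_name(cls):
    """Return True if the class name contains no uppercase characters."""
    return cls.__name__ == cls.__name__.lower()


for cls in lowercase_types + capwords_types:
    print(f"{cls.__name__:12} lowercase={is_lowercase_name(cls)}")
```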
But if @devin-petersohn is ok with DataFrame, and there are no more opinions, I'll open a PR in the RFC for DataFrame (since it's the one being used so far).
I am only okay with DataFrame if Array (capitalization) is how we are spelling the array protocol. As long as we are consistent between the two, I am okay.
+1 to matching what the array / Array API standard does.
> +1 to matching what the array / Array API standard does.
We didn't specify a name there; there's just "an array object". Reason: it isn't needed (there's no way to call <array>.__init__), and existing libraries will not rename their existing array, Array, NDArray, Tensor objects anyway.
Status quo is DataFrame.
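As a rough illustration of the comment above about the array standard not naming its class, the sketch below shows how consuming code can depend on behaviour rather than on a class name. Everything here is hypothetical: the `HasShape` protocol, the toy `Tensor` and `NDArray` classes, and the `ndim` helper are made up for the example and are not taken from the array API standard or from any library.

```python
from typing import Protocol, Tuple


class HasShape(Protocol):
    """Hypothetical structural type: any object exposing a `shape` tuple."""

    @property
    def shape(self) -> Tuple[int, ...]: ...


class Tensor:
    """Toy stand-in for one library's array class (hypothetical)."""

    def __init__(self, shape):
        self._shape = tuple(shape)

    @property
    def shape(self):
        return self._shape


class NDArray:
    """Toy stand-in for another library's array class (hypothetical)."""

    def __init__(self, shape):
        self.shape = tuple(shape)


def ndim(x: HasShape) -> int:
    # Relies only on the `shape` attribute, never on the class name.
    return len(x.shape)


print(ndim(Tensor((2, 3))), ndim(NDArray((4,))))
```

Neither toy class inherits from `HasShape` or shares a name with the other, yet both work with `ndim`, which is the sense in which a standard can describe "an array object" without fixing what the class is called.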