stdlib icon indicating copy to clipboard operation
stdlib copied to clipboard

feat: add `plot/table/unicode`

Open Snehil-Shah opened this issue 1 year ago • 17 comments

Towards #2067

Description

What is the purpose of this pull request?

This pull request:

  • adds @stdlib/plot/table/unicode

Table options:

  • alignment: datum's cell alignment. Default: 'right'.
  • borders: border characters. Default: '─ │ ─ │'.
  • cellPadding: cell padding. Default: 1.
  • columnSeparator: column separator character. Default: '│'.
  • corners: corner characters. Default: '┌ ┐ ┘ └'.
  • headerSeparator: header separator character. Default: '─'.
  • joints: joint characters. Default: '┼ ┬ ┤ ┴ ├'.
  • marginX: horizontal output margin. Default: 0.
  • marginY: vertical output margin. Default: 0.
  • maxCellWidth: maximum cell width (excluding padding). Default: FLOAT64_MAX.
  • maxOutputWidth: maximum output width (including margin). Default: FLOAT64_MAX.
  • rowSeparator: row separator character. Default: 'None'.

Table methods:

  • addRow(row): adds a row to data.
  • getData(): gets current data and headers in an object
  • render(): renders table
  • setData(data,[headers]): sets data

Related Issues

Does this pull request have any related issues?

This pull request:

  • subtask of #2067

Questions

Any questions for reviewers of this pull request?

The current implementation allows for loose parsing. So, if the data is not exactly tabular, in some cases, it still tries to parse, adjust and make sense of the data.

These are the only two cases where this can be seen:

data = {
    'col1': [ ... ],
    'col2': [ ... ]
};
headers = [  'col1', 'col2', 'col3'  ];

Parser will automatically ignore 'col3' and value of this._headers after parsing will be [ 'col1', 'col2' ].

data = [ {  'col1': 1, 'col2': 4 }, { 'col1' : 5 } ];
headers = [  'col1', 'col2'  ];

Parser will automatically fill the missing col2 value with undefined.

Should we be raising an error instead?

Other

Any other information relevant to this pull request? This may include screenshots, references, and/or implementation notes.

No.

Checklist

Please ensure the following tasks are completed before submitting this pull request.


@stdlib-js/reviewers

Snehil-Shah avatar Jun 20 '24 00:06 Snehil-Shah

Re: loose parsing. I'd just raise an error. If the number of headers is off, that is probably a user bug.

When an object is missing a field, that is trickier, as could mean missing data. However, again, I'd raise an error here. A user should arguably be explicitly in terms of what value should represent missing data (e.g., undefined, null, NaN, '', etc).

Re: methods. I'd opt for closer parity with sparklines. Namely,

addRow => push()
getData/setData => data (accessor)

and I would add a headers accessor for getting/setting headers. You shouldn't have to provide data again in order to update headers. This should be a separate accessor to allow independent updating.

And similar to sparklines, I'd make the table an event emitter with render and change events.

Re: maxCellWidth. I wonder if it would be better to adopt the border-box box model, as in CSS. Namely, the cell width should include padding, not exclude it.

Re: maxOutputWidth. I'd rename to simply maxWidth, as in CSS.

Re: alignment. This one is trickier. E.g., I may want to align columns differently, as I would in a spreadsheet. So my initial inclination is that alignment can either be a string (apply to all columns) or an array of strings, in which an alignment must be provided for each column.

One could argue that the same logic applies to cellPadding and maxCellWidth.

kgryte avatar Jun 22 '24 05:06 kgryte

For the row separator default, wouldn't an empty string make more sense?

kgryte avatar Jun 22 '24 05:06 kgryte

For the row separator default, wouldn't an empty string make more sense?

Without empty string:

┌───────┬──────┬───────┐
│  col1 │ col2 │  col3 │
├───────┼──────┼───────┤
│    45 │   33 │ hello │
│ 32.54 │ true │  null │
└───────┴──────┴───────┘

With empty string:

┌───────┬──────┬───────┐
│  col1 │ col2 │  col3 │
├───────┼──────┼───────┤
│    45 │   33 │ hello │
│       │      │       │
│ 32.54 │ true │  null │
└───────┴──────┴───────┘

Most of the prior art don't separate rows by default (dataframes in ipython or jupyter, or python tabulate), and I think it looks better too without the rows separated

Snehil-Shah avatar Jun 22 '24 17:06 Snehil-Shah

Re: methods. I'd opt for closer parity with sparklines

Should I also have an argument for bufferSize that denotes the max number of rows? in the table? After that "pushing" more data would remove the "oldest" data in a cyclic manner like we do with sparklines.

Re: alignment. This one is trickier. E.g., I may want to align columns differently, as I would in a spreadsheet. So my initial inclination is that alignment can either be a string (apply to all columns) or an array of strings, in which an alignment must be provided for each column.

So, say the class isn't initialized with the data or headers, then we should raise an error if the user provides an array of alignments? Because in general, the alignments array should be of the same length as the number of columns?

and I would add a headers accessor for getting/setting headers. You shouldn't have to provide data again in order to update headers. This should be a separate accessor to allow independent updating.

Should we allow them to give headers if the data doesn't exist yet? (or raise an error)

Snehil-Shah avatar Jun 22 '24 17:06 Snehil-Shah

Re: bufferSize. Yes, that can make sense for streaming contexts.

Re: alignments without data/headers. No, it just needs to be consistent. As soon as we're provided something that conveys the number of columns, from that point forward, the table has a fixed number of columns and everything just needs to be consistent.

Re: headers without data. Yes, that seems reasonable and is applicable in streaming contexts, where you know the headers beforehand and are still awaiting data to be pushed.

Re: row separator and empty string. By empty string, I did not mean create a row separator using the empty string. I meant using the empty string as a sentinel to convey that no row separator should be rendered.

kgryte avatar Jun 22 '24 18:06 kgryte

Re: alignments without data/headers. No, it just needs to be consistent. As soon as we're provided something that conveys the number of columns, from that point forward, the table has a fixed number of columns and everything just needs to be consistent

Just to make sure I get you correctly, if there is no data or headers yet, and we set the alignment array for 5 columns. If the user now sets the data that depicts 9 columns, we raise an error, right?

Re: row separator and empty string. By empty string, I did not mean create a row separator using the empty string. I meant using the empty string as a sentinel to convey that no row separator should be rendered.

Ah, understood, yes that makes sense.

Snehil-Shah avatar Jun 22 '24 18:06 Snehil-Shah

If the user now sets the data that depicts 9 columns, we raise an error, right?

Correct. Can raise with something like "invalid argument. Expected %d columns, but received data having %d columns.".

kgryte avatar Jun 22 '24 18:06 kgryte

@kgryte Currently we have a padding prop, that adds padding to the left and right of the datum in the cell. Do we need paddingL and paddingR as separate props?

Snehil-Shah avatar Jul 17 '24 08:07 Snehil-Shah

@Snehil-Shah That seems more general, so yes (?).

kgryte avatar Jul 17 '24 08:07 kgryte

I'd also make it paddingLeft and paddingRight, or whatever the convention is that we used in @stdlib/plot/ctor.

kgryte avatar Jul 17 '24 08:07 kgryte

The current prop name is cellPadding, just so it's not confused with marginX or marginY which don't apply to cells but the entire table.

Snehil-Shah avatar Jul 17 '24 09:07 Snehil-Shah

Should we allow setting data and headers to null as a way to clear the data?

Snehil-Shah avatar Jul 20 '24 12:07 Snehil-Shah

  • I noticed sparklines doesn't have the index.d.ts file for type declarations. Is there a reason for that and how should I proceed with this package?

  • I am confused as to how to implement a cli script for this, given data is a nested array and many options are also arrays..?

Snehil-Shah avatar Jul 20 '24 20:07 Snehil-Shah

I noticed sparklines doesn't have the index.d.ts file for type declarations. Is there a reason for that and how should I proceed with this package?

That is an omission. No particular reason. I suggest holding off on declarations until we've finalized review and the API in order to minimize busy work involved in having to make changes in several places.

CLI

Ditto. The CLI can always be added in a follow-up PR.

kgryte avatar Jul 20 '24 20:07 kgryte

Should we allow setting data and headers to null as a way to clear the data?

Don't know enough to say. What is the use case you are envisioning?

kgryte avatar Jul 20 '24 20:07 kgryte

What is the use case you are envisioning?

Extending the case for streaming contexts, where the user might want to reset the data to start the next batch or something..

Snehil-Shah avatar Jul 20 '24 20:07 Snehil-Shah

Should we allow setting data and headers to null as a way to clear the data?

I don't think this is necessary at this point. If you want to reset, you can just instantiate a new table.

kgryte avatar Aug 01 '24 22:08 kgryte

/stdlib update-copyright-years

kgryte avatar Feb 11 '25 09:02 kgryte

@Snehil-Shah Okay, I think I am more or less okay with the current state of things. Tests are not yet updated; however, I don't think we should invest time in updating those until we are okay with the implementation. Would you mind reviewing the current implementation and associated docs?

kgryte avatar Feb 13 '25 09:02 kgryte

Why do we need the margins, borders, corners, joints, etc, props along with their individual props for bottom, top, right, left, etc?

It is like CSS, where you have the option of using shorthand notation for brevity or for returning an array of characters for subsequent manipulation, as compared to the more tedious var margins = [ table.marginTop, table.marginRight, ... ]. And you have the option of specifying / accessing individual properties when you only want to adjust or access a single property, as compared to the less obvious and more bug prone var marginLeft = table.margins[ 3 ] (which one is marginLeft again?). I think you want to allow ergonomic patterns for both use cases.

kgryte avatar Feb 17 '25 21:02 kgryte