feat: add `plot/table/unicode`
Towards #2067
Description
What is the purpose of this pull request?
This pull request:
- adds
@stdlib/plot/table/unicode
Table options:
-
alignment: datum's cell alignment. Default:
'right'. -
borders: border characters. Default:
'─ │ ─ │'. -
cellPadding: cell padding. Default:
1. -
columnSeparator: column separator character. Default:
'│'. -
corners: corner characters. Default:
'┌ ┐ ┘ └'. -
headerSeparator: header separator character. Default:
'─'. -
joints: joint characters. Default:
'┼ ┬ ┤ ┴ ├'. -
marginX: horizontal output margin. Default:
0. -
marginY: vertical output margin. Default:
0. -
maxCellWidth: maximum cell width (excluding padding). Default:
FLOAT64_MAX. -
maxOutputWidth: maximum output width (including margin). Default:
FLOAT64_MAX. -
rowSeparator: row separator character. Default:
'None'.
Table methods:
- addRow(row): adds a row to data.
- getData(): gets current data and headers in an object
- render(): renders table
- setData(data,[headers]): sets data
Related Issues
Does this pull request have any related issues?
This pull request:
- subtask of #2067
Questions
Any questions for reviewers of this pull request?
The current implementation allows for loose parsing. So, if the data is not exactly tabular, in some cases, it still tries to parse, adjust and make sense of the data.
These are the only two cases where this can be seen:
data = {
'col1': [ ... ],
'col2': [ ... ]
};
headers = [ 'col1', 'col2', 'col3' ];
Parser will automatically ignore 'col3' and value of this._headers after parsing will be [ 'col1', 'col2' ].
data = [ { 'col1': 1, 'col2': 4 }, { 'col1' : 5 } ];
headers = [ 'col1', 'col2' ];
Parser will automatically fill the missing col2 value with undefined.
Should we be raising an error instead?
Other
Any other information relevant to this pull request? This may include screenshots, references, and/or implementation notes.
No.
Checklist
Please ensure the following tasks are completed before submitting this pull request.
- [x] Read, understood, and followed the contributing guidelines.
@stdlib-js/reviewers
Re: loose parsing. I'd just raise an error. If the number of headers is off, that is probably a user bug.
When an object is missing a field, that is trickier, as could mean missing data. However, again, I'd raise an error here. A user should arguably be explicitly in terms of what value should represent missing data (e.g., undefined, null, NaN, '', etc).
Re: methods. I'd opt for closer parity with sparklines. Namely,
addRow => push()
getData/setData => data (accessor)
and I would add a headers accessor for getting/setting headers. You shouldn't have to provide data again in order to update headers. This should be a separate accessor to allow independent updating.
And similar to sparklines, I'd make the table an event emitter with render and change events.
Re: maxCellWidth. I wonder if it would be better to adopt the border-box box model, as in CSS. Namely, the cell width should include padding, not exclude it.
Re: maxOutputWidth. I'd rename to simply maxWidth, as in CSS.
Re: alignment. This one is trickier. E.g., I may want to align columns differently, as I would in a spreadsheet. So my initial inclination is that alignment can either be a string (apply to all columns) or an array of strings, in which an alignment must be provided for each column.
One could argue that the same logic applies to cellPadding and maxCellWidth.
For the row separator default, wouldn't an empty string make more sense?
For the row separator default, wouldn't an empty string make more sense?
Without empty string:
┌───────┬──────┬───────┐
│ col1 │ col2 │ col3 │
├───────┼──────┼───────┤
│ 45 │ 33 │ hello │
│ 32.54 │ true │ null │
└───────┴──────┴───────┘
With empty string:
┌───────┬──────┬───────┐
│ col1 │ col2 │ col3 │
├───────┼──────┼───────┤
│ 45 │ 33 │ hello │
│ │ │ │
│ 32.54 │ true │ null │
└───────┴──────┴───────┘
Most of the prior art don't separate rows by default (dataframes in ipython or jupyter, or python tabulate), and I think it looks better too without the rows separated
Re: methods. I'd opt for closer parity with sparklines
Should I also have an argument for bufferSize that denotes the max number of rows? in the table? After that "pushing" more data would remove the "oldest" data in a cyclic manner like we do with sparklines.
Re: alignment. This one is trickier. E.g., I may want to align columns differently, as I would in a spreadsheet. So my initial inclination is that alignment can either be a string (apply to all columns) or an array of strings, in which an alignment must be provided for each column.
So, say the class isn't initialized with the data or headers, then we should raise an error if the user provides an array of alignments? Because in general, the alignments array should be of the same length as the number of columns?
and I would add a headers accessor for getting/setting headers. You shouldn't have to provide data again in order to update headers. This should be a separate accessor to allow independent updating.
Should we allow them to give headers if the data doesn't exist yet? (or raise an error)
Re: bufferSize. Yes, that can make sense for streaming contexts.
Re: alignments without data/headers. No, it just needs to be consistent. As soon as we're provided something that conveys the number of columns, from that point forward, the table has a fixed number of columns and everything just needs to be consistent.
Re: headers without data. Yes, that seems reasonable and is applicable in streaming contexts, where you know the headers beforehand and are still awaiting data to be pushed.
Re: row separator and empty string. By empty string, I did not mean create a row separator using the empty string. I meant using the empty string as a sentinel to convey that no row separator should be rendered.
Re: alignments without data/headers. No, it just needs to be consistent. As soon as we're provided something that conveys the number of columns, from that point forward, the table has a fixed number of columns and everything just needs to be consistent
Just to make sure I get you correctly, if there is no data or headers yet, and we set the alignment array for 5 columns. If the user now sets the data that depicts 9 columns, we raise an error, right?
Re: row separator and empty string. By empty string, I did not mean create a row separator using the empty string. I meant using the empty string as a sentinel to convey that no row separator should be rendered.
Ah, understood, yes that makes sense.
If the user now sets the data that depicts 9 columns, we raise an error, right?
Correct. Can raise with something like "invalid argument. Expected %d columns, but received data having %d columns.".
@kgryte Currently we have a padding prop, that adds padding to the left and right of the datum in the cell. Do we need paddingL and paddingR as separate props?
@Snehil-Shah That seems more general, so yes (?).
I'd also make it paddingLeft and paddingRight, or whatever the convention is that we used in @stdlib/plot/ctor.
The current prop name is cellPadding, just so it's not confused with marginX or marginY which don't apply to cells but the entire table.
Should we allow setting data and headers to null as a way to clear the data?
-
I noticed sparklines doesn't have the
index.d.tsfile for type declarations. Is there a reason for that and how should I proceed with this package? -
I am confused as to how to implement a
cliscript for this, givendatais a nested array and many options are also arrays..?
I noticed sparklines doesn't have the index.d.ts file for type declarations. Is there a reason for that and how should I proceed with this package?
That is an omission. No particular reason. I suggest holding off on declarations until we've finalized review and the API in order to minimize busy work involved in having to make changes in several places.
CLI
Ditto. The CLI can always be added in a follow-up PR.
Should we allow setting data and headers to null as a way to clear the data?
Don't know enough to say. What is the use case you are envisioning?
What is the use case you are envisioning?
Extending the case for streaming contexts, where the user might want to reset the data to start the next batch or something..
Should we allow setting data and headers to null as a way to clear the data?
I don't think this is necessary at this point. If you want to reset, you can just instantiate a new table.
/stdlib update-copyright-years
@Snehil-Shah Okay, I think I am more or less okay with the current state of things. Tests are not yet updated; however, I don't think we should invest time in updating those until we are okay with the implementation. Would you mind reviewing the current implementation and associated docs?
Why do we need the margins, borders, corners, joints, etc, props along with their individual props for bottom, top, right, left, etc?
It is like CSS, where you have the option of using shorthand notation for brevity or for returning an array of characters for subsequent manipulation, as compared to the more tedious var margins = [ table.marginTop, table.marginRight, ... ]. And you have the option of specifying / accessing individual properties when you only want to adjust or access a single property, as compared to the less obvious and more bug prone var marginLeft = table.margins[ 3 ] (which one is marginLeft again?). I think you want to allow ergonomic patterns for both use cases.