Provide an API that would allow 3rd party libraries to intercept `<dataframe>.print()` in order to generate HTML output

Open DonJayamanne opened this issue 4 years ago • 6 comments

First, thanks a lot for this awesome package. I dabble a little in python and use pandas, and this library just makes me feel at home when using tensorflow.js in node. Awesome.

Is your feature request related to a problem?

I've built an extension for VS Code that provides rich HTML output when using danfo.js in node.js. Also supports danfo.js plots in node.js. See here https://github.com/DonJayamanne/typescript-notebook/wiki/Danfo.js#1-pretty-html-tables-for-data-frames

Currently this is made possible by intercepting the print method. It's basically a hack.

Describe the solution you'd like

I'd like the HTML representation of dataframes & series to be generated by danfo.js instead. To make this possible I'm proposing something like a reportWriter. The end user will not have to change their code, but danfo.js will write to the custom stream instead of doing a console.log(ASCII table).

Describe alternatives you've considered

Right now, I am intercepting the print method of the dataframe & series classes. This works, but I'd prefer a better, officially supported API. I think the same API could then be used in the danfo notebooks as well, i.e. danfo notebooks would internally call this new API when converting data frames/series into HTML.

Here's the current implementation https://github.com/DonJayamanne/typescript-notebook/blob/main/src/extension/server/extensions/danfoFormatter.ts
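For readers unfamiliar with the technique, the interception described above can be sketched roughly as follows. This is not the code from the linked formatter; `FakeDataFrame`, `interceptPrint`, and `toHtmlTable` are hypothetical names used only for illustration, and the stand-in class mimics just the `print`/`toString` relationship.

```javascript
// Stand-in for a danfo.js DataFrame; NOT the real class, just enough to demo the patch.
class FakeDataFrame {
  constructor(rows) {
    this.rows = rows;
  }
  toString() {
    return this.rows.map((r) => r.join(" | ")).join("\n");
  }
  print() {
    console.log(this + "");
  }
}

// Hypothetical helper: swap `print` on a prototype and return an undo function.
function interceptPrint(klass, handler) {
  const original = klass.prototype.print;
  klass.prototype.print = function () {
    handler(this); // hand the instance to the host instead of logging it
  };
  return () => {
    klass.prototype.print = original;
  };
}

// Minimal HTML rendering, standing in for whatever the notebook host needs.
function toHtmlTable(frame) {
  const body = frame.rows
    .map((r) => `<tr>${r.map((c) => `<td>${c}</td>`).join("")}</tr>`)
    .join("");
  return `<table>${body}</table>`;
}

const captured = [];
const restore = interceptPrint(FakeDataFrame, (frame) => captured.push(toHtmlTable(frame)));

new FakeDataFrame([[1, 2], [3, 4]]).print(); // captured as HTML, nothing is logged
restore(); // put the original print back
```

The patch works, but it depends on the library keeping `print` as a patchable prototype method, which is why an officially supported hook would be more robust.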

Additional context

Here's a sample of what it looks like in the VS Code notebook output

DonJayamanne avatar Sep 03 '21 05:09 DonJayamanne

@DonJayamanne Awesome work with typescript-notebook. I started trying it out, but could not get it to work on my vscode though. I'll raise an issue with explanations in the repo.

Regarding getting DataFrames/Series as HTML output, it's currently part of the work we are doing in #235 (toHtml), so if it's something you would like to work on, please let me know, so I can provide more explanation.

In terms of printing, if I understand correctly, you want the user to be able to specify the format of the output, right? If so, then as soon as the toHtml function is added, we can just use it in the call to print as well.

risenW avatar Sep 03 '21 13:09 risenW

but could not get it to work on my vscode though. I'll raise an issue with explanations in the repo.

Oh no, please do let me know in that repo.

so if it's something you would like to work on, please let me know, so I can provide more explanation.

Sure thing, I can work on that.

In terms of printing, if I understand correctly, you want the user to be able to specify the format of the output, right?

Actually, I'd like the user to keep the same code and just call print; depending on the environment, it would produce HTML or console output. I.e. if the user is in the Node REPL, it will use console.log; in the browser we can use console.log or console.table; and in danfonotebooks the print function can internally make use of toHtml. Basically, I'd like the API to be universal for the user: they just use print. But that's my wish; not sure you'd agree with that.

DonJayamanne avatar Sep 03 '21 14:09 DonJayamanne

In that case, this is the print function you have to update. The current print function uses the extended toString method of the class. If you look at the Series and DataFrame class, we extended the toString property here and here respectively.

So I'm assuming your print modification will be done here and will look something like:

/**
 * Pretty prints a DataFrame or Series in the console,
 * or writes HTML output when running in a danfo notebook.
 */
print() {
    if (!danfoNotebookEnv) {
        console.log(this + "");
    } else {
        writeHtmlOutput(this.toHtml());
    }
}

One question, how do you know if you're in the danfonotebook environment? 🤔

risenW avatar Sep 03 '21 14:09 risenW

and will look something like:

Perfect, I was hoping for something like this (users keep using print). Will work on a PR. However, I noticed you have some major refactoring going on for TypeScript. What branch should I base my work off?

how do you know if you're in the danfonotebook environment

One way to do this is via some global variable or the like. Another is for danfonotebook to call some method in the danfo module, e.g. require('danfo...').setxxxx(yyy). To be clear, my extension doesn't support danfonotebooks.
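A rough sketch of the "call some method in the danfo module" idea, under stated assumptions: `setOutputMode` is a placeholder name for the elided setter above, `printFrame` is a hypothetical free function standing in for `print`, and `toHtml()` is assumed to exist as proposed in #235. None of these are existing danfo.js exports.

```javascript
// Hypothetical module-level switch. `setOutputMode` is a placeholder name,
// not an existing danfo.js export.
let outputMode = "console";

function setOutputMode(mode) {
  if (mode !== "console" && mode !== "html") {
    throw new Error(`Unknown output mode: ${mode}`);
  }
  outputMode = mode;
}

// What `print` could do once the host has (or has not) flipped the switch.
// `frame.toHtml()` is assumed to exist, as proposed for #235.
function printFrame(frame, writeHtml) {
  if (outputMode === "html") {
    writeHtml(frame.toHtml());
  } else {
    console.log(String(frame));
  }
}

// Stand-in object, not a real DataFrame.
const frame = {
  toString: () => "a b\n1 2",
  toHtml: () => "<table><tr><td>1</td><td>2</td></tr></table>",
};

const written = [];
setOutputMode("html"); // a notebook host would call this once at startup
printFrame(frame, (html) => written.push(html));
```

With this shape, user code stays the same in every environment; only the host decides where output goes.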

DonJayamanne avatar Sep 03 '21 15:09 DonJayamanne

Perfect, I was hoping for something like this (users keep using print). Will work on a PR. However, I noticed you have some major refactoring going on for TypeScript. What branch should I base my work off?

Since conversion is almost done, you can branch off from danfo/typescript for now. Although we are still missing a couple of functions for DataFrame, Series is ready, so you can test.

One way to do this is via some global variable or the like. Another is for danfonotebook to call some method in the danfo module, e.g. require('danfo...').setxxxx(yyy). To be clear, my extension doesn't support danfonotebooks.

The easiest way might just be to set an optional parameter in the config object on DataFrame/Series creation. That way we can do something like:

print() {
    if (!this.config.printOutputAsHtml) {
        console.log(this + "");
    } else {
        writeHtmlOutput(this.toHtml());
    }
}

WDYT?

  • You can add a config parameter here: https://github.com/opensource9ja/danfojs/blob/5d0647e450c138981bd8ad49a6a98b9ef682f721/danfojs-browser/src/shared/config.ts#L27
  • Here's an example of creating a DataFrame with a config option: https://github.com/opensource9ja/danfojs/blob/5d0647e450c138981bd8ad49a6a98b9ef682f721/danfojs-browser/tests/frame.test.js#L47

risenW avatar Sep 03 '21 15:09 risenW

Thanks. FYI, I'll be using something like this:

// In the consuming library (e.g. the notebook extension):
const { registerPrinter } = require('danfojs-node');
registerPrinter((dataFrame: DataFrame) => console.log(dataFrame.toString()));

// In danfo.js:
print() {
    if (!this.config.printOutputAsHtml) {
        console.log(this + "");
    } else {
        writeOutput(this);
    }
}

Basically, remove the hardcoding that always sends the output as HTML to the other end.
This way other libraries can do anything they want when `print` is called: they get the object passed to their callback. I could then use `toHtml` or convert to JSON and display the result in several formats (VS Code Notebooks support the concept of renderers, so users choose the output view they want, whether JSON, HTML, or something else).
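One possible shape for such a registry, as a minimal sketch only: `registerPrinter` is the name proposed in this thread and not an existing danfo.js export, while `FakeSeries` and the disposer return value are assumptions added for illustration.

```javascript
// Sketch of a printer registry. `registerPrinter` is the name proposed in this
// thread; it is not an existing danfo.js export.
const defaultPrinter = (obj) => console.log(String(obj));
let activePrinter = defaultPrinter;

function registerPrinter(printer) {
  if (typeof printer !== "function") {
    throw new TypeError("printer must be a function");
  }
  activePrinter = printer;
  // Assumption: returning a disposer lets a host restore the default later.
  return () => {
    activePrinter = defaultPrinter;
  };
}

// Stand-in for a Series; NOT the real danfo.js class.
class FakeSeries {
  constructor(values) {
    this.values = values;
  }
  toString() {
    return this.values.join("\n");
  }
  print() {
    activePrinter(this); // the registered callback receives the object itself
  }
}

// A host that prefers JSON over HTML simply registers a different callback.
const seen = [];
const dispose = registerPrinter((obj) => seen.push(JSON.stringify(obj.values)));
new FakeSeries([1, 2, 3]).print();
dispose();
```

Because the callback receives the object rather than pre-rendered HTML, the host stays free to pick HTML, JSON, or any other representation.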

Thanks, I'll submit a PR soon.

DonJayamanne avatar Sep 07 '21 03:09 DonJayamanne