google-docs-utils

Google Docs will now use canvas based rendering: this may impact some Chrome extensions

Open RobertJGabriel opened this issue 4 years ago • 148 comments

Just raising this as an issue; I'm willing to help with any development needed to get the library ready.

Here is the canvas based example https://docs.google.com/document/d/1N1XaAI4ZlCUHNWJBXJUBFjxSTlsD5XctCz6LB3Calcg/preview

@menicosia @ken107 @bboydflo @Amaimersion @JensPLarsen

RobertJGabriel avatar May 13 '21 18:05 RobertJGabriel

Thank you. It doesn't look like there is any workaround. We'll have to build an actual Google Workspace add-on.

ken107 avatar May 13 '21 19:05 ken107

Maybe it is only for /preview, not for /edit? I mean, see URL.

For /preview it makes sense: it is only a preview, and a visitor shouldn't be able to change its content, even through HTML. It is hard to change content in an external canvas.

For me, at the moment, the /edit page uses the HTML editor, not the canvas editor.

But yes, if canvas rendering replaces HTML rendering, then it will be a problem.

srgykuz avatar May 13 '21 21:05 srgykuz

How did you create this preview? Can you provide the steps?

srgykuz avatar May 13 '21 21:05 srgykuz

The original post from Google can be found here: https://workspaceupdates.googleblog.com/2021/05/Google-Docs-Canvas-Based-Rendering-Update.html

Do they support any kind of accessibility API with the new design?

JensPLarsen avatar May 17 '21 09:05 JensPLarsen

A discussion can be found here: https://news.ycombinator.com/item?id=27129858

JensPLarsen avatar May 17 '21 13:05 JensPLarsen

Google have updated their post and opened a small possibility.

If you open "Accessibility Settings" --> "Turn on Screen reader support", Google Docs will emit readable HTML with the actual text. The only problem is that this means a complete rewrite of the core Google Docs Util code, because the new HTML structure is different.

If possible the Google Docs Util code should:

  • detect if "Screen reader support" is turned on
  • have an option to turn "Screen reader support" on from code
  • use the new HTML structure

JensPLarsen avatar May 27 '21 12:05 JensPLarsen

Thank you @JensPLarsen

Do they support any kind of accessibility API with the new design?

I suppose not. For external JS that didn't create the canvas, it is very hard to interact with the canvas's 2D context (I mean CRUD operations on the canvas content). For example, the Yandex.Disk Word editor uses canvas-based rendering, and I wasn't able to interact with the document content in any way.

will emit Readable HTML with the actual text.

The problem here is that this only provides the ability to read document content. But this library needs all CRUD operations in order to provide the functionality it already implements. Sure, I will check whether it is possible to interact with the document through that "small possibility", but it is highly unlikely that it will provide everything needed to support this project.

srgykuz avatar May 28 '21 09:05 srgykuz

So, I think this project will die when Google Docs releases the canvas-based rendering feature. Unfortunately, at the moment it doesn't look like there is anything that can be done about it.

srgykuz avatar May 28 '21 09:05 srgykuz

The problem here is that this only provides the ability to read document content. But this library needs all CRUD operations in order to provide the functionality it already implements. Sure, I will check whether it is possible to interact with the document through that "small possibility", but it is highly unlikely that it will provide everything needed to support this project.

So, I think this project will die when Google Docs releases the canvas-based rendering feature. Unfortunately, at the moment it doesn't look like there is anything that can be done about it.

I agree, if anything it would most likely result in a new project which contains a subset of what this can.

And I fear a new project may have the same issue when Google Docs changes to use WebAssembly (or something else) and everything changes again in X years.

JensPLarsen avatar May 28 '21 11:05 JensPLarsen

Are there any alternatives to this library that work with the canvas-based rendering, or are there plans to update the library?

hudson-dev avatar Oct 31 '21 04:10 hudson-dev

Are there any alternatives to this library?

No, not as far as I'm aware.

Are there plans to update the library?

Not at the moment.

srgykuz avatar Oct 31 '21 06:10 srgykuz

Darn that sucks.

hudson-dev avatar Oct 31 '21 23:10 hudson-dev

How is Grammarly doing it then? https://chrome.google.com/webstore/detail/grammarly-for-chrome/kbfnbcaeplbcioakkpcpgfkobkghlhen?hl=en

hudson-dev avatar Oct 31 '21 23:10 hudson-dev

Google provided temporary support for such extensions. If the extension needs to interact with a document through DOM, then the extension can force Google Docs to use HTML-based rendering instead of canvas-based rendering.

It is controlled via the _docs_force_html_by_ext variable:

Screenshot from 2021-11-01 10-40-44

In that case Google Docs will use HTML instead of canvas.

_docs_force_html_by_ext is undefined:

Screenshot from 2021-11-01 10-36-51

_docs_force_html_by_ext is set:

Screenshot from 2021-11-01 10-38-12

However, only whitelisted extensions can use _docs_force_html_by_ext. Most likely the Google Docs team will contact extension developers to notify them about this feature (as they did with me).

But anyway, this feature is just a temporary workaround to give developers some time to adapt their extensions. This feature will be disabled soon, maybe in 2021, so it is not reliable.

After that we will see which extensions are able to interact with Google Docs through canvas.

srgykuz avatar Nov 01 '21 07:11 srgykuz

Following up on my answer above: if you want to use this library, you should install an extension that enables HTML-based rendering instead of canvas-based rendering, such as Grammarly or Smart Copy.

srgykuz avatar Nov 01 '21 07:11 srgykuz

I see, I'll try and contact Google to get whitelisted, although having to install a second extension just to use mine wouldn't be very practical for users.

hudson-dev avatar Nov 01 '21 18:11 hudson-dev

You don't actually need to be whitelisted or install any other extensions. You can force html rendering by adding ?mode=html to the query parameters.
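As a rough illustration of the query-parameter approach, here is a minimal sketch that adds mode=html to a document URL. The helper name forceHtmlMode is made up for this example, and it only helps for as long as Google Docs honors the parameter.

```javascript
// Sketch only: forceHtmlMode is a hypothetical helper, not part of any library.
// It returns a copy of the given Google Docs URL with mode=html in the query string.
function forceHtmlMode(docUrl) {
  const url = new URL(docUrl);
  // set() replaces an existing "mode" value instead of appending a duplicate.
  url.searchParams.set("mode", "html");
  return url.toString();
}

console.log(forceHtmlMode("https://docs.google.com/document/d/abc/edit"));
```

Existing query parameters and the URL fragment are preserved, so a link such as .../edit?usp=sharing#heading=h.1 keeps both.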

Omegastick avatar Feb 08 '22 22:02 Omegastick

Confirmed :+1: However, Google clearly states that the HTML fallback option has been deprecated and will gradually be removed from production.

srgykuz avatar Feb 09 '22 11:02 srgykuz

Thanks! @Amaimersion Does Google mention any specific date? Where do they mention it will be removed from production?

gzomer avatar Feb 18 '22 18:02 gzomer

They announce it by email. Emails are sent to those who are subscribed at https://sites.google.com/corp/google.com/docs-canvas-migration/home

They are planning to remove it completely by the end of February.

srgykuz avatar Feb 18 '22 18:02 srgykuz

They are planning to remove it completely by the end of February.

Sad times.

Omegastick avatar Feb 19 '22 10:02 Omegastick

Sad indeed : /

gzomer avatar Feb 19 '22 16:02 gzomer

Wait a minute: @Amaimersion Do you know how Grammarly made it work on canvas?

They are not using the whitelist anymore: if you inspect the DOM while Grammarly is enabled, you can see that it works on canvas. I have tried forcing ?mode=html and it works there too. This means Grammarly has somehow managed to read the text from the canvas. Now the question is, how?

Grammarly using Canvas

grammarly-canvas

Grammarly using DOM

grammarly-dom

gzomer avatar Feb 19 '22 16:02 gzomer

I have just downloaded the source code of the Grammarly extension and found some interesting stuff there. For instance, there is a getText function: https://gist.github.com/gzomer/2b809174ce380fced61040005a9a9576#file-grammarly-gdocscanvasinjectedcs-js-L1060

They have a file named Grammarly-gDocsInjectedCs.js which seems to be for the DOM version. But now they have a new file named Grammarly-gDocsCanvasInjectedCs.js (see link above).

I used this extension to get the Grammarly source code: https://chrome.google.com/webstore/detail/chrome-extension-source-v/jifpbeccnghkjeaalbbjmodiffmgedin?hl=en

gzomer avatar Feb 19 '22 17:02 gzomer

@gzomer this is interesting. Did you happen to get it to work outside of the Grammarly extension?

RobertJGabriel avatar Feb 20 '22 15:02 RobertJGabriel

That code is a bit complicated, but if we can put a breakpoint in that getText function it will become clear.

ken107 avatar Feb 20 '22 16:02 ken107

I was able to partially get the full text. On the onRender function here you can just call n.getText({}) and it will return the full text. You can also get a full document structure by inspecting the variable o.

However, there is one downside: I couldn't make it work without the Grammarly extension enabled. There is some sort of connection between Docs and Grammarly, const t = document.documentElement.dataset.grGdcConnId || (document.documentElement.dataset.grGdcConnId, which seems to be set up in another file, but I couldn't understand how it works.

So it seems to be possible, we just need to figure out how.

I have pushed the whole source code here: https://github.com/gzomer/grammarly-extension

So far the ones that seem to be relevant are:

  • https://github.com/gzomer/grammarly-extension/blob/main/src/js/Grammarly-gDocsEarlyInjectedCs.js
  • https://github.com/gzomer/grammarly-extension/blob/main/src/js/Grammarly-gDocsCanvasInjectedCs.js
  • https://github.com/gzomer/grammarly-extension/blob/main/src/js/Grammarly-gDocs.js

gzomer avatar Feb 20 '22 19:02 gzomer

@Omegastick I used Chrome DevTools to put a breakpoint in the content script function ce(e). That function recursively searches the properties of e to look for the document's text. The question then is where e comes from.

It turns out e is the global variable window.KX_kixApp. If you open Google Docs and press F12, then type into the console window.KX_kixApp you will see that variable.

That variable isn't accessible from the content script's context. I'm not sure how they are able to access it from their content script. The only way I know to do something like that is to add a script tag that executes in the page's JavaScript context, JSON.stringify the variable there, and pass it to the content script via postMessage. But maybe they're doing it some other way.

Edit: ah, got it. The bulk of their scripts executes in the page's JS context. The content script is Grammarly-gDocsEarlyInjector.js, which creates the script tag to inject their scripts into the page's context. I'll see if I can make a proof of concept.
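The script-tag technique described above can be sketched roughly as follows. This is not Grammarly's code: readPageGlobal and MESSAGE_TYPE are hypothetical names, the win and doc parameters exist only so the function can be exercised outside a browser, and the sketch assumes the page allows inline scripts (a Content Security Policy may block them) and that the value survives JSON.stringify (which throws on circular structures).

```javascript
// Hypothetical tag used to recognise our own messages among all postMessage traffic.
const MESSAGE_TYPE = "gdocs-page-global";

// Content-script side. Injects a <script> whose source runs in the page's own
// JS context (where page globals are visible) and receives the serialized value
// back through window.postMessage.
function readPageGlobal(name, onValue, win = window, doc = document) {
  function onMessage(event) {
    // Ignore messages from other windows and messages that are not ours.
    if (event.source !== win || !event.data || event.data.type !== MESSAGE_TYPE) return;
    win.removeEventListener("message", onMessage);
    const json = event.data.json;
    // JSON.stringify(undefined) yields undefined, so a missing global arrives as undefined.
    onValue(json === undefined ? undefined : JSON.parse(json));
  }
  // Register the listener before injecting so the reply cannot be missed.
  win.addEventListener("message", onMessage);

  const script = doc.createElement("script");
  // This source text executes in the page context, not in the content script.
  script.textContent =
    "window.postMessage({ type: " + JSON.stringify(MESSAGE_TYPE) +
    ", json: JSON.stringify(window[" + JSON.stringify(name) + "]) }, '*');";
  doc.documentElement.appendChild(script);
  script.remove(); // the script has already run; the element is no longer needed
}
```

In a content script this would be called as readPageGlobal("KX_kixApp", (value) => { ... }). For a large or cyclic object it is probably more practical to run the search inside the page context and post back only the result.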

Edit: and the statement on line 943 is how they search for the text. le(n, ((e,t)=>t && "\x03" === t.toString().charAt(0)), 5) means: look, up to 5 levels deep, for properties whose string value begins with that special control character.

ken107 avatar Feb 22 '22 08:02 ken107

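This function is self-contained, apart from the reference to J, which is defined elsewhere in Grammarly's script and is only reached when the search encounters a DOM element: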

function le(e, t, n, o=Object.getOwnPropertyNames(e)) {
        const r = new Set
          , i = [];
        let s = 0;
        const a = (o,l,c,u=0)=>{
            if (s++,
            "prototype" === o || l instanceof Window)
                return;
            if (u > n)
                return;
            const d = [...c, o];
            try {
                if (t(o, l))
                    return void i.push({
                        path: d,
                        value: l
                    })
            } catch (e) {}
            var g;
            if (null != l && !r.has(l))
                if (r.add(l),
                Array.isArray(l))
                    l.forEach(((e,t)=>{
                        try {
                            a(t.toString(), e, d, u + 1)
                        } catch (e) {}
                    }
                    ));
                else if (l instanceof Object) {
                    ((g = l) && null !== g && 1 === g.nodeType && "string" == typeof g.nodeName ? Object.getOwnPropertyNames(e).filter((e=>!J.has(e))) : Object.getOwnPropertyNames(l)).forEach((e=>{
                        try {
                            a(e, l[e], d, u + 1)
                        } catch (e) {}
                    }
                    ))
                }
        }
        ;
        return o.forEach((t=>{
            try {
                a(t, e[t], [])
            } catch (e) {}
        }
        )),
        {
            results: i,
            iterations: s
        }
    }

Calling it like this will return the text of the document:

le(window.KX_kixApp, ((e,t)=>t && "\x03" === t.toString().charAt(0)), 5)

Edit: and a de-obfuscated version https://gist.github.com/ken107/2b40c87fcdf27171a5a5fdc489639300
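For readers who only want the idea, here is a simplified sketch of the same depth-limited property search, run against a mock object. It is not Grammarly's code and it drops the DOM-node special case. findProperties, mockApp and startsWithEtx are hypothetical names, and the mock's shape is invented because the real structure of window.KX_kixApp is not documented.

```javascript
// Walk an object graph up to maxDepth levels and collect every property for
// which predicate(key, value) is truthy. Simplified from the minified le() above.
function findProperties(root, predicate, maxDepth) {
  const seen = new Set([root]); // guards against cycles
  const results = [];

  function visit(key, value, path, depth) {
    if (depth > maxDepth) return;
    const here = [...path, key];
    try {
      if (predicate(key, value)) {
        results.push({ path: here, value });
        return; // do not descend into a match
      }
    } catch (_) { /* the predicate may throw on exotic values; treat as no match */ }
    if (value === null || typeof value !== "object" || seen.has(value)) return;
    seen.add(value);
    for (const childKey of Object.getOwnPropertyNames(value)) {
      let child;
      try { child = value[childKey]; } catch (_) { continue; } // getters may throw
      visit(childKey, child, here, depth + 1);
    }
  }

  for (const key of Object.getOwnPropertyNames(root)) {
    let child;
    try { child = root[key]; } catch (_) { continue; }
    visit(key, child, [], 0);
  }
  return results;
}

// Mock standing in for window.KX_kixApp (assumed shape, for illustration only).
const mockApp = { a: { b: { text: "\x03Hello world" } }, title: "not the text" };
// Match string values that start with the \x03 control character.
const startsWithEtx = (key, value) =>
  typeof value === "string" && value.charAt(0) === "\x03";

console.log(findProperties(mockApp, startsWithEtx, 5).map((r) => r.path.join(".")));
// prints [ 'a.b.text' ]
```

On a real page the call would have to run in the page's JS context, since window.KX_kixApp is not visible from a content script.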

ken107 avatar Feb 22 '22 21:02 ken107