Docx: copy/paste adds extra paragraphs when pasting from Google Docs
Description
If I add line breaks in a Google Doc, the deserializer adds an extra paragraph for each break.
What it looks like in Google Docs (one paragraph between each line):

What it looks like pasted into Plate editor (two paragraphs between each line):

This is the HTML returned from e.clipboardData.getData('text/html'):
<meta charset='utf-8'><meta charset="utf-8"><b style="font-weight:normal;" id="docs-internal-guid-0753e24d-7fff-d209-84cc-3361f30177bf"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt;"><span style="font-size:11pt;font-family:Arial;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Hello world</span></p><br /><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt;"><span style="font-size:11pt;font-family:Arial;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Hello World</span></p><br /><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt;"><span style="font-size:11pt;font-family:Arial;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Hello World</span></p><br /><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt;"><span style="font-size:11pt;font-family:Arial;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">Hello World</span></p></b><br class="Apple-interchange-newline">
This is the Slate state after pasting:
[{"type":"p","children":[{"text":"Hello world","fontSize":"11pt","backgroundColor":"transparent","color":"rgb(0, 0, 0)"}],"id":1649266297118},{"type":"p","children":[{"text":"\n"}],"id":1649266297118},{"type":"p","children":[{"text":"Hello World","fontSize":"11pt","backgroundColor":"transparent","color":"rgb(0, 0, 0)"}],"id":1649266297118},{"type":"p","children":[{"text":"\n"}],"id":1649266297118},{"type":"p","children":[{"text":"Hello World","fontSize":"11pt","backgroundColor":"transparent","color":"rgb(0, 0, 0)"}],"id":1649266297119},{"type":"p","children":[{"text":"\n"}],"id":1649266297119},{"type":"p","children":[{"text":"Hello World","fontSize":"11pt","backgroundColor":"transparent","color":"rgb(0, 0, 0)"}],"id":1649266297119},{"type":"p","children":[{"text":"\n"}],"id":1649266297119}]
Steps
- Create some content in a Google Doc with line breaks.
- Copy/paste it into a Plate editor with all deserializer plugins enabled.
Sandbox
https://codesandbox.io/s/plate-playground-v1-2mh1c
Expectation
Line breaks should be interpreted by the deserializer as just that: a line break.
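To illustrate the expectation, here is a minimal sketch of a post-processing step over the pasted fragment. It assumes the extra height comes from the paragraphs whose only text is "\n" in the Slate state above, and rewrites them as empty paragraphs. The node types and the `normalizeBreakParagraphs` helper are hypothetical names for illustration, not part of Plate or Slate.

```typescript
// Simplified node shapes for illustration; real Slate/Plate types are richer.
interface TextNode {
  text: string;
  [mark: string]: unknown; // e.g. fontSize, color, backgroundColor
}
interface ElementNode {
  type: string;
  children: SlateNode[];
  id?: number;
}
type SlateNode = TextNode | ElementNode;

// Type guard: a node is a text leaf when it carries a string `text` field.
function isText(node: SlateNode): node is TextNode {
  return typeof (node as TextNode).text === "string";
}

// True when a paragraph's only child is a text leaf equal to "\n".
function isNewlineOnlyParagraph(node: ElementNode): boolean {
  if (node.type !== "p" || node.children.length !== 1) return false;
  const only = node.children[0];
  return only !== undefined && isText(only) && only.text === "\n";
}

// Hypothetical helper: replace "\n"-only paragraphs with empty paragraphs,
// leaving every other node untouched. The input array is not mutated.
function normalizeBreakParagraphs(nodes: ElementNode[]): ElementNode[] {
  return nodes.map((node): ElementNode =>
    isNewlineOnlyParagraph(node) ? { ...node, children: [{ text: "" }] } : node
  );
}

// Usage with a shortened version of the pasted state.
const pasted: ElementNode[] = [
  { type: "p", children: [{ text: "Hello world", fontSize: "11pt" }] },
  { type: "p", children: [{ text: "\n" }] },
  { type: "p", children: [{ text: "Hello World", fontSize: "11pt" }] },
];
console.log(JSON.stringify(normalizeBreakParagraphs(pasted), null, 2));
```

Whether an empty paragraph or a soft break inside the previous paragraph is the better target is a separate design question; this only shows the shape of the data before and after.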
Environment
- slate: 0.75.0
- slate-react: 0.75.0
- browser: Chrome
Interesting, I hadn't thought of using the docx deserializer for pasting from Google Docs; I use the HTML deserializer instead.
The docx deserializer is tightly coupled to Word, so we'll probably need a similar strategy for Google Docs.
Oh, I just assumed it was the docx deserializer. I'm using all the deserializers together, as in the sandbox.
I find Google Docs mostly works well when pasting with the HTML deserializer. How you handle line breaks does matter a bit for both import and export, and I need to spend some time on preserving what was in the original document. We could extend the HTML deserializer for Google Docs if there's a significant number of modifications to make.
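As a rough sketch of what a Google Docs specific step could look like, the snippet below detects Google Docs clipboard HTML by the `docs-internal-guid-` id seen in the pasted HTML above and strips the trailing `<br class="Apple-interchange-newline">` before the HTML reaches a deserializer. The function names are hypothetical and this is plain string pre-processing, not an existing Plate API; how it would be wired into a plugin is left open.

```typescript
// The wrapper element in the clipboard HTML from this issue carries an id
// starting with "docs-internal-guid-"; this sketch uses it as the marker.
function isGoogleDocsHtml(html: string): boolean {
  return /id=["']docs-internal-guid-/.test(html);
}

// Remove any <br class="Apple-interchange-newline"> element, which in the
// sample above sits after the content and ends up as a trailing paragraph.
function stripInterchangeNewline(html: string): string {
  return html.replace(
    /<br\s+class=["']Apple-interchange-newline["']\s*\/?>/gi,
    ""
  );
}

// Hypothetical entry point: only touch HTML that looks like Google Docs
// output, and pass everything else through unchanged.
function preprocessGoogleDocsHtml(html: string): string {
  return isGoogleDocsHtml(html) ? stripInterchangeNewline(html) : html;
}

// Usage with a shortened version of the clipboard HTML (guid abbreviated).
const clipboardHtml =
  '<b style="font-weight:normal;" id="docs-internal-guid-0753e24d">' +
  "<p>Hello world</p><br /><p>Hello World</p></b>" +
  '<br class="Apple-interchange-newline">';
console.log(preprocessGoogleDocsHtml(clipboardHtml));
```

The `<br />` elements between paragraphs are left in place here, since how those should map to Slate nodes is exactly the open question in this thread.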