App icon indicating copy to clipboard operation
App copied to clipboard

HIGH: [Polish] [$500] Indented text does not appear correctly in the quotes

Open m-natarajan opened this issue 1 year ago • 69 comments

If you haven’t already, check out our contributing guidelines for onboarding and email [email protected] to request to join our Slack channel!


Version Number: 1.4.50-5 Reproducible in staging?: y Reproducible in production?: y If this was caught during regression testing, add the test name, ID and link from TestRail: Email or phone of affected tester (no customers): Logs: https://stackoverflow.com/c/expensify/questions/4856 Expensify/Expensify Issue URL: Issue reported by: @quinthar Slack conversation: https://expensify.slack.com/archives/C049HHMV9SM/p1710210821899179

Action Performed:

  1. open any report
  2. Send the message with indent
  • No indent - Indented 4
  1. Send the same with quotes markdown
  • No indent
    • Indented 4

Expected Result:

Should appear with indent inside the quote

Actual Result:

Appears without an indent

Workaround:

unknown

Platforms:

Which of our officially supported platforms is this issue occurring on?

  • [ ] Android: Native
  • [ ] Android: mWeb Chrome
  • [ ] iOS: Native
  • [ ] iOS: mWeb Safari
  • [x] MacOS: Chrome / Safari
  • [ ] MacOS: Desktop

Screenshots/Videos

Add any screenshot/video evidence

https://github.com/Expensify/App/assets/38435837/b2314eec-ea3c-416d-81a2-1e160dbefc36

View all open jobs on GitHub

Upwork Automation - Do Not Edit
  • Upwork Job URL: https://www.upwork.com/jobs/~01496b08c675a55851
  • Upwork Job ID: 1767894299145392128
  • Last Price Increase: 2024-03-13
  • Automatic offers:
    • s77rt | Contributor | 0
    • tienifr | Contributor | 0
Issue OwnerCurrent Issue Owner: @laurenreidexpensify

m-natarajan avatar Mar 12 '24 17:03 m-natarajan

Triggered auto assignment to @laurenreidexpensify (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details.

melvin-bot[bot] avatar Mar 12 '24 17:03 melvin-bot[bot]

Job added to Upwork: https://www.upwork.com/jobs/~01496b08c675a55851

melvin-bot[bot] avatar Mar 13 '24 12:03 melvin-bot[bot]

Triggered auto assignment to Contributor-plus team member for initial proposal review - @mananjadhav (External)

melvin-bot[bot] avatar Mar 13 '24 12:03 melvin-bot[bot]

Proposal

Please re-state the problem that we are trying to solve in this issue.

Indented text does not appear correctly in the quotes

What is the root cause of that problem?

In the expensify-common library we use the formatTextForQuote function to format the text for quotes processing.

This function incorrectly trims each line, removing necessary spaces for proper indentation within quotes.

What changes do you think we should make in order to solve the problem?

To effectively address the issue, we'll make the following adjustments:

  • Remove spaces only between consecutive quote markers (>) to keep indentation.
  • Clean up white space-only lines at the text block's start and end.

To do that we need to:

update the formatTextForQuote to trim lines correctly based on the new logic, ensuring that spaces are preserved for indentation and between quotes or headings, and remove empty lines from the edges properly.

Note: we should trim the first space after the quote in order to address this comment

 function formatTextForQuote(regex, textToCheck, replacement) {
    if (!textToCheck.match(regex)) {
        return textToCheck;
    }

    let processedLines = [];
    let buffer = []; // Temporarily hold empty lines
    const lines = textToCheck.split("\n");
    const rgx = new RegExp(`^(\\s*>|\\s*$)`);

    lines.forEach((line) => {
        let processedRow = line.charAt(4) === " " ? line.substr(5) : line.substr(4);
        processedRow = processedRow.match(rgx) ? processedRow.trim() : processedRow;

        if (processedRow === "") {
            // Hold empty lines in buffer until a non-empty line is encountered
            buffer.push("");
        } else {
            // Once a non-empty line is encountered, dump buffer (if not at the start) and reset it
            if (processedLines.length > 0) {
                processedLines = processedLines.concat(buffer);
            }
            buffer = []; // Clear buffer after dumping
            processedLines.push(processedRow);
        }
    });

    // The buffer is not dumped at the end to avoid adding trailing empty lines
    const textToFormat = processedLines.join("\n");
    return replacement(textToFormat);
}
Same solution using array reduce
 formatTextForQuote(regex, textToCheck, replacement) {
        if (!textToCheck.match(regex)) {
            return textToCheck;
        }

        const lines = textToCheck.split("\n");
        const rgx = new RegExp(`^(\\s*>|\\s*$)`);

        // Initialize the accumulator with an object containing both the processedLines array and the buffer array
        const { processedLines } = lines.reduce(
            (acc, line) => {
                let processedRow =
                    line.charAt(4) === " " ? line.substr(5) : line.substr(4);
                processedRow = processedRow.match(rgx)
                    ? processedRow.trim()
                    : processedRow;

                if (processedRow === "") {
                    // Hold empty lines in buffer
                    acc.buffer.push("");
                } else {
                    // Dump buffer if processedLines isn't empty, then reset buffer
                    if (acc.processedLines.length > 0) {
                        acc.processedLines = acc.processedLines.concat(
                            acc.buffer
                        );
                    }
                    acc.buffer = []; // Clear buffer
                    acc.processedLines.push(processedRow); // Push the current non-empty line
                }

                return acc; // Return the updated accumulator
            },
            { processedLines: [], buffer: [] } // Initial accumulator value
        );

        // The buffer is not dumped at the end to avoid adding trailing empty lines
        const textToFormat = processedLines.join("\n");
        return replacement(textToFormat);
    }

POC:

now only 3 tests are failing so we need to fix them as well image

Alternatively:

a better way would be to hold the indices of the start and the end of the non empty lines then we slice the array as follows:

Implmenation
    formatTextForQuote(regex, textToCheck, replacement) {
        if (textToCheck.match(regex)) {
            const lines = textToCheck.split("\n");

            let startIndex = 0;
            let endIndex = lines.length - 1;
            let isStartFound = false;
            const rgx = new RegExp(`^(\\s*>|\\s*$)`);

            const processedLines = lines.map((row, index) => {
                const processedRow =
                    row.charAt(4) === " " ? row.substr(5) : row.substr(4);
                const newRow = processedRow.match(rgx)
                    ? processedRow.trim()
                    : processedRow;

                if (!isStartFound && newRow !== "") {
                    startIndex = index;
                    isStartFound = true;
                }

                if (newRow !== "") {
                    endIndex = index;
                }

                return newRow;
            });

            const relevantLines = processedLines.slice(
                startIndex,
                endIndex + 1
            );
            const textToFormat = relevantLines.join("\n");

            return replacement(textToFormat);
        }
        return textToCheck;
    }

abzokhattab avatar Mar 13 '24 13:03 abzokhattab

Proposal

Please re-state the problem that we are trying to solve in this issue.

Appears without an indent

What is the root cause of that problem?

Here, we have a logic to trim the content in each line of the quote.

However, I don't think this behavior is desirable, it's leading to bugs like this, there aren't really any reason we have to trim it. Slack and Github also doesn't trim quote content so we shouldn't.

What changes do you think we should make in order to solve the problem?

  1. Remove the trim() and update here such that if the quote is followed by a space, then we'll remove that space.
let textToFormat = textToCheck.split('\n').map(row => {
    const quoteContent = row[4] === ' ' ? row.substr(5) : row.substr(4);
    return quoteContent;
}).join('\n');

Because it's common for people to format a quote like > Hello, we should not include the there in the indentation, the quote content should be Hello only, there should be indentation if there're more than 1 space between the quote and the content. So both > Hello and >Hello will result in the same quote content of Hello

  1. We should not trim the quote content, but we should trim the spaces between the quotes (like > > as in nested quote) so that nested quote will work correctly as per this feedback

For this we will add the below condition inside the map

    if (quoteContent.trim().startsWith('>')) {
        return quoteContent.trim();
    }

This means after removing the initial quote with substr, if we encounter another quote, we'd be safe to trim the spaces between those 2 quotes.

  1. Update the trim logic here so that it will only trim redundant empty lines, we cannot use trim because trim will eliminate both line breaks and spaces, it will trim the spaces of the indentation in the first line of the quote content as well, leading to this issue.

To do this, we can make use of trimEnd and also a very small Regex to remove redundant line breaks in the start of the text.

Then use it here

textToFormat = textToFormat.trimEnd().replace(/^\s*\n/, '');

The final formatTextForQuote method will look like following, there's almost no additional complexity compared to current method.

formatTextForQuote(regex, textToCheck, replacement) {
    if (textToCheck.match(regex)) {
        let textToFormat = textToCheck.split('\n').map(row => {
            const quoteContent = row[4] === ' ' ? row.substr(5) : row.substr(4);
            if (quoteContent.trim().startsWith('>')) {
                return quoteContent.trim();
            }
            return quoteContent;
        }).join('\n');

        textToFormat = textToFormat.trimEnd().replace(/^\s*\n/, '');
        return replacement(textToFormat);
    }
    return textToCheck;
}

This passes all test cases except 2, for which we need to update the test result because we now allow indentation in the quote content.

I think we should keep the indentation (number of spaces) as the user sees it in the input, if we want to force the indentation to multiple of 4 spaces, like suggested here, it's easy to do as well.

What alternative solutions did you explore? (Optional)

If we decide that > Hello should still result in an indentation of the space , we don't need row[4] === ' ' ? row.substr(5) : row.substr(4), instead can just use row.substr(4).

Besides, there might be some further changes needed to transform HTML back to markdown, with indentation.

tienifr avatar Mar 14 '24 10:03 tienifr

Actually i am not sure if this is a bug, seems like the current behaviour is the expected result in the tests written in this PR https://github.com/Expensify/expensify-common/pull/654/files

hayes102 avatar Mar 14 '24 12:03 hayes102

📣 @hayes102! 📣 Hey, it seems we don’t have your contributor details yet! You'll only have to do this once, and this is how we'll hire you on Upwork. Please follow these steps:

  1. Make sure you've read and understood the contributing guidelines.
  2. Get the email address used to login to your Expensify account. If you don't already have an Expensify account, create one here. If you have multiple accounts (e.g. one for testing), please use your main account email.
  3. Get the link to your Upwork profile. It's necessary because we only pay via Upwork. You can access it by logging in, and then clicking on your name. It'll look like this. If you don't already have an account, sign up for one here.
  4. Copy the format below and paste it in a comment on this issue. Replace the placeholder text with your actual details. Screen Shot 2022-11-16 at 4 42 54 PM Format:
Contributor details
Your Expensify account email: <REPLACE EMAIL HERE>
Upwork Profile Link: <REPLACE LINK HERE>

melvin-bot[bot] avatar Mar 14 '24 12:03 melvin-bot[bot]

  Actually i am not sure if this is a bug, seems like the current behaviour is the expected result in the tests written in this PR https://github.com/Expensify/expensify-common/pull/654/files

I says even in the comments.

nebiyuelias1 avatar Mar 14 '24 13:03 nebiyuelias1

Actually i am not sure if this is a bug, seems like the current behaviour is the expected result in the tests written in this PR https://github.com/Expensify/expensify-common/pull/654/files

I believe it was an oversight while developing that feature. Slack, Github and other major apps preserve the indentation and we should as well.

I think @quinthar would agree with this

tienifr avatar Mar 14 '24 15:03 tienifr

Proposal

Please re-state the problem that we are trying to solve in this issue

Indented text comment inside quotes is formatted without indentation once posted.

What is the root cause of that problem?

Issue https://github.com/Expensify/App/issues/17504 and expensify-common #521 PR introduced the formatTextForQuote function which uses trim() twice when formatting each quote line within the modifyTextForQuote function.

The issue with this is that regardless of the quote line containing indent (empty spaces) + text or just indentation without any text -> the trimming always removes the indentation.

What changes do you think we should make in order to solve the problem?

[!note] From the functionality implemented by the PR mentioned in RCA we can note that the formatTextForQuote function is using trim() twice:

  1. row.substr(4).trim() to remove any empty spaces after &gt; (quote symbol) is removed.
  2. textToFormat.trim() to remove leading / trailing line breaks ("\n") after map and join('\n') method.

This means that it is paramount for us to make sure we keep the trim() for empty quote lines while we remove trimming for quote lines which are not empty, containing any of:

  • indent + text (our issue)
  • indent + text + indent
  • text

To implement this we require changes in two parts of the logic:

  1. Within the modifyTextForQuote function where we check if the line contains quote here -> we change the logic to:
if (Str.startsWith(textSplitted[i], '&gt;') && !insideCodefence) {
    // we remove "&gt;" (quote symbol) and trim
    const textContent = textSplitted[i].substring(4).trim();
    // if trimmed text content is empty return it, otherwise return original quote line
    const formattedLine = !textContent ? textContent : textSplitted[i];
    // append each quote line to textToFormat (similar to old code)
    textToFormat += `${formattedLine}\n`; 

    // textToFormat += `${textSplitted[i]}\n`; <- removed old code line
}

by doing this we keep trimming of the empty quote lines.

  1. Within the formatTextForQuote function, we only remove the first trim() instance. This ensures that for quote lines which are not empty (have text or indent + text) we keep the indentation.

[!caution] It is important that we keep the second trim() since that is not part of our issue's RCA and it was put there for a good reason: removal of leading / trailing line breaks ("\n") after map and join('\n') method.

Videos

MacOS: Chrome / Safari
Before After

ikevin127 avatar Mar 15 '24 02:03 ikevin127

Taking this as C+

s77rt avatar Mar 17 '24 16:03 s77rt

@abzokhattab Thanks for the proposal. Your RCA is not correct. Disabling the white spaces rules is not something we want here and the way it's fixing this bug is only a side effect.

s77rt avatar Mar 17 '24 17:03 s77rt

@tienifr Thanks for the proposal. Your RCA is correct. However the solution seems to have issues with nested quotes (and other failing tests) e.g.

> > > Hello my
> > beautiful
> world
Expected Received
Screenshot 2024-03-17 at 6 42 51 PM Screenshot 2024-03-17 at 6 42 58 PM

s77rt avatar Mar 17 '24 17:03 s77rt

@ikevin127 Thanks for the proposal. Your RCA is about right, though the linked PR is not the offending PR (the trim functionality was already there). I didn't go deep into analyzing your solution but it seems to have the same issue as in @tienifr's solution - bug outlined in the above comment ^

s77rt avatar Mar 17 '24 17:03 s77rt

Proposal

Updated

abzokhattab avatar Mar 18 '24 03:03 abzokhattab

Proposal updated to address @s77rt 's concern

tienifr avatar Mar 18 '24 15:03 tienifr

@mananjadhav, @laurenreidexpensify Huh... This is 4 days overdue. Who can take care of this?

melvin-bot[bot] avatar Mar 18 '24 19:03 melvin-bot[bot]

@abzokhattab Thanks for the update. The RCA is correct. Regarding the solution, the failing tests should be addressed (not all of them need to be updated - some are actually bugs) e.g. sending > hello should not have a leading space

Expected: Screenshot 2024-03-18 at 6 58 25 PM Screenshot 2024-03-18 at 7 00 17 PM

Received: Screenshot 2024-03-18 at 6 58 16 PM Screenshot 2024-03-18 at 7 00 29 PM

s77rt avatar Mar 18 '24 21:03 s77rt

@tienifr Thanks for the update. I found this edge case failing, not sure if it's worth addressing

>  # h1
> b
> c

Regardless the solution seems a little complicated (adding 3 more regex), can't we just remove the trim function and investigate why the other tests are failing?

s77rt avatar Mar 18 '24 21:03 s77rt

Just a note here: I included the heading1 in the trimming regex because on GitHub, you can have up to 4 spaces before the # and it will still be rendered as a heading, with the space being trimmed. However, if you have 5 spaces or more, it will not be rendered as a heading.

Example:

>    # hello

is rendered as:

hello

and

>     # hello

is rendered as:

 # hello

abzokhattab avatar Mar 19 '24 03:03 abzokhattab

@tienifr Thanks for the update. I found this edge case failing, not sure if it's worth addressing

@s77rt I think it not be worth since the if the user set the heading "normally" (without redundant indentation before it), it still works fine, also in some other apps like Github, the # will also not show as heading if there are many spaces before it, so the rule regarding this isn't exactly universal. But if we feel we want to address it, we can do so in the PR.

Regardless the solution seems a little complicated (adding 3 more regex), can't we just remove the trim function and investigate why the other tests are failing?

It looks like it adds more Regex but actually the Regexes are every simple to understand, we can combine them to 2 Regex (or replace Regex with normal JS but that won't be as clean), and if we look at the final composition of formatTextForQuote after the fix, it looks very clean and understandable as well.

tienifr avatar Mar 19 '24 04:03 tienifr

Proposal

Please re-state the problem that we are trying to solve in this issue.

Indented text in a quote is not showing correctly

What is the root cause of that problem?

  1. We trim leading and trailing spaces on every line in the quote. here
  2. We remove quote block start and end spaces here, while what we want to do is only remove leading and trailing line breaks.

What changes do you think we should make in order to solve the problem?

Change line

            let textToFormat = textToCheck.split('\n').map(row => row.substr(4).trim()).join('\n');

to:

            let textToFormat = textToCheck.split('\n').map(row => row.slice(row[4] == " " ? 5 : 4).replace(/\s*(&gt;|#)/, '$1').trimEnd()).join('\n');

and change line

            textToFormat = textToFormat.trim();

to

            textToFormat = textToFormat.replace(/^\n+/,'').replace(/\n+$/, '');

What alternative solutions did you explore? (Optional)

Change line

            let textToFormat = textToCheck.split('\n').map(row => row.substr(4).trim()).join('\n');

to:

            let textToFormat = textToCheck.split('\n').map(row => row.slice(4).replace(/\s*(?=&gt;|#)/, '').trimEnd()).join('\n');

and change line

            textToFormat = textToFormat.trim();

to

            textToFormat = textToFormat.replace(/^\n+|\n+$/g, '');

badeggg avatar Mar 19 '24 07:03 badeggg

A little explanation(details) about my proposal:

  1. row => row.substr(4).trim() causing leading spaces and trailing spaces of all lines in a quote be removed, which is not actually what we need, since causing the issue
  2. I believe what we really need is to trim all trailing spaces while remove as least leading spaces as possible to make the number of remaining leading spaces is multiple of 4 (let's say we want 4 spaces as an indentation, it can be 2 of course) e.g.
    14 leading spaces --> 12 leading spaces,
    13 leading spaces --> 12 leading spaces,
    2 leading spaces --> 0 leading spaces,
    5 leading spaces --> 4 leading spaces,
            row => {
                let ret = row.substr(4);  // get rid of &gt; not changed
                let trimed = ret.trimStart();   // to check how many leading spaces
                const num = (ret.length - trimed.length) % 4;  // calculate how many leading spaces should we remove
                ret = ret.slice(num).trimEnd() // remove `count number` leading spaces and all trailing spaces
                return ret;
            }

this code block is just a implementation of point 2 4. substr method is deprecated based on mdn doc, so I recommend change it to slice

badeggg avatar Mar 19 '24 07:03 badeggg

@abzokhattab Thanks for the update. Unfortunately the solution is too complex (more regex and more loops).

s77rt avatar Mar 19 '24 21:03 s77rt

@tienifr I would still prefer to avoid any additional regex. The trim was added as a feature (unwanted feature it appears). Removing that functionality should not increase the complexity.

s77rt avatar Mar 19 '24 21:03 s77rt

@badeggg Thanks for the proposal. Your RCA is correct. The solution looks good to me 👍. We don't have any specific indent size, we can just limit the leading trim to 1 char at most (substr 5 at most).

let textToFormat = textToCheck.split('\n').map(row => row.substr(row[4] == " " ? 5 : 4).trimEnd()).join('\n');

🎀 👀 🎀 C+ reviewed Link to proposal

s77rt avatar Mar 19 '24 21:03 s77rt

Triggered auto assignment to @Julesssss, see https://stackoverflow.com/c/expensify/questions/7972 for more details.

melvin-bot[bot] avatar Mar 19 '24 21:03 melvin-bot[bot]

@Julesssss Please assign me as C+ here as well

s77rt avatar Mar 19 '24 21:03 s77rt

📣 @s77rt 🎉 An offer has been automatically sent to your Upwork account for the Contributor role 🎉 Thanks for contributing to the Expensify app!

Offer link Upwork job Please accept the offer and leave a comment on the Github issue letting us know when we can expect a PR to be ready for review 🧑‍💻 Keep in mind: Code of Conduct | Contributing 📖

melvin-bot[bot] avatar Mar 20 '24 10:03 melvin-bot[bot]