HIGH: [Polish] [$500] Indented text does not appear correctly in the quotes
If you haven’t already, check out our contributing guidelines for onboarding and email [email protected] to request to join our Slack channel!
Version Number: 1.4.50-5 Reproducible in staging?: y Reproducible in production?: y If this was caught during regression testing, add the test name, ID and link from TestRail: Email or phone of affected tester (no customers): Logs: https://stackoverflow.com/c/expensify/questions/4856 Expensify/Expensify Issue URL: Issue reported by: @quinthar Slack conversation: https://expensify.slack.com/archives/C049HHMV9SM/p1710210821899179
Action Performed:
- open any report
- Send the message with indent
- No indent - Indented 4
- Send the same with quotes markdown
- No indent
- Indented 4
Expected Result:
Should appear with indent inside the quote
Actual Result:
Appears without an indent
Workaround:
unknown
Platforms:
Which of our officially supported platforms is this issue occurring on?
- [ ] Android: Native
- [ ] Android: mWeb Chrome
- [ ] iOS: Native
- [ ] iOS: mWeb Safari
- [x] MacOS: Chrome / Safari
- [ ] MacOS: Desktop
Screenshots/Videos
Add any screenshot/video evidence
https://github.com/Expensify/App/assets/38435837/b2314eec-ea3c-416d-81a2-1e160dbefc36
Upwork Automation - Do Not Edit
- Upwork Job URL: https://www.upwork.com/jobs/~01496b08c675a55851
- Upwork Job ID: 1767894299145392128
- Last Price Increase: 2024-03-13
- Automatic offers:
- s77rt | Contributor | 0
- tienifr | Contributor | 0
Issue Owner
Current Issue Owner: @laurenreidexpensify
Triggered auto assignment to @laurenreidexpensify (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details.
Job added to Upwork: https://www.upwork.com/jobs/~01496b08c675a55851
Triggered auto assignment to Contributor-plus team member for initial proposal review - @mananjadhav (External)
Proposal
Please re-state the problem that we are trying to solve in this issue.
Indented text does not appear correctly in the quotes
What is the root cause of that problem?
In the expensify-common library we use the formatTextForQuote function to format the text for quotes processing.
This function incorrectly trims each line, removing necessary spaces for proper indentation within quotes.
What changes do you think we should make in order to solve the problem?
To effectively address the issue, we'll make the following adjustments:
- Remove spaces only between consecutive quote markers (>) to keep indentation.
- Clean up white space-only lines at the text block's start and end.
To do that we need to:
update the formatTextForQuote to trim lines correctly based on the new logic, ensuring that spaces are preserved for indentation and between quotes or headings, and remove empty lines from the edges properly.
Note: we should trim the first space after the quote in order to address this comment
function formatTextForQuote(regex, textToCheck, replacement) {
if (!textToCheck.match(regex)) {
return textToCheck;
}
let processedLines = [];
let buffer = []; // Temporarily hold empty lines
const lines = textToCheck.split("\n");
const rgx = new RegExp(`^(\\s*>|\\s*$)`);
lines.forEach((line) => {
let processedRow = line.charAt(4) === " " ? line.substr(5) : line.substr(4);
processedRow = processedRow.match(rgx) ? processedRow.trim() : processedRow;
if (processedRow === "") {
// Hold empty lines in buffer until a non-empty line is encountered
buffer.push("");
} else {
// Once a non-empty line is encountered, dump buffer (if not at the start) and reset it
if (processedLines.length > 0) {
processedLines = processedLines.concat(buffer);
}
buffer = []; // Clear buffer after dumping
processedLines.push(processedRow);
}
});
// The buffer is not dumped at the end to avoid adding trailing empty lines
const textToFormat = processedLines.join("\n");
return replacement(textToFormat);
}
Same solution using array reduce
formatTextForQuote(regex, textToCheck, replacement) {
if (!textToCheck.match(regex)) {
return textToCheck;
}
const lines = textToCheck.split("\n");
const rgx = new RegExp(`^(\\s*>|\\s*$)`);
// Initialize the accumulator with an object containing both the processedLines array and the buffer array
const { processedLines } = lines.reduce(
(acc, line) => {
let processedRow =
line.charAt(4) === " " ? line.substr(5) : line.substr(4);
processedRow = processedRow.match(rgx)
? processedRow.trim()
: processedRow;
if (processedRow === "") {
// Hold empty lines in buffer
acc.buffer.push("");
} else {
// Dump buffer if processedLines isn't empty, then reset buffer
if (acc.processedLines.length > 0) {
acc.processedLines = acc.processedLines.concat(
acc.buffer
);
}
acc.buffer = []; // Clear buffer
acc.processedLines.push(processedRow); // Push the current non-empty line
}
return acc; // Return the updated accumulator
},
{ processedLines: [], buffer: [] } // Initial accumulator value
);
// The buffer is not dumped at the end to avoid adding trailing empty lines
const textToFormat = processedLines.join("\n");
return replacement(textToFormat);
}
POC:
now only 3 tests are failing so we need to fix them as well
Alternatively:
a better way would be to hold the indices of the start and the end of the non empty lines then we slice the array as follows:
Implmenation
formatTextForQuote(regex, textToCheck, replacement) {
if (textToCheck.match(regex)) {
const lines = textToCheck.split("\n");
let startIndex = 0;
let endIndex = lines.length - 1;
let isStartFound = false;
const rgx = new RegExp(`^(\\s*>|\\s*$)`);
const processedLines = lines.map((row, index) => {
const processedRow =
row.charAt(4) === " " ? row.substr(5) : row.substr(4);
const newRow = processedRow.match(rgx)
? processedRow.trim()
: processedRow;
if (!isStartFound && newRow !== "") {
startIndex = index;
isStartFound = true;
}
if (newRow !== "") {
endIndex = index;
}
return newRow;
});
const relevantLines = processedLines.slice(
startIndex,
endIndex + 1
);
const textToFormat = relevantLines.join("\n");
return replacement(textToFormat);
}
return textToCheck;
}
Proposal
Please re-state the problem that we are trying to solve in this issue.
Appears without an indent
What is the root cause of that problem?
Here, we have a logic to trim the content in each line of the quote.
However, I don't think this behavior is desirable, it's leading to bugs like this, there aren't really any reason we have to trim it. Slack and Github also doesn't trim quote content so we shouldn't.
What changes do you think we should make in order to solve the problem?
- Remove the
trim()and update here such that if the quote is followed by a space, then we'll remove that space.
let textToFormat = textToCheck.split('\n').map(row => {
const quoteContent = row[4] === ' ' ? row.substr(5) : row.substr(4);
return quoteContent;
}).join('\n');
Because it's common for people to format a quote like > Hello, we should not include the there in the indentation, the quote content should be Hello only, there should be indentation if there're more than 1 space between the quote and the content.
So both > Hello and >Hello will result in the same quote content of Hello
- We should not trim the quote content, but we should trim the spaces between the quotes (like
> >as in nested quote) so that nested quote will work correctly as per this feedback
For this we will add the below condition inside the map
if (quoteContent.trim().startsWith('>')) {
return quoteContent.trim();
}
This means after removing the initial quote with substr, if we encounter another quote, we'd be safe to trim the spaces between those 2 quotes.
- Update the trim logic here so that it will only trim redundant empty lines, we cannot use
trimbecausetrimwill eliminate both line breaks and spaces, it will trim the spaces of the indentation in the first line of the quote content as well, leading to this issue.
To do this, we can make use of trimEnd and also a very small Regex to remove redundant line breaks in the start of the text.
Then use it here
textToFormat = textToFormat.trimEnd().replace(/^\s*\n/, '');
The final formatTextForQuote method will look like following, there's almost no additional complexity compared to current method.
formatTextForQuote(regex, textToCheck, replacement) {
if (textToCheck.match(regex)) {
let textToFormat = textToCheck.split('\n').map(row => {
const quoteContent = row[4] === ' ' ? row.substr(5) : row.substr(4);
if (quoteContent.trim().startsWith('>')) {
return quoteContent.trim();
}
return quoteContent;
}).join('\n');
textToFormat = textToFormat.trimEnd().replace(/^\s*\n/, '');
return replacement(textToFormat);
}
return textToCheck;
}
This passes all test cases except 2, for which we need to update the test result because we now allow indentation in the quote content.
I think we should keep the indentation (number of spaces) as the user sees it in the input, if we want to force the indentation to multiple of 4 spaces, like suggested here, it's easy to do as well.
What alternative solutions did you explore? (Optional)
If we decide that > Hello should still result in an indentation of the space , we don't need row[4] === ' ' ? row.substr(5) : row.substr(4), instead can just use row.substr(4).
Besides, there might be some further changes needed to transform HTML back to markdown, with indentation.
Actually i am not sure if this is a bug, seems like the current behaviour is the expected result in the tests written in this PR https://github.com/Expensify/expensify-common/pull/654/files
📣 @hayes102! 📣 Hey, it seems we don’t have your contributor details yet! You'll only have to do this once, and this is how we'll hire you on Upwork. Please follow these steps:
- Make sure you've read and understood the contributing guidelines.
- Get the email address used to login to your Expensify account. If you don't already have an Expensify account, create one here. If you have multiple accounts (e.g. one for testing), please use your main account email.
- Get the link to your Upwork profile. It's necessary because we only pay via Upwork. You can access it by logging in, and then clicking on your name. It'll look like this. If you don't already have an account, sign up for one here.
- Copy the format below and paste it in a comment on this issue. Replace the placeholder text with your actual details.
Format:
Contributor details
Your Expensify account email: <REPLACE EMAIL HERE>
Upwork Profile Link: <REPLACE LINK HERE>
Actually i am not sure if this is a bug, seems like the current behaviour is the expected result in the tests written in this PR https://github.com/Expensify/expensify-common/pull/654/files
I says even in the comments.
Actually i am not sure if this is a bug, seems like the current behaviour is the expected result in the tests written in this PR https://github.com/Expensify/expensify-common/pull/654/files
I believe it was an oversight while developing that feature. Slack, Github and other major apps preserve the indentation and we should as well.
I think @quinthar would agree with this
Proposal
Please re-state the problem that we are trying to solve in this issue
Indented text comment inside quotes is formatted without indentation once posted.
What is the root cause of that problem?
Issue https://github.com/Expensify/App/issues/17504 and expensify-common #521 PR introduced the formatTextForQuote function which uses trim() twice when formatting each quote line within the modifyTextForQuote function.
The issue with this is that regardless of the quote line containing indent (empty spaces) + text or just indentation without any text -> the trimming always removes the indentation.
What changes do you think we should make in order to solve the problem?
[!note] From the functionality implemented by the PR mentioned in RCA we can note that the
formatTextForQuotefunction is usingtrim()twice:
row.substr(4).trim()to remove any empty spaces after>(quote symbol) is removed.textToFormat.trim()to remove leading / trailing line breaks ("\n") after map andjoin('\n')method.This means that it is paramount for us to make sure we keep the
trim()for empty quote lines while we remove trimming for quote lines which are not empty, containing any of:
- indent + text (our issue)
- indent + text + indent
- text
To implement this we require changes in two parts of the logic:
- Within the
modifyTextForQuotefunction where we check if the line contains quote here -> we change the logic to:
if (Str.startsWith(textSplitted[i], '>') && !insideCodefence) {
// we remove ">" (quote symbol) and trim
const textContent = textSplitted[i].substring(4).trim();
// if trimmed text content is empty return it, otherwise return original quote line
const formattedLine = !textContent ? textContent : textSplitted[i];
// append each quote line to textToFormat (similar to old code)
textToFormat += `${formattedLine}\n`;
// textToFormat += `${textSplitted[i]}\n`; <- removed old code line
}
by doing this we keep trimming of the empty quote lines.
- Within the
formatTextForQuotefunction, we only remove the firsttrim()instance. This ensures that for quote lines which are not empty (have text or indent + text) we keep the indentation.
[!caution] It is important that we keep the second
trim()since that is not part of our issue's RCA and it was put there for a good reason: removal of leading / trailing line breaks ("\n") after map andjoin('\n')method.
Videos
MacOS: Chrome / Safari
| Before | After |
|---|---|
Taking this as C+
@abzokhattab Thanks for the proposal. Your RCA is not correct. Disabling the white spaces rules is not something we want here and the way it's fixing this bug is only a side effect.
@tienifr Thanks for the proposal. Your RCA is correct. However the solution seems to have issues with nested quotes (and other failing tests) e.g.
> > > Hello my
> > beautiful
> world
| Expected | Received |
|---|---|
@ikevin127 Thanks for the proposal. Your RCA is about right, though the linked PR is not the offending PR (the trim functionality was already there). I didn't go deep into analyzing your solution but it seems to have the same issue as in @tienifr's solution - bug outlined in the above comment ^
Proposal updated to address @s77rt 's concern
@mananjadhav, @laurenreidexpensify Huh... This is 4 days overdue. Who can take care of this?
@abzokhattab Thanks for the update. The RCA is correct. Regarding the solution, the failing tests should be addressed (not all of them need to be updated - some are actually bugs) e.g. sending > hello should not have a leading space
Expected:
Received:
@tienifr Thanks for the update. I found this edge case failing, not sure if it's worth addressing
> # h1
> b
> c
Regardless the solution seems a little complicated (adding 3 more regex), can't we just remove the trim function and investigate why the other tests are failing?
Just a note here: I included the heading1 in the trimming regex because on GitHub, you can have up to 4 spaces before the # and it will still be rendered as a heading, with the space being trimmed. However, if you have 5 spaces or more, it will not be rendered as a heading.
Example:
> # hello
is rendered as:
hello
and
> # hello
is rendered as:
# hello
@tienifr Thanks for the update. I found this edge case failing, not sure if it's worth addressing
@s77rt I think it not be worth since the if the user set the heading "normally" (without redundant indentation before it), it still works fine, also in some other apps like Github, the # will also not show as heading if there are many spaces before it, so the rule regarding this isn't exactly universal. But if we feel we want to address it, we can do so in the PR.
Regardless the solution seems a little complicated (adding 3 more regex), can't we just remove the trim function and investigate why the other tests are failing?
It looks like it adds more Regex but actually the Regexes are every simple to understand, we can combine them to 2 Regex (or replace Regex with normal JS but that won't be as clean), and if we look at the final composition of formatTextForQuote after the fix, it looks very clean and understandable as well.
Proposal
Please re-state the problem that we are trying to solve in this issue.
Indented text in a quote is not showing correctly
What is the root cause of that problem?
- We trim leading and trailing spaces on every line in the quote. here
- We remove quote block start and end spaces here, while what we want to do is only remove leading and trailing line breaks.
What changes do you think we should make in order to solve the problem?
Change line
let textToFormat = textToCheck.split('\n').map(row => row.substr(4).trim()).join('\n');
to:
let textToFormat = textToCheck.split('\n').map(row => row.slice(row[4] == " " ? 5 : 4).replace(/\s*(>|#)/, '$1').trimEnd()).join('\n');
and change line
textToFormat = textToFormat.trim();
to
textToFormat = textToFormat.replace(/^\n+/,'').replace(/\n+$/, '');
What alternative solutions did you explore? (Optional)
Change line
let textToFormat = textToCheck.split('\n').map(row => row.substr(4).trim()).join('\n');
to:
let textToFormat = textToCheck.split('\n').map(row => row.slice(4).replace(/\s*(?=>|#)/, '').trimEnd()).join('\n');
and change line
textToFormat = textToFormat.trim();
to
textToFormat = textToFormat.replace(/^\n+|\n+$/g, '');
A little explanation(details) about my proposal:
-
row => row.substr(4).trim()causing leading spaces and trailing spaces of all lines in a quote be removed, which is not actually what we need, since causing the issue - I believe what we really need is to trim all trailing spaces while remove as least leading spaces as possible to make the number of remaining leading spaces is multiple of 4 (let's say we want 4 spaces as an indentation, it can be 2 of course)
e.g.
14 leading spaces --> 12 leading spaces,13 leading spaces --> 12 leading spaces,2 leading spaces --> 0 leading spaces,5 leading spaces --> 4 leading spaces,
row => {
let ret = row.substr(4); // get rid of > not changed
let trimed = ret.trimStart(); // to check how many leading spaces
const num = (ret.length - trimed.length) % 4; // calculate how many leading spaces should we remove
ret = ret.slice(num).trimEnd() // remove `count number` leading spaces and all trailing spaces
return ret;
}
this code block is just a implementation of point 2
4. substr method is deprecated based on mdn doc, so I recommend change it to slice
@abzokhattab Thanks for the update. Unfortunately the solution is too complex (more regex and more loops).
@tienifr I would still prefer to avoid any additional regex. The trim was added as a feature (unwanted feature it appears). Removing that functionality should not increase the complexity.
@badeggg Thanks for the proposal. Your RCA is correct. The solution looks good to me 👍. We don't have any specific indent size, we can just limit the leading trim to 1 char at most (substr 5 at most).
let textToFormat = textToCheck.split('\n').map(row => row.substr(row[4] == " " ? 5 : 4).trimEnd()).join('\n');
🎀 👀 🎀 C+ reviewed Link to proposal
Triggered auto assignment to @Julesssss, see https://stackoverflow.com/c/expensify/questions/7972 for more details.
@Julesssss Please assign me as C+ here as well
📣 @s77rt 🎉 An offer has been automatically sent to your Upwork account for the Contributor role 🎉 Thanks for contributing to the Expensify app!
Offer link Upwork job Please accept the offer and leave a comment on the Github issue letting us know when we can expect a PR to be ready for review 🧑💻 Keep in mind: Code of Conduct | Contributing 📖