fix(search): clean markdown elements in search contents
Summary
Changes
- Process the content before storing it in IndexedDB, so we don't need to parse it every time. TODO for v5+: move this to an async job instead of occupying the main thread for too long, especially with large content. (It looks fine on our site for now.)
- Adapt to `marked` v13+ with a completely rewritten renderer.
- Remove all Markdown element styling.
- Also remove docsify's helper markers `?>` and `!>`.
- Copy the required functions instead of importing them, to reduce the package size (importing them blocks the minified build optimization because the result exceeds 500 KB).
- Hardcode `...` around matched content to mark truncation.
- Add test cases for the changes.
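
To illustrate the kind of cleanup the list above describes, here is a minimal regex-based sketch. It is not the `marked`-renderer approach this PR uses; the function name `stripMarkdownForSearch` and every pattern in it are assumptions made for illustration only.

```javascript
// Hypothetical helper (not part of docsify): a regex approximation of
// "remove Markdown styling before indexing". The PR itself relies on a
// marked renderer; this only sketches the idea.
function stripMarkdownForSearch(markdown) {
  return markdown
    // docsify helper markers at the start of a line: "?> tip", "!> warning"
    .replace(/^\s*[?!]>\s*/gm, '')
    // images: keep alt text and path so both remain searchable
    .replace(/!\[([^\]]*)\]\(([^)\s]*)[^)]*\)/g, '$1 $2')
    // links: keep only the link text
    .replace(/\[([^\]]+)\]\([^)]*\)/g, '$1')
    // heading and blockquote markers at the start of a line
    .replace(/^\s{0,3}(#{1,6}|>)\s+/gm, '')
    // paired emphasis, strong, strikethrough, and inline code markers
    .replace(/(\*\*|__|~~|\*|_|`)(.+?)\1/g, '$2')
    // collapse runs of whitespace into single spaces
    .replace(/\s+/g, ' ')
    .trim();
}

const cleaned = stripMarkdownForSearch(
  '## Title\n?> A **bold** tip with [a link](https://example.com) and ![logo](_media/icon.svg)'
);
console.log(cleaned);
```

A regex pass like this mishandles edge cases such as snake_case identifiers or nested emphasis, which is one reason a real renderer is the more robust choice.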
Snapshot (before -> after)
Related issue, if any:
What kind of change does this PR introduce?
Bugfix
For any code change,
- [x] Related documentation has been updated, if needed
- [x] Related tests have been added or updated, if needed
Does this PR introduce a breaking change?
Yes No
Tested in the following browsers:
- [x] Chrome
- [ ] Firefox
- [ ] Safari
- [ ] Edge
The latest updates on your projects.
| Name | Status | Updated (UTC) |
|---|---|---|
| docsify-preview | ✅ Ready | Sep 19, 2024 5:33am |
Thanks so much @Koooooo-7 for working on this! I've also tried the latest Preview in this thread, which I assume has this change included, and noticed two things:
- The matching text snippet does not have an ellipsis (...) at the start of the text when truncated, which can reduce the reader's understanding of the truncation going on. For example, the fifth result returned when searching for `class` displayed `erty 'classList' of null (#1527) (d6df2b8), closes...`
- The highlighting of the found text is no longer happening, which may be an understandable side effect of the merged v5 style updates etc.
I hope the above helps. Paul
Hi @jhildenbiddle @paulhibbitts, thanks for the points. I think the performance issue does need to be resolved. I will sync with @sy-records after the #2464 storage layer change is merged, to adapt to the new storage and refine performance.
This appears to be a result of storing search data unmodified, then doing a lot of text processing every time a search query is performed. Why not do the processing while retrieving the search data and store the result, so we only have to do basic text matches when search queries are performed?
Makes sense. I think we could store the formatted data in storage and apply only simple formatting to the search content at retrieval time, instead of formatting it every time.
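
The "format once, match cheaply" idea discussed here could be sketched roughly as follows. The `Map` stands in for the real storage layer, and `formatForSearch`, `saveSection`, and `findSections` are hypothetical names, not docsify APIs.

```javascript
// Stand-in for the real storage layer (assumption: any key-value store works).
const searchStore = new Map();

// Hypothetical formatter; a real one would also strip Markdown.
function formatForSearch(text) {
  return text.replace(/\s+/g, ' ').trim().toLowerCase();
}

// Index time: do the expensive formatting once and store the result.
function saveSection(slug, title, body) {
  searchStore.set(slug, {
    title,
    body,
    searchable: formatForSearch(`${title} ${body}`),
  });
}

// Query time: format only the short query, then do plain substring matches.
function findSections(query) {
  const needle = formatForSearch(query);
  if (!needle) return [];
  const hits = [];
  for (const [slug, entry] of searchStore) {
    if (entry.searchable.includes(needle)) hits.push(slug);
  }
  return hits;
}

saveSection('/guide', 'Quick  Start', 'Install\nthe CLI');
saveSection('/api', 'API', 'Options');
console.log(findSections('quick start'));
```

The trade-off is a larger stored payload (raw plus formatted text) in exchange for doing no heavy text processing per keystroke.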
@paulhibbitts
- The matching text snippet does not have an ellipsis (...) at the start of the text when truncated, which can reduce the reader's understanding of the truncation going on. For example, the fifth result returned when searching for `class` displayed `erty 'classList' of null (#1527) (d6df2b8), closes...`
- The highlighting of the found text is no longer happening, which may be an understandable side effect of the merged v5 style updates etc.
Thanks for the nice catch; I have noted the styling issue. 👌
- Empty ellipses (......) are being displayed when searching for items matching only a Header and no immediate content below. For example, search for "Headings" which is on the UI Kit page. If no content within ellipses perhaps do not display ellipses/content at all?
Nice catch! I wasn't aware that the search content could be empty. I will update it so that no `...` is displayed when the content is empty.
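
A rough sketch of the truncation behavior discussed in this exchange: return nothing for empty content, and add `...` only on the side that was actually cut. The PR hardcodes `...` around matches, so this is a variation for illustration; `buildExcerpt` and its `radius` parameter are hypothetical.

```javascript
// Hypothetical helper: returns a snippet around the first match,
// or '' when there is no content or no match (so no bare "......").
function buildExcerpt(content, keyword, radius = 20) {
  const text = content.trim();
  if (!text || !keyword) return '';
  const index = text.toLowerCase().indexOf(keyword.toLowerCase());
  if (index === -1) return '';
  const start = Math.max(0, index - radius);
  const end = Math.min(text.length, index + keyword.length + radius);
  // Mark truncation only on the side(s) where text was cut off.
  const prefix = start > 0 ? '...' : '';
  const suffix = end < text.length ? '...' : '';
  return prefix + text.slice(start, end) + suffix;
}

console.log(buildExcerpt('Cannot read property classList of null', 'class', 5));
console.log(JSON.stringify(buildExcerpt('', 'Headings')));
```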
---- Updated
- Should we include Markdown image paths/names? For example, search for "icon.svg"?
For now, I keep image paths and name/title metadata searchable, although they are not directly visible in the content.
Awesome @Koooooo-7 , looks good! Thank you very much 🙏🏼
There seems to be a problem with diacritics.
There seems to be a problem with diacritics.
I checked the previous behavior: it has differed from the v4 result since last year, and it highlights a plainly wrong result for `cafe`.
The current behavior in this PR looks more like a "patch" to correct the search content, but we still need to figure out when and why the search content changed.
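
For context on the diacritics case, one common way to make `cafe` match `café` is Unicode NFD normalization followed by removal of combining marks. The sketch below is a generic technique, not necessarily what docsify does; both function names are hypothetical.

```javascript
// Decompose accented characters (NFD), then drop the combining marks
// in the U+0300..U+036F block, so "café" becomes "cafe".
function stripDiacritics(text) {
  return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Hypothetical matcher: compares content and query with diacritics
// and letter case ignored on both sides.
function matchesIgnoringDiacritics(content, query) {
  const haystack = stripDiacritics(content).toLowerCase();
  const needle = stripDiacritics(query).toLowerCase();
  return haystack.includes(needle);
}

console.log(stripDiacritics('caf\u00e9'));
console.log(matchesIgnoringDiacritics('Un caf\u00e9 cr\u00e8me', 'cafe creme'));
```

If the stored content is already in decomposed form, the stripped text is shorter than the original, so match offsets computed on the stripped text cannot be reused directly for highlighting in the original.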
Update:
There is a potential issue with `postContent` and `handlePostContent`: after formatting, `handlePostContent` may be larger than `postContent` (e.g. `"` being replaced by a longer escaped form, up to 5 times the size), which puts the substring in the wrong place. I will fix it.
Because the content formatting function has changed, the behavior still differs from v4+.
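
The offset problem described in this update can be reproduced in isolation. The sketch below assumes the size increase comes from HTML escaping; the escaping rules, `escapeHtml`, and `excerptThenEscape` are illustrative assumptions and not the actual `postContent` / `handlePostContent` code. The idea is to compute offsets and slice on the raw text first, then escape only the slice.

```javascript
// Assumed escaping step: each special character grows into a longer entity.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical fix: find and slice in the raw text, escape afterwards,
// so the indexes always refer to the string they were computed on.
function excerptThenEscape(rawContent, keyword, radius) {
  const index = rawContent.indexOf(keyword);
  if (index === -1) return '';
  const start = Math.max(0, index - radius);
  const end = Math.min(rawContent.length, index + keyword.length + radius);
  return escapeHtml(rawContent.slice(start, end));
}

const raw = 'say "hi" to "search" now';
// The same keyword sits at different offsets before and after escaping.
console.log(raw.indexOf('search'), escapeHtml(raw).indexOf('search'));
console.log(excerptThenEscape(raw, 'search', 1));
```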
ping @sy-records