Search results are quite unusual

Open xyeLz opened this issue 3 years ago • 5 comments

I've essentially been replicating my project located at https://ccsp.alukos.com to a test subdomain at https://retypetest.pages.dev. I'm finding the search results in Retype to be quite confusing compared to what I might expect. For example, in many cases, I would suspect any topics being searched for that are major topics (such as higher in the anchor hierarchy) to appear first; however, this is not the case.

Additionally, I don't understand why sometimes results appear and then disappear. For example, if I search for "data dispersi" (looking for data dispersion), the first result is from a data dispersion-related bullet point contained within an anchor on a page. If I continue typing and input "data dispersio", any results relating to data dispersion disappear altogether and now all of a sudden, the only results I see are related to "data." Regardless, the first result I would expect to see (and is the first result I see when searching my GitBook project) is the actual page dedicated to data dispersion with a top-level anchor titled # Data Dispersion. Instead, I'm seeing random content first.

For example, many of my pages contain "quick references" that might provide insight into what an acronym stands for. For example, a table might appear on several pages that tells you that "IRM" is "Information Rights Management." The problem with this in Retype currently is that if I search "IRM", it doesn't seem to have any way to prioritize what the most important IRM is. Instead, I see a whole bunch of random pages showing me the acronym when, in reality, I'm looking for the IRM/DRM page I created. I attached some screenshots if this helps to visualize my issue.

[Screenshots: gitbookirm, retyype-irm, retype-irm, gitbook-dd, retype-dds, retype-dd]

xyeLz avatar Aug 10 '22 03:08 xyeLz

Hi @xyeLz. Thanks for the excellent report and samples. It is strange that "data dispersi" returns a result, but "data dispersio" does not, and then "data dispersion" returns a result again. I'm not entirely sure what's going on in there, so we are going to have to investigate.

Search indexing can sometimes produce strange results and requires regular adjustments. It appears we need to work on the indexer a little more to bring these results in line with what would be expected.

I'll keep this thread updated with our progress. Once the search is improved, there will be no changes required on your side, other than upgrading to the latest release of Retype.

geoffreymcgill avatar Aug 10 '22 04:08 geoffreymcgill

Of course, and you're welcome! Happy to help. Right now, the only thing making Retype imperfect for me is the search results. My users (I suspect, since I know I do) rely heavily on the search feature to find specific terminology. To be fair, I had issues with GitBook indexing as well. Ultimately, I ended up opening a ticket with them and redesigning all of my pages around their indexing. I suspect maybe I will need to do the same here? For example, I love the idea of using collapsed panels on my glossary page as a sort of "flashcard-style" study aid. It seems these panels can be linked to as anchors, which is awesome, but for some reason they don't even appear in the search results:

[Screenshot: missing]

Oddly, however, searching the text contained within the panel actually yields results:

[Screenshot: missing2]

But as you can see in the above screenshot, the "top-level" of that particular search is the top-level anchor in the page ("A") rather than the panel "header" ("Affinity").

Would definitely love some sort of fix or workaround suggestion for this before I go live! The only thing I can think to do currently is to divert from using panels and use legitimate headers (such as #### which I am currently doing on GitBook). Let me know what you think!

xyeLz avatar Aug 16 '22 14:08 xyeLz

Including the Panel titles in the search index should not be a problem. I am hoping to include this functionality within the next release of Retype, which will be v3.0. At the moment, we do not have a scheduled release date for v3.0. Still lots to do, but steady progress is being made.

geoffreymcgill avatar Aug 16 '22 19:08 geoffreymcgill

Hey Geoffrey! Just an FYI, the weird indexing behavior is a separate issue from including panel titles in search (which would be more of a feature, in my opinion). The only reason I mention this is because I saw this issue also raised: https://github.com/retypeapp/retype/issues/363 and thought you may want to include that one in the v3.0 milestone instead of this one, unless of course fixing search in general is part of the v3.0 milestone. :)

Also, I realize there is no scheduled release date for v3.0 yet, but is there any possible indication as to when this might release? I only ask since my site design currently relies heavily on panels, and my users rely heavily on searching, so obviously these two don't go together very well! I'm planning to delay the public release of the site until this is fixed so any sort of indication would be helpful, even if you said Q1 2023. Thanks!

xyeLz avatar Aug 18 '22 17:08 xyeLz

As of Retype v3.0, Panel titles are being indexed.

Work is still required to improve the actual search indexing logic, so I will leave this Issue open.

geoffreymcgill avatar Jul 07 '23 23:07 geoffreymcgill