doc-gen icon indicating copy to clipboard operation
doc-gen copied to clipboard

Searching the documentation with filters

Open polibb opened this issue 4 years ago • 35 comments

Searching the mathlib library contents in the documentation and filtering of the resulting set of declarations. One is now able to:

  • search through the declaration names, the module names and the description texts that belong to any declarations of theorems, axioms, definitions, etc.
  • filter the search results before or after submitting the search query by choosing values for attributes and kind of the resulting declarations
  • selected attributes are combined with an OR. If an attribute in the list of attributes for any given declaration is found in the list submitted in the filters, then the declaration is legitimate.
  • selected kind options are combined with an OR.
  • if both any attributes are chosen and any kind options are selected, then the two sets of OR-connected values are combined with an AND. Hence, if a declaration fits the selection of one, but does not fit the possibilities allowed by the other filter, the declaration is considered illegitimate.

More information:

  1. Results are shown in prioritized sequence from most matches per declaration contents to least.
  2. Highest priority is given to the declaration name. Following is its descriptive text added as a comment in the mathlib library itself. Lastly, the module name the declaration is found in (file name) carries the least priority.
  3. Submitting the filters form at any point also re-submits the search query, if there is any.
  4. Maximum of 10 results is shown, but an option to expand and review all results is present in the UI.
  5. While the user is on any given page and searches for a term there for the first time the results will take awhile to load, but any subsequent search takes no extra time whatsoever. Thus, we recommend loading a search result once on any page and designating it for further searches, if needed, because it will save you time.
  6. Navigation with the keyboard is not possible at this point, please expect it in the near future.

polibb avatar May 14 '21 15:05 polibb

This is great! Thanks Polly!

Let's let GitHub do its thing and build it for all of us to try out:

#deploy

Or, maybe that needs to be in its own comment?

robertylewis avatar May 14 '21 15:05 robertylewis

#deploy

robertylewis avatar May 14 '21 15:05 robertylewis

Hmm, our deploy action is broken...

robertylewis avatar May 14 '21 15:05 robertylewis

It might not work for PRs opened from external forks? See e.g. https://github.com/leanprover-community/doc-gen/pull/115#issuecomment-779953415

bryangingechen avatar May 14 '21 16:05 bryangingechen

It might not work for PRs opened from external forks? See e.g. #115 (comment)

Aha, yes, that seems to be it. I think we can delete all of those checks, right? The deploy action is being run from the master branch of this repo, not from the PR, so it should have permission.

robertylewis avatar May 14 '21 16:05 robertylewis

#deploy

gebner avatar May 14 '21 17:05 gebner

This PR has been successfully deployed at http://leanprover-community.github.io/mathlib_docs_demo!

github-actions[bot] avatar May 14 '21 17:05 github-actions[bot]

Are there some good examples we should try in the demo? I tried applying only the "simp" filter and then searching "nat" but the search seems to be stuck on octopuses forever.

bryangingechen avatar May 14 '21 17:05 bryangingechen

Hmm, checking my browser console, I also see a message " TypeError: text is undefined" coming from "const indexedData = req.response;" in searchWorker.js line 16. I'm using Firefox 88.0.1 on macOS if that matters.

bryangingechen avatar May 14 '21 17:05 bryangingechen

Hmm, that search works for me on Chrome, so it might be browser specific?

robertylewis avatar May 14 '21 19:05 robertylewis

It also works for me in Firefox (87.0 on Ubuntu). The octopi are a different color than in Chrome!

robertylewis avatar May 14 '21 20:05 robertylewis

However, the search doesn't work at all for me on Android (Chrome 90.0). The query just disappears when I click search. Can anyone else duplicate?

robertylewis avatar May 14 '21 20:05 robertylewis

Hi @bryangingechen :) thanks for testing the demo! Seeing " TypeError: text is undefined" is very weird because there's not references to "text" anymore anywhere, that was part of the previous implementation. On the other hand the "const indexedData = req.response;" is in fact part of the new implementation. Would you clean up and delete app of your app cache, cache, form data, indexed DB, and local storage for the domain of "https://leanprover-community.github.io/" and let me know if the problem persists, please? In the meantime I'm looking for ways to try it on macOS as well, I wouldn't be surprised if there has to be a slight tweak in the implementation for different browsers indeed.

Good demo searches I use are "sa" with and without filter "nolint", "la" with and without "theorem", and "a" to see thousands of results once it has started loading freely and quickly after the first search.

@robertylewis The octopi is so much better in Firefox indeed, the whole UI is in my opinion hahaha. You're right about the Android bug - it doesn't work when I try it out either. I will look into it, but I'm also noticing that the version of the search used now hangs forever on Chrome on Android at the moment, which might suggest incompatibility of some sort with the sharedWorker maybe, I'm not sure.

Thank you so much for testing, guys! I'm open to further discussions on how to make it good for everyone.

polibb avatar May 15 '21 10:05 polibb

Thanks, after clearing various caches it works! I did manage to trigger some more console errors by typing in the search box and then hitting apply filters a few times:

Uncaught (in promise) TypeError: searchResultsShowAllBtn is null submitSearchFormHandler https://leanprover-community.github.io/mathlib_docs_demo/nav.js:176 submitFiltersFormHandler https://leanprover-community.github.io/mathlib_docs_demo/nav.js:352

I'll play around with it more before giving more thoughts though!

bryangingechen avatar May 15 '21 13:05 bryangingechen

UPDATE: Fixes pushed, please pull to see them.

  • handled console errors for missing elements and trying to access methods on them
  • loading added when going from basic (max 10) results set to "Show all" when rendering the results on the DOM itself takes too long and might crash the site is abused

The script that has been used for almost a year and is running in the background for the search to function in this PR too is being ran by a SharedWorker which is not compatible with most browsers on a phone and Safari and IE on a desktop. This is not ideal, I understand, but will be improved in the future as it is not part of the scope at this time for this PR.

polibb avatar May 15 '21 14:05 polibb

#deploy

bryangingechen avatar May 15 '21 14:05 bryangingechen

This PR has been successfully deployed at http://leanprover-community.github.io/mathlib_docs_demo!

github-actions[bot] avatar May 15 '21 14:05 github-actions[bot]

#deploy

gebner avatar May 17 '21 07:05 gebner

This PR has been successfully deployed at http://leanprover-community.github.io/mathlib_docs_demo!

github-actions[bot] avatar May 17 '21 07:05 github-actions[bot]

#deploy

bryangingechen avatar Jun 08 '21 20:06 bryangingechen

This PR has been successfully deployed at http://leanprover-community.github.io/mathlib_docs_demo!

github-actions[bot] avatar Jun 08 '21 21:06 github-actions[bot]

One other thing that would be great to have before this goes live is an easily accessible / easy-to-find page describing all the search features. Maybe an extra section on the mostly-blank index page https://leanprover-community.github.io/mathlib_docs_demo/ will do?

bryangingechen avatar Jun 09 '21 01:06 bryangingechen

I think the only blocking thing here is the size of the json file as Eric points out. Was the reason for using a bmp before compression or caching? If we zip the json file when we generate it and unzip locally, I guess browsers won't cache the unzipped version, right?

For @bryangingechen 's suggestion, I think that's a good place to describe the search features and we could probably use some text from this PR.

robertylewis avatar Jun 29 '21 17:06 robertylewis

The reason for the .bmp extension instead of .json is that github will then automatically gzip it. See https://github.com/leanprover-community/doc-gen/pull/125

gebner avatar Jun 29 '21 18:06 gebner

#deploy

gebner avatar Jun 29 '21 18:06 gebner

This PR has been successfully deployed at http://leanprover-community.github.io/mathlib_docs_demo!

github-actions[bot] avatar Jun 29 '21 19:06 github-actions[bot]

The bmp is now 1660854 bytes, and loads in half a second on my machine - I think it was 15mB before compression before, so the file extension change definitely helps.

However, the call to miniSearch.addAll(indexedData) takes 28 seconds for me, vs 6 seconds it takes on the live site.

eric-wieser avatar Jul 13 '21 15:07 eric-wieser

I've created https://github.com/polibb/doc-gen/pull/4 to briefly document the search.

I agree with Eric that there's still a slowdown, but it's not near 28 seconds for me. To me the new performance is acceptable, and I'm okay merging, what do others think?

robertylewis avatar Aug 09 '21 14:08 robertylewis

#deploy

robertylewis avatar Aug 12 '21 20:08 robertylewis

This PR has been successfully deployed at http://leanprover-community.github.io/mathlib_docs_demo!

github-actions[bot] avatar Aug 12 '21 20:08 github-actions[bot]