Collections ranking in search results

Open sseymore opened this issue 5 years ago • 5 comments

Descriptive summary

Creating this ticket for a future discussion about how collection records are ranked in the search results.

Expected behavior

Depending on the outcome of that discussion, collection records in search results either will or will not get dedicated ranking logic.

Related work

N/A

Accessibility Concerns

N/A

sseymore, Jan 19 '21 21:01

In addition, we could consider calling out whole collections, similar to what Primo does, when a user's search meets a threshold for strong matches against collection metadata.

jsimic, Jan 21 '21 21:01

Similar to https://github.com/OregonDigital/OD2/issues/236

sseymore, Feb 25 '22 18:02

https://solr.apache.org/guide/7_6/the-dismax-query-parser.html#bq-boost-query-parameter

CGillen, Sep 19 '23 22:09

POSM decided to start by lightly boosting collection titles. If that is successful, we could move on to more collection metadata fields such as Description, where additional names and keywords may be present.

I was originally thinking of index-time boosting, since we know at index time which titles come from Collections in the app. But it looks like index-time boosts were deprecated in Lucene. There is a document score field option, though. We could maybe add a value to a new field if the document is a Collection and boost indirectly, though that may apply to the whole Solr document.

If we instead boost as part of the query, on title and on anything that has a Collection type in the model field, perhaps that would work. https://cwiki.apache.org/confluence/display/solr/SolrRelevancyFAQ#SolrRelevancyFAQ-FieldBasedBoosting
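For reference, a query-time boost on the model field could look roughly like the sketch below. It is written in Python purely for illustration and only builds the Solr request parameters. The `has_model_sim` field and the `Collection` value are taken from the test URL later in this thread; the DisMax parser, the helper name `collection_boost_params`, and the boost factor of 5 are assumptions. Restricting the boost to title matches would also need the title field name, which isn't given here.

```python
from urllib.parse import urlencode


def collection_boost_params(user_query, boost=5):
    """Build Solr params that additively boost Collection documents.

    Hypothetical helper: the parser choice and the boost factor are
    assumptions, not the app's actual configuration.
    """
    return {
        "defType": "dismax",
        "q": user_query,
        # bq adds an optional clause: documents that match it get a higher
        # score, documents that don't are still returned.
        "bq": f"has_model_sim:Collection^{boost}",
    }


params = collection_boost_params("building oregon")
print(urlencode(params))
```

Since `bq` is additive, a light boost here should mostly reorder results that already score close together, which seems to fit the "lightly boosting" goal.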

Here are two examples for testing; we probably need more:

  • The Building Oregon collection comes up as the 2nd result: https://oregondigital.org/catalog?utf8=%E2%9C%93&search_field=all_fields&q%5B%5D=building+oregon#content
  • The Gerald Williams collection is the 2nd result: https://oregondigital.org/catalog?utf8=%E2%9C%93&locale=en&search_field=all_fields&q%5B%5D=gerald+williams#content

In both cases the collection should probably be the 1st result instead of the 2nd.

wickr, Sep 25 '23 23:09

I tried modifying an existing search results query to add in a boost but didn't have any luck: https://oregondigital.org/catalog?utf8=%E2%9C%93&search_field=all_fields&q%5B%5D=building+oregon&boost=if(termfreq(has_model_sim,%27Collection%27),5,1))

It seems like it should work, but the parameter may not be getting passed through Blacklight to Solr.
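Two other things may be worth ruling out. As pasted, the function in the URL above closes one more parenthesis than it opens. Also, `boost` is an eDisMax parameter, so it would only take effect if the request handler uses `edismax` (an assumption; this thread doesn't say which parser the app uses). A minimal sketch, again in Python purely for illustration, that builds the same function with balanced parentheses and encodes it for a request sent straight to Solr. The helper names are hypothetical.

```python
from urllib.parse import urlencode


def model_boost_function(field, value, factor):
    """Return a Solr function query: `factor` if `field` contains `value`, else 1."""
    return f"if(termfreq({field},'{value}'),{factor},1)"


def parens_balanced(expr):
    """Check that every ')' closes an earlier '(' and none are left open."""
    depth = 0
    for ch in expr:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


boost = model_boost_function("has_model_sim", "Collection", 5)
assert parens_balanced(boost)

# Assumes the edismax parser, where `boost` multiplies the main query score.
print(urlencode({"defType": "edismax", "q": "building oregon", "boost": boost}))
```

Running the resulting query against Solr directly would show whether the boost itself behaves as expected, separately from whether Blacklight forwards the parameter.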

Also found this to be a good explainer: https://nolanlawson.com/2012/06/02/comparing-boost-methods-in-solr/

wickr, Sep 26 '23 00:09