
Inconsistency when searching for publications

Status: Open. NisoD opened this issue 1 year ago • 2 comments

Describe the Bug

I encountered an inconsistency when searching for publications by title. The search behavior is unpredictable: sometimes it returns results, and other times it doesn't, depending on how the session is restarted.

To Reproduce

Steps to reproduce the behavior:

  1. Attempt to search for a specific publication by title using IPython.
  2. Observe that in some cases, results are returned, while in others (after restarting the session), no results are found.

Expected Behavior

The expected behavior is to receive consistent search results every time the query is run, regardless of whether the session is restarted.

Desktop:

  • Proxy Service: FreeProxies
  • Python Version: 3.11
  • Operating System: macOS
  • Library Version: 1.5

Possible Fix

The issue might be in the _load_url function located in publication_parser.py. I suggest changing the following line:

self._rows = self._soup.find_all('div', class_='gs_r gs_or gs_scl') + self._soup.find_all('div', class_='gsc_mpat_ttl')

to:

self._rows = self._soup.select("div.gs_r.gs_or.gs_scl") + self._soup.select("div.gsc_mpat_ttl")

This may improve the consistency of search results, because select matches on the individual classes rather than on the exact class string.

NisoD, Sep 15 '24 06:09

Great work! This line change works for me!~ Thanks!!!

stellarkey, Nov 06 '24 01:11

Thank you @NisoD for the issue and the PR, and thank you @stellarkey for testing it out and reporting here.

I learnt something from this and I think we'd want to use select instead of find_all everywhere we have multiple attributes. I'll accept the PR but leave this issue open as a reminder to fix it.

Differences between select and find_all (for my future reference):

  • https://www.crummy.com/software/BeautifulSoup/bs4/doc/#searching-by-css-class
  • https://community.dataquest.io/t/find-all-and-select-difference-beautifoulsoup/299101
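
A minimal sketch of that difference, using Beautiful Soup with the built-in html.parser. The markup is made up for illustration and is not taken from a real Google Scholar page. Whether Scholar actually varies the order or number of classes between sessions is an assumption here, though it would be one plausible explanation for the behavior reported above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the same three classes, reordered or with an extra class.
html = """
<div class="gs_r gs_or gs_scl">exact order</div>
<div class="gs_or gs_r gs_scl">reordered</div>
<div class="gs_r gs_or gs_scl gs_fmar">extra class</div>
<div class="gs_r">only one class</div>
"""
soup = BeautifulSoup(html, "html.parser")

# A multi-class string passed to class_ is compared against the exact
# value of the class attribute, so order and extra classes matter.
exact_matches = soup.find_all("div", class_="gs_r gs_or gs_scl")

# A CSS selector matches any div that carries all three classes,
# regardless of order or additional classes.
css_matches = soup.select("div.gs_r.gs_or.gs_scl")

find_all_texts = [d.get_text() for d in exact_matches]
select_texts = [d.get_text() for d in css_matches]

print(find_all_texts)
print(select_texts)
```

In this sketch find_all only picks up the div whose class attribute is exactly "gs_r gs_or gs_scl", while select also picks up the reordered div and the one with an extra class.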

arunkannawadi avatar Apr 28 '25 01:04 arunkannawadi