
"Yahoo Finance Premium instituting recaptcha #254"

Open samirgorai opened this issue 2 years ago • 10 comments

The error may occur because the username page contains multiple elements with 'id=login-username'. As a result, the login fails and the package reports: "Unable to login and/or retrieve the appropriate cookies. This is most likely due to Yahoo Finance instituting recaptcha, which this package does not support." I changed the method of finding the element to By.XPATH.
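One way to check the duplicate-id hypothesis offline is to count the tags that carry a given id in a saved copy of the page. The sketch below uses only the standard library; the markup in `SAMPLE_HTML` and the helper name `tags_with_id` are hypothetical and were not taken from Yahoo's real login page.

```python
from html.parser import HTMLParser


class IdCounter(HTMLParser):
    """Collect the tag name of every start tag whose id equals target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if dict(attrs).get("id") == self.target_id:
            self.matches.append(tag)


# Hypothetical markup for illustration only (an assumption, not Yahoo's page).
SAMPLE_HTML = """
<form>
  <div id="login-username"></div>
  <input id="login-username" name="username" type="text">
</form>
"""


def tags_with_id(html, target_id):
    """Return the tag names, in document order, that use target_id."""
    parser = IdCounter(target_id)
    parser.feed(html)
    return parser.matches


print(tags_with_id(SAMPLE_HTML, "login-username"))  # ['div', 'input']
```

If more than one tag comes back, a lookup by id alone returns the first match in document order, while an XPath such as `//input[@id='login-username']` also restricts the tag name. If only one tag comes back, the two locators should resolve to the same element.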

samirgorai avatar Dec 31 '23 17:12 samirgorai

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (399284b) 93.81% compared to head (57a406e) 92.89%. Report is 1 commits behind head on master.

Files                    Patch %   Lines
yahooquery/headless.py   0.00%     3 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #255      +/-   ##
==========================================
- Coverage   93.81%   92.89%   -0.93%     
==========================================
  Files          15       15              
  Lines        1359     1380      +21     
==========================================
+ Hits         1275     1282       +7     
- Misses         84       98      +14     


codecov[bot] avatar Dec 31 '23 17:12 codecov[bot]

@dpguthrie I think there is no pytest test script for the "headless.py" file, so the changes made in that file were not tested by GitHub Actions.

test_cov: # Run tests and prepare coverage report
	$(PYTEST) --cov=./ --cov-report=xml

samirgorai avatar Jan 01 '24 15:01 samirgorai

Hello @dpguthrie @me1029134, I tested the code with my changes: https://github.com/dpguthrie/yahooquery/pull/255

import yahooquery as yq

password = 'Hella@123'
userEmail = '[email protected]'
symbol = 'AAPL'

while True:
    query = yq.Ticker(symbol, username=userEmail, password=password)
    p_all_financial_data_quarter = query.p_all_financial_data(frequency='q')
    print(p_all_financial_data_quarter)

And the result was:

DevTools listening on ws://127.0.0.1:64734/devtools/browser/41a2456b-89ec-4df2-b6a5-d65774e7c308
[0102/091740.817:ERROR:command_buffer_proxy_impl.cc(127)] ContextResult::kTransientFailure: Failed to send GpuControl.CreateCommandBuffer.
[0102/091745.755:ERROR:gl_utils.cc(412)] [.WebGL-0000438400E7D400]GL Driver Message (OpenGL, Performance, GL_CLOSE_PATH_NV, High): GPU stall due to ReadPixels
[0102/091748.973:ERROR:gl_utils.cc(412)] [.WebGL-00004384002C0000]GL Driver Message (OpenGL, Performance, GL_CLOSE_PATH_NV, High): GPU stall due to ReadPixels
{'AAPL': 'User is not subscribed to Premium or has invalid cookies'}

Can you check once on your setup with your ID?
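In the output above, the premium call returned a dict that maps the symbol to an error string rather than to data. A minimal sketch of how a script could detect that case and stop instead of retrying in a loop is shown here. The helper name `split_errors` is hypothetical and is not part of yahooquery, and the sketch assumes that a failed lookup always appears as a plain string value.

```python
def split_errors(result):
    """Separate per-symbol error strings from real data.

    Assumption: a failed lookup shows up as a dict whose value for the
    symbol is a plain string; anything else is treated as data.
    """
    if not isinstance(result, dict):
        # e.g. a DataFrame: nothing to split
        return {}, result
    errors = {sym: val for sym, val in result.items() if isinstance(val, str)}
    data = {sym: val for sym, val in result.items() if not isinstance(val, str)}
    return errors, data


# The dict printed in the output above
sample = {"AAPL": "User is not subscribed to Premium or has invalid cookies"}

errors, data = split_errors(sample)
if errors:
    print("Login or subscription problem:", errors)
```

Checking the result once this way avoids starting a new browser session on every pass of a `while True` loop.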

samirgorai avatar Jan 02 '24 04:01 samirgorai

I don't think this has anything to do with the problem - the selector still appears to work as intended. I've created a couple videos below that show the recaptcha being shown on both the master branch and your branch. The problem is that regardless of how the user gets to the next screen (By.ID or By.XPATH), a recaptcha will still be shown, which this package has no way of getting past.
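To tell a selector failure apart from a recaptcha challenge, one option is to inspect the page source after the username step. The following is only a rough heuristic sketch: the marker strings and the function name `looks_like_recaptcha` are assumptions, and the HTML snippets are made up for illustration.

```python
# Assumed marker strings; the real challenge page may use different text.
RECAPTCHA_MARKERS = ("recaptcha", "not a robot")


def looks_like_recaptcha(page_source):
    """Return True if the page source contains any assumed recaptcha marker."""
    lowered = page_source.lower()
    return any(marker in lowered for marker in RECAPTCHA_MARKERS)


# Hypothetical page fragments for illustration only
challenge_page = '<div class="g-recaptcha" data-sitekey="..."></div>'
password_page = '<input id="login-passwd" type="password">'

print(looks_like_recaptcha(challenge_page))
print(looks_like_recaptcha(password_page))
```

With Selenium, the string to test would come from `driver.page_source` after clicking the sign-in button. A positive result would indicate that the selector worked and the challenge is what blocks the login.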

dpguthrie avatar Jan 02 '24 16:01 dpguthrie

In the video linked below, I am able to log in and do not see the check: https://www.loom.com/share/3b2b9b5ab36741098fd265ef4c75b14e?sid=17deb46f-aa13-40aa-9e2e-8d0cf969d2a0 using this code:

""" file to test login """
import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

while True:
    username = "[email protected]"
    password = "XXXXXXXXX"
    # Raw string so the backslashes in the Windows path are not read as escapes
    driver_path = r'C:\Users\samir\Web Scraping14-12-2023\geckodriver.exe'
    LOGIN_URL = "https://login.yahoo.com"
    browser = webdriver.Firefox()
    time.sleep(2)
    browser.get(LOGIN_URL)
    print(browser.title)
    # Enter the username and move to the password page
    browser.find_element(By.XPATH, "//input[@id='login-username']").send_keys(username)
    browser.find_element(By.XPATH, "//input[@id='login-signin']").click()
    # Wait for the password field, fill it in, and submit
    password_element = WebDriverWait(browser, 10).until(
        EC.presence_of_element_located((By.ID, "login-passwd"))
    )
    password_element.send_keys(password)
    browser.find_element(By.XPATH, "//button[@id='login-signin']").click()

    time.sleep(5)

samirgorai avatar Jan 02 '24 16:01 samirgorai

If you go into chrome and navigate to finance.yahoo.com, are you already logged in? If so, will you log out and then try running your script again?

dpguthrie avatar Jan 02 '24 16:01 dpguthrie

> If you go into chrome and navigate to finance.yahoo.com, are you already logged in? If so, will you log out and then try running your script again?

My script uses the Firefox browser, and it is not logged in.

samirgorai avatar Jan 02 '24 17:01 samirgorai

@dpguthrie Is the "I am not a robot" check specific to your region? I do not see it when logging in manually.

samirgorai avatar Jan 02 '24 17:01 samirgorai

No idea. Knowing YF, though, there probably are some regional differences, but using both your code and my code, I'm unable to get past the recaptcha.

dpguthrie avatar Jan 02 '24 17:01 dpguthrie

Not sure why it is not working, but I tried with the Chrome browser as well.

import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

while True:
    username = "[email protected]"
    password = "XXXXXXXXX"
    # Raw string so the backslashes in the Windows path are not read as escapes
    driver_path = r'C:\Users\samir\Web Scraping14-12-2023\geckodriver.exe'
    LOGIN_URL = "https://login.yahoo.com/"
    browser = webdriver.Firefox()
    time.sleep(2)
    browser.get(LOGIN_URL)
    print(browser.title)
    # Enter the username and move to the password page
    browser.find_element(By.XPATH, "//input[@id='login-username']").send_keys(username)
    browser.find_element(By.XPATH, "//input[@id='login-signin']").click()
    # Wait for the password field, fill it in, and submit
    password_element = WebDriverWait(browser, 10).until(
        EC.presence_of_element_located((By.ID, "login-passwd"))
    )
    password_element.send_keys(password)
    browser.find_element(By.XPATH, "//button[@id='login-signin']").click()

    time.sleep(5)

This code is able to log in through the browser, and the "I am not a robot" check does not appear.

samirgorai avatar Jan 02 '24 17:01 samirgorai