ricecooker

Need to address selenium deprecation

Open jayoshih opened this issue 7 years ago • 4 comments

Description

Getting a lot of errors using selenium + phantomjs where I can't even load the page. We'll want to update the utils/downloader.py code to use something else (possibly pyppeteer?)

jayoshih avatar Jul 07 '18 00:07 jayoshih

Sample code for using pyppeteer: https://github.com/howie6879/aspider/blob/master/aspider/request.py#L100:L109

pyppeteer is built on asyncio, but we can convert these calls to sync using something like https://github.com/miyakogi/syncer#usage if needed. This might not be necessary, since I think sync code calling async code is not a problem; it's the other way around (async code calling sync code) that doesn't mix.

ivanistheone avatar Sep 10 '18 05:09 ivanistheone

As a workaround for the phantomjs issues, one can manually download the version 2.1.1 binary and set the PHANTOMJS_PATH env var:

export PHANTOMJS_PATH=phantomjs-2.1.1-linux-x86_64/bin/phantomjs
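For reference, here is a minimal sketch of how a script could pick up that variable, assuming it is read from the process environment. The helper name `phantomjs_path` and the fallback to a `phantomjs` binary on PATH are hypothetical and not taken from the ricecooker code.

```python
import os


def phantomjs_path(default="phantomjs"):
    """Return the PhantomJS binary location, preferring PHANTOMJS_PATH.

    Hypothetical helper: falls back to `default` (to be resolved via PATH)
    when the variable is unset or empty.
    """
    return os.environ.get("PHANTOMJS_PATH") or default
```

The returned path would then be handed to whatever code launches PhantomJS.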

+1 for replacing with pyppeteer!

I wrote some sample code that wraps the asyncio stuff in a separate helper method that can be called from sync code: https://github.com/learningequality/sushi-chef-edraak/blob/master/libpyppeteer.py, with example usage at https://github.com/learningequality/sushi-chef-edraak/blob/master/sushichef.py#L94-L95
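The general shape of such a wrapper might look like the sketch below. This is not the code from libpyppeteer.py; the function names (`fetch_rendered_html`, `run_sync`, `download_page`) are made up for illustration, and it assumes pyppeteer is installed and can launch its bundled Chromium.

```python
import asyncio


async def fetch_rendered_html(url):
    """Load `url` in headless Chromium and return the rendered HTML."""
    # Imported lazily so this module can be imported without pyppeteer installed.
    from pyppeteer import launch

    browser = await launch()
    try:
        page = await browser.newPage()
        await page.goto(url)
        return await page.content()
    finally:
        # Always shut the browser down, even if navigation fails.
        await browser.close()


def run_sync(coro):
    """Run a coroutine to completion from ordinary synchronous code."""
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(coro)
    finally:
        loop.close()


def download_page(url):
    """Synchronous entry point (hypothetical name) for chef scripts."""
    return run_sync(fetch_rendered_html(url))
```

A chef script would then just call `download_page(url)` like any other blocking function. Note that `run_sync` only works from plain sync code; calling it from inside an already running event loop raises a RuntimeError.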

ivanistheone avatar Dec 27 '19 15:12 ivanistheone

Nice! Any reason not to add libpyppeteer to ricecooker?

kollivier avatar Dec 27 '19 16:12 kollivier

There is a start on this in PR #268

kollivier avatar May 14 '20 15:05 kollivier