RedditExtractor

Login / API Credential Possibility

Open normanfalk opened this issue 6 years ago • 4 comments

Is there a possibility to scrape data with an active login or API credentials? Quarantined subreddits do not seem to work, which is self-explanatory since the call is made anonymously.

If I try to get reddit_urls, for example, I am left with the error: Cannot connect to the website, skipping...

Any chance to get this working?

normanfalk avatar Jan 10 '20 12:01 normanfalk

Hey, first, sorry about not responding for so long, I sort of abandoned this package until recently.

Adding this functionality would increase the complexity of this package, yet I'm not sure it would significantly expand or improve what the package can do.

I'll leave this issue open, and if more people vote for it or if someone wants to create a PR, then we can revisit it.

ivan-rivera avatar Aug 22 '21 16:08 ivan-rivera

Let's say I have already obtained my Bearer token. All I need to do is to inject a custom header:

Authorization: Bearer MY_TOKEN

into every HTTP GET call. How the heck would one do this? I am not an R person. I think url_to_json.R is making the HTTP call, but heck if I know how! readLines()?

peebles avatar Oct 18 '21 00:10 peebles

@peebles yes, you are right, readLines is making that call right now. If you just want to add a header to the call, you'd need to use a different approach, such as httr (more info here).
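
For illustration, here is a minimal sketch of what an httr-based call with a bearer token might look like. This is not how the package currently works: `bearer_headers` and `url_to_json_auth` are hypothetical names, the user agent string is a placeholder, parsing with jsonlite is just one option, and the sketch assumes that token-authenticated requests are sent to `oauth.reddit.com`.

```r
library(httr)
library(jsonlite)

# Hypothetical helper (not part of RedditExtractor): build the
# "Authorization: Bearer <token>" header as an httr request config.
bearer_headers <- function(token) {
  add_headers(Authorization = paste("Bearer", token))
}

# Hypothetical replacement for a readLines()-based fetch: GET a Reddit
# endpoint with the bearer token attached and parse the JSON body.
url_to_json_auth <- function(path, token,
                             agent = "my-r-script/0.1 (placeholder)") {
  # Assumption: authenticated requests go to the OAuth host.
  url <- paste0("https://oauth.reddit.com", path)
  resp <- GET(url, bearer_headers(token), user_agent(agent))
  stop_for_status(resp)  # turn HTTP 4xx/5xx responses into R errors
  fromJSON(content(resp, as = "text", encoding = "UTF-8"))
}

# Usage (needs a valid token and network access):
# listing <- url_to_json_auth("/r/rstats/hot", Sys.getenv("REDDIT_TOKEN"))
```

Keeping the header construction in its own small function means the same config could be passed to every GET call the package makes, which is the "inject into every call" behaviour asked about above.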

ivan-rivera avatar Oct 18 '21 05:10 ivan-rivera

I would also find this feature incredibly useful. I'm currently trying to collect large datasets from just one or two Reddit posts, and the cap on unauthenticated queries restricts this package to only ~500 comments. This can be done via Python and PRAW, but you've done a beautiful job of creating well-formatted, ready-to-use tables, which is much harder to do with that method for a beginner like me.

the-cucco avatar Jan 05 '23 15:01 the-cucco