User data Functionality
Added User data functionality to retrieve user data. All the info is extracted except for comments. Could not finish it because getting the following message when testing:
We've limited requests for this url.
If you know why this is happening, please do let me know. Till then, u can review my code. Unittests are still incomplete.
Update: Travis CI build failed because of above issue
I am not sure why the other 4 commits are showing, u can remove them or merge them ( i think they were fixed). Main part of pull request is 51ac0f8
I'll go over this and fix it later. Thanks for the PR though.
hey karan, so the funny thing is that the all the tests passed except for the 'best' url because of the links that get expired. get_user function worked excepted for the comments parts as that still has to be implemented. i needed ur advice/suggestions for two things:
- can u help me figure out how to get pass the 'We've limited requests for this url.' issue. if that can be fixed then i can quickly fix the issue
- is there a way to run tests on Travis without doing a pull request. everytime i want to run tests on Travis i do a pull request. if that is ok with u, then ill continue to do this way. i just dont want it to be irritating that i make small changes everytime to a pull request just to run it on Travis.
Hey karan, do u think it is relevant to get all the comments submitted by a user?? the reason i am asking is because alot of the code has to be re-written to get comments submitted by a user even though the code is similar to the one in _build_comments under the Story class.
Yeah it might be a bad idea to get user comments seeing how HN does not want us to crawl those pages.
1) can u help me figure out how to get pass the 'We've limited requests for this url.' issue. if that can be fixed then i can quickly fix the issue
The best way to fix this is to run tests/queries with some interval between them. Do not run all tests at once, but instead run one module at a time.
2) is there a way to run tests on Travis without doing a pull request. everytime i want to run tests on Travis i do a pull request. if that is ok with u, then ill continue to do this way. i just dont want it to be irritating that i make small changes everytime to a pull request just to run it on Travis.
As far as I know, no. You will have to setup travis for your own fork.
Thanks for the reply. So if ur ok with not extracting comments submitted by a user, ill update this pull request so that all the other info is crawled. i know the tests pass and then u merge it.
So weird....my tests pass locally but then fail on travis. Does travis use my ip address when running the tests?? if yes then im assuming HackerNews is blocking my ip address.
No I travis does not use your IP. And the failure is not because of HN. See the trace:
======================================================================
ERROR: test_get_user (tests.test_get_user.TestGetUser)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/travis/build/karan/HackerNewsAPI/tests/test_get_user.py", line 15, in test_get_user
test_user = self.hn.get_user('karangoeluw')
File "/home/travis/build/karan/HackerNewsAPI/hn/hn.py", line 198, in get_user
table = soup.find_all('table')[2].find_all('tr')
IndexError: list index out of range
The failure is because of the “Limited request for URL” error. Its what the BeautifulSoup function returns and hence there are no table attributes.
On Mar 25, 2014, at 10:07 PM, Karan Goel [email protected] wrote:
No I travis does not use your IP. And the failure is not because of HN. See the trace:
ERROR: test_get_user (tests.test_get_user.TestGetUser)
Traceback (most recent call last): File "/home/travis/build/karan/HackerNewsAPI/tests/test_get_user.py", line 15, in test_get_user test_user = self.hn.get_user('karangoeluw') File "/home/travis/build/karan/HackerNewsAPI/hn/hn.py", line 198, in get_user table = soup.find_all('table')[2].find_all('tr') IndexError: list index out of range — Reply to this email directly or view it on GitHub.
So now there's two choices:
- We either put a sleep interval in each test, but seeing the number of tests we have (and will have), this might be a bad choice.
- We mock the http responses.
Unit testing on a live site is a bad idea, so I'll do some reading, add some mocks, and then come back to this PR. How's that sound?
Sure. Sorry i could not find a solution to this.
What do you mean by mock http responses? I can also have a look at how to do this.
On Mar 25, 2014, at 10:14 PM, Karan Goel [email protected] wrote:
So now there's two choices:
We either put a sleep interval in each test, but seeing the number of tests we have (and will have), this might be a bad choice. We mock the http responses. Unit testing on a live site is a bad idea, so I'll do some reading, add some mocks, and then come back to this PR. How's that sound?
— Reply to this email directly or view it on GitHub.
http://stackoverflow.com/questions/11399148/how-to-mock-an-http-request-in-a-unit-testing-scenario-in-python