python-boilerpipe
Python interface to Boilerpipe, Boilerplate Removal and Fulltext Extraction from HTML pages
Hey guys, first of all, thanks for python-boilerpipe. I'm trying to use Boilerpipe but can't properly extract some documents... > from boilerpipe.extract import Extractor > extractorType="DefaultExtractor" > sourceUrl = 'http://www.indiatimes.com/news/india/arvind-kejriwal-to-seek-political-sanyas-127620.html' >...
This could be pretty cool, and I see it's available in boilerpipe itself.
I am having problems importing boilerpipe: In [2]: from boilerpipe.extract import Extractor RuntimeError Traceback (most recent call last) in () ----> 1 from boilerpipe.extract import Extractor /usr/local/lib/python2.7/dist-packages/boilerpipe/__init__.py in ()...
Hello, thanks for this wrapper, this is wonderfully useful. Installing it took me a few minutes, maybe these notes will help someone else (and perhaps could be added to the...
I used your files to install boilerpipe with success: I had previously created the JAVA_HOME variable under the (x86) Program Files. When I try to run the "from boilerpipe.extract...
Hey, python-boilerpipe works perfectly from the console and as a script, but when I try it out with my Flask application it breaks. It breaks when I try to instantiate...
Hi, when running this code on my Ubuntu 12.04 micro instance: #!/usr/bin/python # import boilerpipe from boilerpipe.extract import Extractor extractor = Extractor(extractor='ArticleExtractor', url="http://europe.wsj.com/home-page") extracted_text = extractor.getText() print extracted_text extracted_html =...
Hello, firstly, thank you for python-boilerpipe. When I use wget to get the page http://www.flipkart.com/dell-xps-13-laptop-2nd-gen-ci7-4gb-256gb-ssd-win7-hp/p/itmdg387gmhzhx3m and save it to disk, and then try to open it with python-boilerpipe using...
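
Several of the reports above involve the import of boilerpipe.extract failing, and one mentions JAVA_HOME. The excerpts are truncated, so the actual causes are not known; the following is only a minimal sketch of a sanity check one might run before the import, assuming that a missing or mistyped JAVA_HOME is a possible culprit. The helper name check_java_home is hypothetical and is not part of python-boilerpipe.

```python
import os


def check_java_home(env=None):
    """Return (ok, message) describing whether JAVA_HOME looks usable.

    Hypothetical helper for illustration only; it is not part of
    python-boilerpipe. `env` defaults to the process environment.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return False, "JAVA_HOME is not set"
    if not os.path.isdir(java_home):
        # A typo or a stale path (e.g. after a JDK upgrade) ends up here.
        return False, "JAVA_HOME does not point to a directory: %s" % java_home
    return True, "JAVA_HOME looks usable: %s" % java_home
```

Calling check_java_home() and printing the returned message before running "from boilerpipe.extract import Extractor" would at least separate an environment problem from a problem in the wrapper itself. A passing check does not guarantee that the JVM can be started.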